
Scaling and Fault Tolerance for Distributed Messages in a Service and Streaming Architecture

This research develops an integrated system, based on messaging middleware, in which streams from videoconferencing, streaming media, and annotation systems can be accessed from one another. It covers session management, metadata description and management, multiple-stream issues, and application scenarios.

Presentation Transcript


  1. Scaling and Fault Tolerance for Distributed Messages in a Service and Streaming Architecture • Hasan Bulut • hbulut@cs.indiana.edu

  2. Motivation • Videoconferencing systems: AccessGrid, VRVS, and H.323-based systems (e.g., Polycom). • Streaming media: client-server architecture in which the client has control over the stream. • Annotation systems: use streams stored in local files or obtained directly from a capture device. • We would like to build an integrated system based on messaging middleware in which streams from each world can be accessed from the others. This will also extend existing application areas and open up new ones.

  3. Research Issues I • Session Management • Developing an XML-based control framework in which XML messages are also used for information exchange. • How flexible is this framework for building an integrated collaboration system? • Extending the framework to cellular clients. • Metadata Description and Management • Describing collaboration session metadata in XML format. • Utilizing WS-Context, a fault-tolerant metadata repository that can be shared among all services and clients in the system. • Investigating the impact of this on the design and architecture of the streaming service.

  4. Research Issues II • Multiple Streams Issues • A generic streaming framework that is independent of data format • Services required within the messaging middleware to • archive and replay streams • achieve instant replay of streams • achieve synchronization among streams generated across a geographically large area • increase fault tolerance • Jitter introduced by the archiving and replay service • How much delay is there for LAN and WAN clients? • Application Scenarios • Proposing and developing an annotation system that can receive streams from real-time live videoconferencing sessions, with instant replay capability

  5. GlobalMMCS Prototype System

  6. XML Based General Session Protocol (XGSP) • XGSP is a conference control framework. • The goal of XGSP is to integrate heterogeneous systems into one collaboration system. • It can be viewed as a common A/V signaling protocol designed to support interactions between different A/V collaboration endpoints. • It enables different A/V endpoints to collaborate in the same collaboration session. • It includes three components: user session management, application session management, and floor control.

  7. Global Multimedia Collaboration System (GlobalMMCS) • A prototype system to verify and refine the XGSP conference control framework. • The current prototype of GlobalMMCS includes: • An XGSP media server • H.323 and SIP gateways and Real Servers for A/V clients • An XGSP A/V Session Server • A web server

  8. Extending GlobalMMCS to Mobile Clients

  9. Services Built Within Messaging Middleware • Time Services • Jitter Reduction Service • Replication Scheme (Repository Redundancy)

  10. Time Services • NB-NTP Time Service • An implementation of the Network Time Protocol (RFC 1305) • NTP is used to synchronize timekeeping among a set of distributed time servers and clients. • Entities generating events in the system should use the Time Service to timestamp the events. • High Resolution Timing Service • Implemented for Windows, Linux, and Solaris • Provides 2-3 µsec resolution
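
The slide states that the NB-NTP Time Service implements NTP and that entities should timestamp events with it. As a rough illustration, the sketch below applies the standard NTP offset and round-trip delay calculation to the four timestamps of one request/response exchange. The class and function names are hypothetical and are not taken from NaradaBrokering.

```python
from dataclasses import dataclass


@dataclass
class NtpExchange:
    """Four timestamps of one NTP request/response, in milliseconds."""
    t1: float  # client clock when the request was sent
    t2: float  # server clock when the request arrived
    t3: float  # server clock when the reply was sent
    t4: float  # client clock when the reply arrived


def clock_offset(x: NtpExchange) -> float:
    """Estimated offset of the server clock relative to the client clock."""
    return ((x.t2 - x.t1) + (x.t3 - x.t4)) / 2.0


def round_trip_delay(x: NtpExchange) -> float:
    """Network round-trip time, excluding the server's processing time."""
    return (x.t4 - x.t1) - (x.t3 - x.t2)


def corrected_timestamp(local_ms: float, offset_ms: float) -> float:
    """Timestamp an event using the local clock adjusted by the offset."""
    return local_ms + offset_ms
```

With this sign convention, a negative offset means the local clock is ahead of the server clock, so the correction moves event timestamps backwards.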

  11. Test Result • The first offset value is -139895 ms, which shows how far the clock on that machine is ahead of real time. • Subsequent changes in the offset stay between -3 ms and 2 ms.

  12. Jitter Reduction Services • Buffering Service • Time-orders events. • Time Differential Service • Releases events while preserving the time spacing between them. • Can achieve millisecond resolution
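
The two jitter reduction services can be sketched as follows: a buffer that time-orders events, and a function that computes release times so that the spacing between events matches the spacing of their timestamps. This is a minimal illustration under assumed names (`TimeOrderingBuffer`, `release_schedule`) and an assumed event shape; it is not the NaradaBrokering implementation.

```python
import heapq
from typing import List, Tuple

Event = Tuple[float, str]  # (generation timestamp in ms, payload)


class TimeOrderingBuffer:
    """Buffers events and hands them back sorted by timestamp."""

    def __init__(self) -> None:
        self._heap: List[Event] = []

    def add(self, event: Event) -> None:
        heapq.heappush(self._heap, event)

    def drain(self) -> List[Event]:
        """Remove and return all buffered events, earliest first."""
        ordered = []
        while self._heap:
            ordered.append(heapq.heappop(self._heap))
        return ordered


def release_schedule(ordered: List[Event], start_ms: float) -> List[Event]:
    """Compute a release time for each event so that the original
    spacing between timestamps is preserved, starting at start_ms."""
    if not ordered:
        return []
    first_ts = ordered[0][0]
    return [(start_ms + (ts - first_ts), payload) for ts, payload in ordered]
```

Events that arrive out of order, or bunched together by the network, are first sorted and then released at the intervals at which they were generated.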

  13. Jitter Reduction Service Test Results

  14. Repository Redundancy • Extended the NaradaBrokering (NB) reliable delivery scheme to ensure that reliable delivery guarantees are satisfied in the presence of repository failures. • Each repository functions autonomously and makes decisions independently. • A repository can recover a missing event from any other repository, as long as the event exists in that repository. • Steering repository: a publisher or subscriber to a reliable topic interacts with exactly one repository. • A repository operates in active mode for the clients it steers and in passive mode for clients that it does not steer.
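
The recovery rule on this slide (a repository can recover a missing event from any other repository that holds it) can be illustrated with a small in-memory sketch. The `Repository` class, the use of integer sequence numbers, and the method names are assumptions made for illustration only; the slides do not describe how events are identified or exchanged.

```python
from typing import Dict, Iterable, List


class Repository:
    """In-memory stand-in for one autonomous repository."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.events: Dict[int, str] = {}  # sequence number -> payload

    def store(self, seq: int, payload: str) -> None:
        self.events[seq] = payload

    def missing(self, highest_seq: int) -> List[int]:
        """Sequence numbers in 1..highest_seq that this repository lacks."""
        return [s for s in range(1, highest_seq + 1) if s not in self.events]

    def recover_from(self, peers: Iterable["Repository"], highest_seq: int) -> List[int]:
        """Fill gaps from any peer that holds the event.
        Returns the sequence numbers that no peer could supply."""
        peer_list = list(peers)
        unresolved = []
        for seq in self.missing(highest_seq):
            source = next((p for p in peer_list if seq in p.events), None)
            if source is None:
                unresolved.append(seq)
            else:
                self.events[seq] = source.events[seq]
        return unresolved
```

An event stays unresolved only when none of the peers holds it, which matches the condition stated on the slide.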

  15. Repository Redundancy Test Results [Figure: test topologies C, D, E, and F; P1 is the publisher and S1 is the subscriber and measuring client]

  16. Repository Redundancy Test Results

  17. Generic Streaming Framework and Metadata Management • Metadata Management • Generic Archiving and Replay • Session Recorders • Session Players • GlobalMMCS Recording and Replay

  18. Metadata Management • We use the WS-Context service as a metadata repository. • WS-Context provides a distributed and fault-tolerant metadata repository that can be shared by every entity in the system. • Updates to metadata are published to interested entities by the WS-Context service. • Two levels of management: session level and intra-session level • Session-level management keeps track of sessions • Intra-session-level management keeps track of the streams in a session
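
The two management levels and the publishing of updates to interested entities can be sketched with a small in-memory store. This is only a stand-in for the idea: `MetadataStore` and its methods are hypothetical names, and nothing here reflects the actual WS-Context interface.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Set

Listener = Callable[[str, str], None]  # (session id, description of the change)


class MetadataStore:
    """Tracks sessions (session level) and their streams (intra-session level)."""

    def __init__(self) -> None:
        self.sessions: Dict[str, Set[str]] = {}
        self._listeners: Dict[str, List[Listener]] = defaultdict(list)

    def subscribe(self, session_id: str, listener: Listener) -> None:
        """Register an entity interested in updates to one session."""
        self._listeners[session_id].append(listener)

    def _publish(self, session_id: str, change: str) -> None:
        for listener in self._listeners[session_id]:
            listener(session_id, change)

    def add_session(self, session_id: str) -> None:
        # Session-level management: keep track of sessions.
        self.sessions.setdefault(session_id, set())
        self._publish(session_id, "session-added")

    def add_stream(self, session_id: str, stream_id: str) -> None:
        # Intra-session-level management: keep track of streams in a session.
        self.sessions[session_id].add(stream_id)
        self._publish(session_id, "stream-added:" + stream_id)
```

A recorder or player would subscribe to the sessions it cares about and react to the published changes.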

  19. Metadata Management (Session level)

  20. GlobalMMCS Session Management

  21. Archive and Replay Session Management

  22. Generic Archiving and Replay Framework • A generic framework for recording and replaying any type of streaming event or data. • Instant replay of streams: real-time (live) streams can be replayed, paused, and rewound while they are being recorded. • Stream linkage: multiple streams are linked together to construct a session. • A collaboration session can be recorded and replayed within this framework. Examples: • Anabas • GlobalMMCS • eSports System

  23. Uniform Event Type For Generic Framework • Received events are wrapped inside NaradaBrokering native events (NBEvent) along with some event-specific information. • The received event is placed in the payload of the NBEvent. • The NBEvent also contains timestamp information and the event type.
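
The wrapping described above can be pictured as a small envelope type that carries the original bytes untouched, plus a timestamp and an event type. The `WrappedEvent` class and `wrap` function below are hypothetical illustrations of that idea and do not reproduce the real NBEvent API.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class WrappedEvent:
    """Uniform envelope: the original event is carried as an opaque payload."""
    event_type: str    # e.g. "rtp-video"; the labels are an assumption
    timestamp_ms: int  # when the event was received or generated
    payload: bytes     # the received event, unchanged


def wrap(raw: bytes, event_type: str, now_ms: Optional[int] = None) -> WrappedEvent:
    """Wrap a received event without inspecting or altering its contents."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    return WrappedEvent(event_type=event_type, timestamp_ms=now_ms, payload=raw)
```

Because the payload is opaque, the same recorder and player code can handle any stream format, which is the point of a data-format-independent framework.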

  24. Session Recorders [Figure label: Control message]

  25. Session Players • The primary purpose of a session player is to simulate the clients in the original session. • Supports instant replay of real-time live streams that are being recorded. • Session players support replay, pause, rewind, and fast forward operations. When one of these operations is requested, it is applied to all of the topics (streams) in that session.
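
The key behavior here is that a control operation applies to every topic in the session at once. One simple way to get that effect, sketched below, is to keep a single playback position shared by all topics. The `SessionPlayer` class, its method names, and the recorded-event shape are assumptions for illustration, not the actual player design.

```python
from typing import Dict, List, Tuple

Recorded = Tuple[float, str]  # (offset from session start in ms, payload)


class SessionPlayer:
    """Replays all topics of a session against one shared position."""

    def __init__(self, topics: Dict[str, List[Recorded]]) -> None:
        self.topics = topics
        self.position_ms = 0.0  # shared by every topic in the session
        self.paused = False

    def pause(self) -> None:
        self.paused = True

    def resume(self) -> None:
        self.paused = False

    def seek(self, delta_ms: float) -> None:
        """Rewind (negative delta) or fast forward (positive delta)."""
        self.position_ms = max(0.0, self.position_ms + delta_ms)

    def advance(self, elapsed_ms: float) -> Dict[str, List[str]]:
        """Move playback forward and return the payloads due on each topic."""
        if self.paused:
            return {name: [] for name in self.topics}
        start, end = self.position_ms, self.position_ms + elapsed_ms
        self.position_ms = end
        return {
            name: [p for ts, p in events if start <= ts < end]
            for name, events in self.topics.items()
        }
```

Since pause, rewind, and fast forward only change the shared position or state, the audio and video topics of a session cannot drift apart as a result of a control operation.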

  26. Session Players [Figure label: Control message]

  27. GlobalMMCS Session Recording and Replay [Figure panels: Recording, Replay]

  28. eSports System – Capabilities provided to eSports System • Archive and replay of NaradaBrokering native events • Archive and replay of GlobalMMCS sessions • Instant replay • Utilizing WS-Context Service • Transporting messages through NaradaBrokering messaging middleware

  29. eSports System and Streaming Services

  30. eSports System Interface (Recording)

  31. eSports System Interface (Replay)

  32. eSports System – Taking snapshots from Video Players

  33. Performance Results (Test Setup) • LAN setup: gf4.ucs.indiana.edu • WAN setup (FSU): vlab2.scs.fsu.edu • WAN setup (UCSD): synseis.geongrid.org

  34. Performance Tests (LAN Results) • Transport delay: 3513.7 - 3510.8 = 2.9 msec

  35. Performance Tests (WAN – FSU Results) • Transport delay: 3502.4 - 3484.8 = 17.6 msec

  36. Performance Tests (WAN – UCSD Results) • Transport delay: 3462.2 - 3426.8 = 35.4 msec
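
The transport delay on each of the three performance slides is the difference of two reported values in milliseconds. The sketch below simply reproduces that subtraction for the LAN, FSU, and UCSD setups. The dictionary and function names are illustrative, and the slides do not state what the two values on each slide measure, so they are labeled only as first and second.

```python
from typing import Dict, Tuple

# (first value, second value) in msec, as printed on slides 34 to 36.
REPORTED_MS: Dict[str, Tuple[float, float]] = {
    "LAN": (3513.7, 3510.8),
    "WAN (FSU)": (3502.4, 3484.8),
    "WAN (UCSD)": (3462.2, 3426.8),
}


def transport_delay(first_ms: float, second_ms: float) -> float:
    """Difference of the two reported values, rounded to 0.1 msec."""
    return round(first_ms - second_ms, 1)


def all_delays() -> Dict[str, float]:
    """Transport delay for every test setup."""
    return {name: transport_delay(a, b) for name, (a, b) in REPORTED_MS.items()}
```

The delay grows from the LAN setup to the two WAN setups, with the UCSD result about twice the FSU result.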

  37. Contribution • Proposed and implemented a scalable and fault-tolerant services-based architecture that integrates videoconferencing systems, streaming media, and annotation systems. • An XML-based control and messaging framework for collaboration systems. • The architecture allows cellular clients to receive real-time videoconferencing streams. • A shared fault-tolerant metadata repository (WS-Context Service) keeps session and stream information. • A data-format-independent generic streaming framework for archive and replay of streams, built on top of messaging middleware systems with the help of service-oriented architecture and Web Services technologies.

  38. Contribution • Instant replay of real-time live streams. • Allows annotation systems to annotate videoconferencing streams. • The following services are introduced to the messaging middleware to increase the quality of service of streams and the fault tolerance of the collaboration system: • Time Service • Jitter Reduction Service • Replication Scheme • RTSP semantics are introduced to messaging systems.
