
Scalable Network Services (SNS) Architecture for Internet Applications

This paper discusses the Scalable Network Services (SNS) architecture, which aims to address the issues of system scalability and availability in cluster-based Internet applications. The architecture separates the concerns of scalability and fault tolerance from application logic, providing a reusable substrate for developing network services. The paper also introduces the TACC model for structuring services and provides examples of services built using the TACC framework.



Presentation Transcript


  1. Topics • ACID vs BASE • Starfish Availability • TACC Model • TranSend • Measurements • SNS Architecture

  2. Extensible Cluster-Based Network Services • Armando Fox, Steven Gribble, Yatin Chawathe, Eric Brewer, Paul Gauthier • University of California, Berkeley and Inktomi Corporation • Presenter: Ashish Gupta, Advanced Operating Systems

  3. Motivation • Proliferation of network-based services • Two critical issues must be addressed by Internet services: • System scalability: incremental and linear scalability • Availability and fault tolerance: 24x7 operation • Clusters of workstations meet these requirements

  4. Commodity PCs as unit of scaling • Good cost/performance • Incremental scalability • “Embarrassingly parallel” workloads map well onto workstations • Redundancy of clusters masks transient failures

  5. Contribution of this work • Isolate common requirements of cluster-based Internet apps into a reusable substrate: the Scalable Network Services (SNS) framework • Goal: complete separation of *ility concerns from application logic • Legacy code encapsulation • Insulate programmers from nasty engineering

  6. Contribution of this work • Architecture for SNS, exploiting the strength of cluster computing • Separation of content of network services from implementation • Encapsulation of low level functions in a lower layer • Example of a new service • A Programming Model to go with the architecture

  7. The SNS architecture: Workers and Front-ends • All control decisions for satisfying user requests are localized in the front-ends: which servers to invoke, accessing the profile database, notifying the end user, etc. • Workers are simple and stateless • Behaviour of the service is defined entirely at the front-end • Analogy: processes in a Unix pipeline, ls -l | grep .pl | wc • [Diagram: front ends (FE), workers (W), caches ($), user profile database, LB/FT manager (load balancing and fault tolerance), and a GUI administration interface, joined by the interconnect]
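The pipeline analogy on this slide can be sketched in a few lines of Python. This is only an illustration under assumptions: the worker functions and the `front_end` helper are hypothetical names, not part of the SNS code, and the three workers only loosely mirror `ls -l | grep .pl | wc`. The point is that each worker is a stateless function and the front end alone decides which workers run and in what order.

```python
from functools import reduce
from typing import Callable, Sequence

# A worker is stateless: its output depends only on its input.
Worker = Callable[[str], str]


def strip_blank_lines(text: str) -> str:
    """Hypothetical worker: drop empty lines from a listing."""
    return "\n".join(line for line in text.splitlines() if line.strip())


def keep_perl_files(text: str) -> str:
    """Hypothetical worker: keep only lines ending in '.pl'."""
    return "\n".join(line for line in text.splitlines() if line.endswith(".pl"))


def count_lines(text: str) -> str:
    """Hypothetical worker: report the number of lines, like 'wc -l'."""
    return str(len(text.splitlines()))


def front_end(request: str, pipeline: Sequence[Worker]) -> str:
    """All control logic lives here: the front end chains the workers."""
    return reduce(lambda data, worker: worker(data), pipeline, request)


listing = "a.pl\n\nb.txt\nc.pl"
result = front_end(listing, [strip_blank_lines, keep_perl_files, count_lines])
print(result)
```

Because the workers hold no state, changing the service means changing only the list passed to `front_end`.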

  8. Separating the content from implementation: layered software model • Service: service-specific code • TACC: Transformation, Aggregation, Caching, Customization • SNS: Scalable Network Service support • SNS provides: scalability, load balancing, fault tolerance, high availability

  9. The SNS Layer • Scalability • Replicate well-encapsulated components • Prolonged bursts: notion of an overflow pool • Load Balancing • Centralized: simple to implement and predictable

  10. The SNS Layer • Soft State for fault-tolerance and availability • Process peers watch each other • Because of no hard state, “recovery” == “restart” • Load balancing, hot updates, migration are “easy” • Shoot down a worker, and it will recover • Upgrade == install new software, shoot down old • Mostly graceful degradation
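The "recovery == restart" point can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `Distiller` class, the `supervise` function, and the "corrupt" input that triggers a crash are hypothetical and do not come from the SNS implementation. Because the worker keeps no hard state, the supervisor recovers by simply creating a fresh worker.

```python
from typing import List, Optional, Tuple


class Distiller:
    """Hypothetical stateless worker that crashes on one bad input."""

    def handle(self, item: str) -> str:
        if item == "corrupt":
            raise RuntimeError("distiller crashed")
        return item.upper()


def supervise(requests: List[str]) -> Tuple[List[Optional[str]], int]:
    """Run requests through a worker, restarting it whenever it crashes."""
    worker = Distiller()
    restarts = 0
    results: List[Optional[str]] = []
    for item in requests:
        try:
            results.append(worker.handle(item))
        except RuntimeError:
            # No hard state to restore, so recovery is just a restart.
            worker = Distiller()
            restarts += 1
            # The in-flight request is lost; the caller may retry it.
            results.append(None)
    return results, restarts


results, restarts = supervise(["a", "corrupt", "b"])
print(results, restarts)
```

The same mechanism covers upgrades in this sketch: replacing `Distiller` with a new class and discarding the old instance needs no migration step.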

  11. “Starfish” Availability: LB Death • FE detects via broken pipe/timeout, restarts LB • [Diagram: front ends (FE), workers (W), caches ($), and the LB/FT manager on the interconnect]

  12. “Starfish” Availability: LB Death • FE detects via broken pipe/timeout, restarts LB • New LB announces itself (multicast), is contacted by workers, and gradually rebuilds load tables • FEs operate using cached LB info during failure • If partition heals, extra LBs commit suicide • [Diagram: front ends (FE), workers (W), caches ($), and the LB/FT manager on the interconnect]
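The way a restarted load balancer rebuilds its tables from worker reports can be sketched as a soft-state table. This is a minimal illustration, not the SNS code: the `LoadTable` class, the `ttl` parameter, and the explicit `now` timestamps are all assumptions. Entries are never persisted; they exist only as long as workers keep reporting, so a new manager starts empty and fills in as reports arrive.

```python
from typing import Dict, Optional, Tuple


class LoadTable:
    """Hypothetical soft-state table of worker queue lengths."""

    def __init__(self, ttl: float) -> None:
        self.ttl = ttl  # seconds a report stays valid (assumed parameter)
        self._reports: Dict[str, Tuple[int, float]] = {}

    def report(self, worker: str, queue_len: int, now: float) -> None:
        """Record a worker's periodic load announcement."""
        self._reports[worker] = (queue_len, now)

    def live(self, now: float) -> Dict[str, int]:
        """Return only the reports that have not expired."""
        return {
            worker: queue_len
            for worker, (queue_len, seen) in self._reports.items()
            if now - seen <= self.ttl
        }

    def pick(self, now: float) -> Optional[str]:
        """Choose the live worker with the shortest queue, if any."""
        live = self.live(now)
        return min(live, key=live.get) if live else None


table = LoadTable(ttl=5.0)           # a freshly restarted manager knows nothing
before_reports = table.pick(now=0.0)
table.report("w1", 4, now=1.0)       # workers contact the new manager
table.report("w2", 1, now=2.0)
after_reports = table.pick(now=3.0)
after_silence = table.pick(now=8.0)  # no fresh reports: every entry has expired
print(before_reports, after_reports, after_silence)
```

A worker that dies simply stops reporting and ages out of the table, so no explicit failure notification is needed in this sketch.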

  13. The TACC Model: a model for structuring services • Transformation: operation on a single data object that changes its content • Aggregation: collecting data from several sources and collating it • Caching: storing/re-computing easier than moving across internet; can also store post-transformation (or post-aggregation) content • Customization: per user, for content generation; per device, for data delivery and content “packaging” • Question: How do we build the services in the higher layers?

  14. The TACC Model a model for structuring services Programming model based on composable building blocks Many existing services fit well within the TACC model

  15. A Meta-Search Engine In TACC • Uses existing services to create a new service • 2.5 hours to write using TACC framework • [Diagram: Internet, Metasearch, Web UI]

  16. An Example Service: TranSend

  17. Datatype-Specific Distillation • Lossy compression that preserves semantic content • Tailor content for each client • Reduce end-to-end latency when link is slow • Meaningful presentation for range of clients • [Figure: distillation examples, labelled 6.8x and 65x]

  18. TranSend SNS Components • Workers = Distillers here • Simple restart mechanism for fault-tolerance • Each distiller took 5-6 hrs to write • SNS Fault tolerance removes worries about occasional bugs/crashes

  19. Measurements • Request Generation: • High performance HTTP request playback engine • Burstiness • Handled by the overflow pool

  20. Load Balancing • Metric: queue length at distillers • When load reaches the threshold, the manager spawns a new distiller
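The queue-length policy can be sketched as below. The details are assumptions made for illustration: the `Manager` class, the distiller names, and the rule "spawn when even the shortest queue has reached the threshold" are not taken from the TranSend implementation, and request completion is left out so the effect of a burst is easy to follow.

```python
from typing import Dict


class Manager:
    """Hypothetical centralized manager tracking distiller queue lengths."""

    def __init__(self, threshold: int) -> None:
        self.threshold = threshold
        self.queues: Dict[str, int] = {"distiller-0": 0}

    def dispatch(self) -> str:
        """Send one request to the distiller with the shortest queue."""
        name = min(self.queues, key=self.queues.get)
        self.queues[name] += 1
        # If even the least-loaded distiller is at the threshold,
        # spawn a new one (assumed spawning rule).
        if self.queues[name] >= self.threshold:
            self.queues[f"distiller-{len(self.queues)}"] = 0
        return name


manager = Manager(threshold=3)
assigned = [manager.dispatch() for _ in range(6)]  # a burst of six requests
print(assigned)
print(manager.queues)
```

In this run the first three requests fill `distiller-0`, which triggers a spawn; the next three fill `distiller-1`, which triggers another.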

  21. Scalability • Strategy: • Begin with minimal instance • Increase offered load until saturation • Add more resources to eliminate saturation • Observations: • Nearly perfect linear growth • 1 distiller ~ 23 requests/sec • Front end ~ 70 requests/sec • Ultimate bottleneck: shared components of the system (Manager and the SAN) • SAN could be a bottleneck for communication-intensive workloads (example: 10 Mb/s Ethernet) • Topic for future research
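The per-component figures on this slide allow a rough capacity estimate. The sketch below assumes perfectly linear scaling and ignores the shared bottlenecks the slide mentions (Manager and SAN); the `plan` function and the offered-load values are illustrative only.

```python
from typing import Dict

DISTILLER_RPS = 23  # requests/sec per distiller, from the slide
FRONT_END_RPS = 70  # requests/sec per front end, from the slide


def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division."""
    return -(-a // b)


def plan(offered_rps: int) -> Dict[str, int]:
    """Estimate component counts, assuming linear scaling."""
    return {
        "distillers": ceil_div(offered_rps, DISTILLER_RPS),
        "front_ends": ceil_div(offered_rps, FRONT_END_RPS),
    }


for load in (23, 70, 140):
    print(load, plan(load))
```

Under these assumptions one front end keeps roughly three distillers busy (70 / 23 ≈ 3.04), so a fourth distiller is needed before a single front end becomes the limit.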

  22. Conclusion • A layered architecture for cluster-based scalable network services • Authors shielded from software complexity of automatic scaling, high availability, and failure management • New services as composition of stateless workers A useful paradigm for deploying new Internet services

  23. ACID vs BASE semantics An approximate answer delivered quickly is more useful than the exact answer slowly

  24. ACID vs BASE semantics An approximate answer delivered quickly is more useful than the exact answer slowly Search Engine as a database • 1 Big table • Unknown but large growth • Must be truly highly available

  25. A DBMS would be too slow • Choose availability over consistency • Graceful degradation: OK to temporarily lose small random subsets of data due to faults • Database research is about ACID: Atomicity, Consistency, Isolation, Durability • Replace with BASE: Basically Available, Soft-State, Eventual Consistency • Goals: availability, graceful degradation, performance

  26. Why BASE? Idea: focus on looser semantics rather than ACID semantics • ACID => data unavailable rather than available but inconsistent • BASE => data available, but could be stale, inconsistent or approximate • Real systems use BOTH semantics • Claim: BASE can lead to simpler systems and better performance • Performance: caching and avoidance of communication and some locks (e.g. ACID requires strict locking and communication with replicas for every write and any reads without locks) • Simpler: soft-state leads to easy recovery and interchangeable components • BASE fits clusters well due to partial failure
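The "available, but could be stale" trade-off can be illustrated with a small cache that answers from old data when the authoritative source is unreachable. This is a sketch under assumptions: the `StaleTolerantCache` class, the `origin` function, and the use of `ConnectionError` to model a partial failure are all hypothetical.

```python
from typing import Callable, Dict, Tuple


class StaleTolerantCache:
    """Hypothetical cache that prefers a stale answer to no answer."""

    def __init__(self, fetch: Callable[[str], str]) -> None:
        self._fetch = fetch
        self._store: Dict[str, str] = {}

    def get(self, key: str) -> Tuple[str, str]:
        """Return (value, 'fresh' or 'stale')."""
        try:
            value = self._fetch(key)
            self._store[key] = value
            return value, "fresh"
        except ConnectionError:
            if key in self._store:
                # BASE-style choice: stay available with possibly old data.
                return self._store[key], "stale"
            raise  # nothing cached, so the failure is visible


origin_up = True


def origin(key: str) -> str:
    """Hypothetical authoritative source that can become unreachable."""
    if not origin_up:
        raise ConnectionError("origin unreachable")
    return f"value-of-{key}"


cache = StaleTolerantCache(origin)
first = cache.get("page")
origin_up = False  # simulate a partial failure
second = cache.get("page")
print(first, second)
```

An ACID-style design would instead refuse the second read until the origin returned, which is the "data unavailable rather than available but inconsistent" case on the slide.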

  27. More BASE… • Reduces complexity of service implementation, trading consistency for simplicity • Fault tolerance • Availability • Opportunities for better performance optimizations in the SNS framework • ACID: durable and consistent state across partial failures • This is relaxed in the BASE model • Example of HotBot

  28. Thank You

  29. Backup Slides

  30. Question • Why are cluster-based network services well suited to Internet services?

  31. Answer: • The requirements are highly parallel (many independent simultaneous users) • The grain size typically corresponds to at most a few CPU seconds on a commodity PC

  32. Question 2 • Why does the cluster-based network service use BASE semantics?

  33. Answer: • BASE semantics allow us to handle partial failure in clusters with less complexity and cost.

  34. Question 3 • When the overflow machines are being recruited unusually often, what should be done at this time?

  35. Answer: • It is time to add new machines.

  36. Question 4 • Is any information lost when a front end crashes? If so, what kind of information is lost?

  37. Answer: • In-flight user requests will be lost; the user needs to handle the timeout and resend the request.

  38. Clustering and Internet Workloads • Internet vs. “traditional” workloads • e.g. Database workloads (TPC benchmarks) • e.g. traditional scientific codes (matrix multiply, simulated annealing and related simulations, etc.) • Some characteristic differences • Read mostly • Quality of service (best-effort vs. guarantees) • Task granularity • “Embarrassingly parallel”…why? • HTTP is stateless with short-lived requests • Web’s architecture has already forced app designers to work around this! (not obvious in 1990)

  39. Meeting the Cluster Challenges • Software & programming models • Partial failure and application semantics • System administration • Two case studies to contrast programming models • GLUnix goal: support “all” traditional Unix apps, providing a single system image • SNS/TACC goal: simple programming model for Internet services (caching, transformation, etc.), with good robustness and easy administration

  40. AltaVista hardware • An AltaVista system consists of six computers: • AltaVista (external traffic, HTTP server): 250 MB main memory, 6 GB disk • Indexer (indexes HTML documents): 10 processors, 6 GB main memory, 210 GB disk • Scooter (robot): 1.5 GB main memory, 30 GB disk • Vista (processes Scooter output): 2 processors, 2 GB main memory, 180 GB disk • News Indexer: 896 MB main memory, 13 GB disk • News server: 896 MB main memory, 24 GB disk

  41. GOOGLE • Google's hardware is a massive "farm" of more than 10,000 servers, capable of not only indexing more than 3 billion web documents but handling thousands of queries per second with sub-second response times. It's an awesome engineering feat in its own right.

  42. GOOGLE: LB and FT • Google's application makes expensive proprietary hardware unsuitable, says Reese. "We are not like a transaction-based e-Commerce site, where it makes sense to spend a whole lot of money on some really big server iron and storage area network. We architected our solution to be scalable by using smaller servers that are multiply redundant and very fast through load balancing. Also it makes us very fault tolerant—we can lose a whole cluster or clusters, and we'll still be fine."
