DIS Presentation

High Performance Web Site. Advisor: Prof. 莊裕澤. R88725030 劉朝銘, R88725037 程左一

Outline • Performance issues • Typical methods • Case study 1/57

Performance Issues and Related Work The overall performance of the web is determined by the performance of the components which make up the web: the servers, the clients, the proxies, the networks, and the protocols used for communication. 1997, Martin F. Arlitt 2/57

Performance Issues and Related Work Caching still appears to be a promising approach to improving Web performance because of the large number of requests for a small number of documents, the concentration of references within these documents, and the small average size of these documents. ~ 1997, Martin F. Arlitt 3/57

The clients • Client connection bandwidth: the Web server has to keep a connection open with a slow client while its request is being satisfied. • Efficient Web browsers can use file caching to reduce the load they put on Web servers and network links. • Client computing: Java applets. 4/57

The servers • Identify performance bottlenecks and evaluate the performance impact of different Web server designs in order to reduce response times. • Use file caching to reduce Web server load. • Topics: CGI improvement, dispatcher, replication, caching, system calls. 5/57

The proxies Researchers at several institutions are studying various cache replacement policies for Web proxies. The most widely deployed proxy server, Squid, uses the LRU replacement policy (http://squid.nlanr.net/). 6/57
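As a rough illustration of the LRU replacement policy mentioned above, the following Python sketch evicts the least recently used document once the cache is full. The class name `LRUCache` and a capacity counted in documents rather than bytes are assumptions made for illustration; this is not Squid's implementation.

```python
from collections import OrderedDict


class LRUCache:
    """Toy proxy cache that evicts the least recently used document."""

    def __init__(self, capacity):
        self.capacity = capacity    # max number of cached documents (assumed unit)
        self._docs = OrderedDict()  # URL -> body, least recently used first

    def get(self, url):
        if url not in self._docs:
            return None             # cache miss
        self._docs.move_to_end(url)  # mark as most recently used
        return self._docs[url]

    def put(self, url, body):
        self._docs[url] = body
        self._docs.move_to_end(url)
        if len(self._docs) > self.capacity:
            self._docs.popitem(last=False)  # drop the least recently used entry
```

A real proxy would also account for document size and expiry, which this sketch ignores.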

The networks Several studies have suggested the use of network file caches to reduce the volume of traffic on the Internet. Wide-area file systems used with the WWW, such as AFS, have mechanisms to address performance, reliability, and security. Researchers at NLANR have implemented a prototype hierarchy of caches (Squid) and are currently focusing on configuring and tuning caches within the global hierarchy. 7/57

The protocols The protocol currently used for client-server interaction within the WWW (HTTP/1.0) is very inefficient: every request goes through a stateless connect -> request -> response -> disconnect cycle. A more efficient approach (HTTP-NG) would allow multiple client requests to be sent over a single TCP connection. 8/57

Key Improvements in HTTP/1.1 • Bandwidth optimization: a client can request portions of a resource. • Persistent connections: clients, servers, and proxies assume that a connection will be kept open. • Pipelining: a client need not wait to receive the response to one request before sending another request on the same connection. 9/57
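To make pipelining and partial requests concrete, the sketch below builds the bytes a client could write to one persistent connection before reading any response. The function name `build_pipelined_requests` and its parameters are hypothetical; only the request line, the `Host` header, and the `Range: bytes=first-last` header syntax come from HTTP/1.1 itself.

```python
def build_pipelined_requests(host, paths, byte_range=None):
    """Return the bytes of several GET requests written back to back.

    All requests are meant for a single persistent connection; the client
    would send them without waiting for earlier responses (pipelining).
    """
    requests = []
    for path in paths:
        lines = [f"GET {path} HTTP/1.1", f"Host: {host}"]
        if byte_range is not None:
            first, last = byte_range
            # Ask for only a portion of the resource (bandwidth optimization).
            lines.append(f"Range: bytes={first}-{last}")
        requests.append("\r\n".join(lines) + "\r\n\r\n")
    return "".join(requests).encode("ascii")
```

For example, `build_pipelined_requests("example.org", ["/a", "/b"], (0, 99))` yields two complete requests, each asking for the first 100 bytes of its resource.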

Components Affecting Web Server Performance • Server software: Architecture, Protocol, CGI programs, caching, mirroring, security, application level I/O buffer, compiler performance options, processing on the client side. • Operating system • The network ~1996, Vittorio Trecordi and Alberto Verga 10/57

CGI programs • The overhead of forking and executing a CGI program is very high. • New mechanisms based on dynamic linking and other private APIs increase performance. • This kind of approach provides the efficiency of native machine code execution. • The Apache Web server uses GNU dld to add, remove, or replace object modules within the process address space. 11/57

CGI programs (cont.) • The goal is to let clients execute server programs without spawning a separate process each time. • This can be accomplished by linking server programs directly with the Web server, or by preforking multiple processes or threads with which the Web server communicates to invoke server programs. • Examples: IBM ICAPI, Netscape NSAPI, Microsoft ISAPI. 12/57

Dynamic Load Balancing • Main load balancing techniques • Client-based approach • Web clients • Client-side proxies • DNS-based approach • Constant TTL algorithms • Adaptive TTL algorithms 13/57

Dynamic Load Balancing • Dispatcher-based approach • Packet rewriting • Packet forwarding • HTTP redirection • Server-based approach • HTTP redirection • Packet redirection 14/57

Client-based Approach • Web clients • Require software modification on the client side • One example: wwwi.netscape.com • Limited practical applicability and not scalable • Web-client scheduling via smart clients • Increased network traffic due to message exchanges • Provide scalability and availability • Lacks client-side portability 15/57

Client-based Approach • Client-side proxies • Limited applicability • Combine caching with server replication 16/57

DNS-based Approach • Transparency at the URL level • DNS servers • Authoritative DNS server • Intermediate name server • TTL—a validity period for caching the result of the logical name resolution • Factors that limit the DNS control on address caching 17/57

DNS-based Approach • Noncooperative intermediate name server • Browser caching • DNS as a potential bottleneck 18/57

DNS-based Approach 19/57

DNS-based Approach • Constant TTL algorithms • System-stateless algorithms • Round-robin (by NCSA) • Server-state-based algorithms • Avoid system overload • Sun-SCALR • Choose the least-loaded server, set TTL to zero 20/57
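The two constant-TTL families above can be sketched in a few lines of Python. `RoundRobinDNS`, `least_loaded`, the default TTL of 300 seconds, and the load dictionary are all assumptions for illustration, not the NCSA or Sun-SCALR code.

```python
import itertools


class RoundRobinDNS:
    """System-stateless policy: hand out server addresses in rotation."""

    def __init__(self, addresses, ttl=300):
        self._cycle = itertools.cycle(addresses)
        self.ttl = ttl  # the same (constant) TTL for every answer

    def resolve(self):
        return next(self._cycle), self.ttl


def least_loaded(loads):
    """Server-state-based policy: pick the least-loaded server.

    `loads` maps server address -> current load. The answer carries a
    TTL of zero so that it is not cached and every resolution comes back
    to the DNS.
    """
    address = min(loads, key=loads.get)
    return address, 0
```

Round-robin needs no server feedback at all, whereas the least-loaded policy depends on the DNS receiving up-to-date load reports.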

DNS-based Approach • Client-state-based algorithms • Two kinds of information come from the client side • Multitier round-robin policy • Hidden load weight index • Cisco Systems’ DistributedDirector • Client-to-server topological proximity • Client-to-server link latency • Internet2 Distributed Storage Infrastructure Project (I2-DSI) • Based on network proximity (e.g., round-trip delay) • Problem with intermediate name servers • Server- and client-state-based algorithms 21/57

DNS-based Approach • Adaptive TTL algorithms • Heterogeneity of server capacities • Use some server- and client-based DNS policy to select the server, and dynamically adjust the TTL value • Popular domain with lower TTL value 22/57
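One way to read the adaptive-TTL idea above: scale the TTL up with the capacity of the chosen server and down with the popularity of the requesting domain. The formula and every parameter name below are assumptions chosen for illustration; the slides do not give the exact rule.

```python
def adaptive_ttl(base_ttl, server_capacity, max_capacity, domain_rate, min_rate):
    """Return a TTL (seconds) for one DNS answer.

    capacity_factor   <= 1 for servers weaker than the strongest one
    popularity_factor >= 1 for domains busier than the quietest one
    Weak servers and popular domains therefore get shorter TTLs.
    """
    capacity_factor = server_capacity / max_capacity
    popularity_factor = domain_rate / min_rate
    return max(1, round(base_ttl * capacity_factor / popularity_factor))
```

With a base TTL of 240 s, a half-capacity server answering a domain four times busier than the quietest one would get a TTL of 30 s.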

Dispatcher-based Approach • Centralizing request control and routing • Single, virtual IP address(IP-SVA) of dispatcher • Simple dispatching algorithms • Routing schemes: • Packet rewriting • Packet forwarding • HTTP redirection 23/57

Dispatcher-based Approach • Packet single-rewriting • TCP router as an IP address dispatcher • Can combine with DNS-based approach • Packet double-rewriting • Rewriting both arrival and response packets • Underlies the Internet Engineering Task Force’s Network Address Translator 24/57
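A minimal sketch of packet double-rewriting, in which the dispatcher rewrites both arriving and response packets. The `Packet` and `DoubleRewritingDispatcher` names, the round-robin server choice, and the per-client connection table are assumptions for illustration; real dispatchers work on IP/TCP headers and also recompute checksums.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    payload: bytes = b""


class DoubleRewritingDispatcher:
    def __init__(self, virtual_ip, servers):
        self.virtual_ip = virtual_ip  # the single address clients see
        self.servers = list(servers)
        self._next = 0
        self._connections = {}        # client address -> chosen server

    def inbound(self, packet):
        """Rewrite the destination of an arriving packet to a real server."""
        server = self._connections.get(packet.src)
        if server is None:
            server = self.servers[self._next % len(self.servers)]
            self._next += 1
            self._connections[packet.src] = server
        return replace(packet, dst=server)

    def outbound(self, packet):
        """Rewrite the source of a response packet back to the virtual IP."""
        return replace(packet, src=self.virtual_ip)
```

Keeping the client-to-server table is what lets all packets of one connection reach the same server.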

Packet Single-rewriting 25/57

Packet Double-rewriting 26/57

Dispatcher-based Approach • Two examples: • Magicrouter • Cisco Systems’ LocalDirector • Packet forwarding • Network Dispatcher (IBM) • LAN Network Dispatcher • WAN Network Dispatcher 27/57

WAN Network Dispatcher 28/57

Dispatcher-based Approach • ONE-IP address • Uses the “ifconfig alias” option • Publicizes the IP-SVA as the same secondary IP address of all nodes • Routing-based dispatching (static) • Broadcast-based dispatching (dynamic) • The ONE-IP approach can be combined with a DNS-based solution 29/57

Dispatcher-based Approach • HTTP redirection • Largely transparent, but increases response time • Can be implemented with: • Server-state-based dispatching • Uses the Distributed Server Groups architecture • Adds a new method to the HTTP protocol • Location-based dispatching • Used by Cisco Systems’ DistributedDirector 30/57

Server-based Approach • HTTP redirection • A request initially assigned by DNS to a Web server can be reassigned to another server via HTTP redirection • Overhead in intra-cluster communication 31/57
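The server-side redirection decision can be sketched as follows. The function `handle_request`, the load threshold of 0.8, and the peer-load dictionary are hypothetical; the only protocol detail relied on is a 302 response carrying a `Location` header.

```python
def handle_request(path, my_load, peer_loads, threshold=0.8):
    """Serve locally or redirect to a less-loaded peer.

    `peer_loads` maps peer host name -> load, as learned through the
    intra-cluster communication mentioned above. Returns (status, headers).
    """
    if my_load > threshold and peer_loads:
        peer = min(peer_loads, key=peer_loads.get)
        if peer_loads[peer] < my_load:
            # Reassign the request: the client retries at the peer.
            return 302, {"Location": f"http://{peer}{path}"}
    return 200, {}
```

The extra round trip caused by the 302 response is the price paid for the finer-grained balancing.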

Server HTTP Redirection 32/57

Server-based Approach • Packet redirection • Distributed Packet Rewriting (DPR) • Requests are first assigned round-robin by DNS; packets can then be rerouted by a packet-rewriting mechanism • Two load-balancing algorithms • Static (stateless) routing • Least-loaded (based on periodic communication among servers) • DPR can be applied to both LAN and WAN 33/57

Comparing the approaches 34/57

Typical Methods • File caching, Web caching (proxy) • Web server tuning: multithreading, memory-mapped I/O • Load balancing: multi-node Web server (dispatching) • Replication • Protocol improvement • Resolving the dynamic content problem 35/57

Case Study 1: NCSA’s World Wide Web Server (1995), by Thomas T. Kwan and Daniel A. Reed • Extant access patterns and responses at NCSA’s WWW server: at peak times the NCSA server receives 30-40 requests per second, and no single-processor system could serve all of them. • HTTP is connectionless, and each request is handled separately. • Three key problems: information addressing, information distribution, and load balancing. 36/57

Case Study 1: Architecture A group of loosely coupled WWW servers: they operate independently, but collectively they provide the illusion of a single server. Three components: • A collection of independent servers. • A WWW document tree shared among the servers and stored in the Andrew distributed file system (AFS). • A round-robin domain name system that multiplexes the domain name www.ncsa.uiuc.edu among the constituent servers. 37/57

Case Study 1: The servers and the network The WWW servers are connected to the AFS file servers via a 100 Mb/s FDDI ring. Each HP735 has 96 MB of RAM and uses its local disk as a moderate-size (130 MB) AFS cache. The local disk also stores the HTTP server log files and is the backing store for the virtual memory system. 38/57

Case Study 1: AFS configuration • The document tree structure is shared. • The most frequently accessed documents and parts of the document tree are cached locally (a cache hit rate of about 90%, which is critical). • Component servers can be added or removed in a “plug and play” fashion, and heterogeneous systems can be used. 39/57

Case Study 1: Round-robin DNS The BIND 4.9.2 code has a round-robin option, but the rotation conflicted with extant software at NCSA. The BIND software was modified to allow a domain name with more than one associated IP address. Adding a new server to the group is as simple as adding its IP address to the DNS entry for www.ncsa.uiuc.edu . 40/57

Case Study 2: Improvement of the Apache Web Server (1999), by Yiming Hu, Ashwini Nanda, and Qing Yang • Implements six techniques that improve the throughput of Apache by 61%. • Performance analysis (SPECweb96, WebStone): Apache spends roughly 75-80% of its CPU time in the OS kernel, and small files are accessed more frequently. • Experimental environments: both uniprocessor and 4-way SMP machines with enough RAM. 41/57

Case Study 2: Detailed CPU Time Breakdown • User code: 23% • TCP/IP: 29% • Interrupts: 26% • File operations: 17% It is possible to greatly reduce the file system overhead and the user code overhead! 42/57

Case Study 2: mmap function When Apache needs to read a file, the program opens the file and then uses the mmap function to map the file's data into user space. Using mmap eliminates the data copy between the file system cache and user space. As a result, the server can send the mapped data directly to the network, avoiding the read system call altogether. Once the entire file content has been sent, the file is unmapped and closed to limit the number of open files in the system. 43/57
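The open, map, send, unmap, close sequence described above can be sketched with Python's standard `mmap` module. The function name `send_file_mmap` is hypothetical, and this is an illustration of the idea rather than the code added to Apache.

```python
import mmap


def send_file_mmap(path, sock):
    """Send a file over a connected socket via a read-only memory mapping.

    Returns the number of bytes sent.
    """
    with open(path, "rb") as f:
        try:
            mapped = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
        except ValueError:
            return 0  # an empty file cannot be mapped; nothing to send
        with mapped:  # unmapped on exit, then the file is closed
            # The mapping is passed straight to the socket; no read() call
            # is issued for the file data.
            sock.sendall(mapped)
            return len(mapped)
```

Unmapping and closing as soon as the data is sent mirrors the slide's point about limiting the number of open files.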

Case Study 2: Caching files in the User Space With file data cached in the Web server's user space, the server can ship cached files directly to the TCP/IP stack, avoiding all file system calls such as open, read, and close. The cache stores both the file data and the file states in a user memory region shared by all processes. 44/57

Case Study 2: Speeding up Logging On the uniprocessor system with 128 MB RAM, the SPECweb96 number of Apache with logging is 158 ops/sec. Simply turning off the logging operation results in a SPECweb96 number of 194 ops/sec (a 22% improvement). It is desirable to reduce the logging overhead of Apache. 45/57

Case Study 2: Caching DNS results Most of the logging overhead comes from looking up the host names of clients. Apache calls the gethostbyaddr function for every HTTP request being logged, which is very wasteful, especially when there is no local name server. The DNS cache is a small array indexed by hashing the IP address (14% improvement). Alternatively, the host name lookup can be disabled. 46/57
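A small array indexed by a hash of the IP address might look like the sketch below. The class name `DNSCache`, the slot count, the CRC32 hash, and the injectable `resolver` are assumptions for illustration; by default the resolver falls back to the standard `socket.gethostbyaddr`.

```python
import socket
import zlib


class DNSCache:
    """Direct-mapped cache of reverse DNS results, one entry per slot."""

    def __init__(self, slots=256, resolver=None):
        self._slots = [None] * slots
        # Default resolver performs a real reverse lookup.
        self._resolver = resolver or (lambda ip: socket.gethostbyaddr(ip)[0])
        self.lookups = 0  # number of real resolver calls

    def hostname(self, ip):
        index = zlib.crc32(ip.encode("ascii")) % len(self._slots)
        entry = self._slots[index]
        if entry is not None and entry[0] == ip:
            return entry[1]               # hit: no resolver call
        self.lookups += 1
        name = self._resolver(ip)
        self._slots[index] = (ip, name)   # overwrite whatever was in the slot
        return name
```

A colliding address simply overwrites the slot, which keeps the structure fixed-size at the cost of occasional repeated lookups.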

Case Study 2: Caching String results Another logging overhead is caused by logging the request time and status code. Let the time-conversion routine store the resulting string in a static array. When Apache calls the conversion routine again, the routine returns the stored result if the time argument is the same as in the last call, thus avoiding many redundant string operations. Similar optimizations can be applied to the status code conversion and to other places in Apache. 47/57
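The time-string idea above amounts to remembering the last conversion. In this sketch the class name `TimeStringCache`, the one-second granularity, and the log timestamp format are assumptions; an instance attribute stands in for the static array.

```python
import time


class TimeStringCache:
    """Reuse the formatted timestamp while the second has not changed."""

    def __init__(self):
        self._last_second = None
        self._last_string = ""
        self.conversions = 0  # how many real conversions were performed

    def format(self, now):
        second = int(now)
        if second != self._last_second:
            # Only convert when the time argument differs from the last call.
            self._last_string = time.strftime(
                "%d/%b/%Y:%H:%M:%S", time.gmtime(second)
            )
            self._last_second = second
            self.conversions += 1
        return self._last_string
```

On a busy server many requests arrive within the same second, so most calls return the stored string.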

Case Study 2: Caching URI Processing results 60% of the user time is spent processing URIs (e.g., parsing, directory checking, security checking, translating the URI to a file name). When Apache finishes parsing and checking a URI and obtains the corresponding file name, the URI/file-name pair is put into the cache. This is especially useful for dynamic pages such as directory lists. 48/57
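A URI/file-name cache can be sketched as a dictionary in front of the translation step. `URICache`, the path normalization used as a stand-in for Apache's parsing and security checks, and the clear-when-full eviction are all assumptions for illustration.

```python
import posixpath


class URICache:
    """Cache the result of translating a URI to a file name."""

    def __init__(self, document_root, max_entries=1024):
        self.document_root = document_root.rstrip("/")
        self.max_entries = max_entries
        self._cache = {}  # URI -> file name
        self.misses = 0   # number of full translations performed

    def translate(self, uri):
        cached = self._cache.get(uri)
        if cached is not None:
            return cached  # skip parsing and checking entirely
        self.misses += 1
        path = uri.split("?", 1)[0]  # drop the query string
        # Collapse "." and ".." so the result stays under the document root.
        normalized = posixpath.normpath("/" + path.lstrip("/"))
        filename = self.document_root + normalized
        if len(self._cache) >= self.max_entries:
            self._cache.clear()  # crude eviction, enough for a sketch
        self._cache[uri] = filename
        return filename
```

Repeated requests for the same URI then cost one dictionary lookup instead of a full parse and check.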

Case Study 2: Results of the Enhancement Techniques [Figure: results for the configurations apache; +DNS_cache; +DNS_cache+str_cache; +DNS_cache+str_cache+mmap; +DNS_cache+str_cache+mmap+state_cache; +DNS_cache+str_cache+mmap+state_cache+data_cache; +DNS_cache+str_cache+mmap+state_cache+data_cache+URI_cache] 49/57
