Measuring The Capacity of a Web server

Measuring the Capacity of a Web Server, by Xiongjun Tang

Topics • Introduction • Dynamics of an HTTP Server • Problems in Generating Synthetic HTTP Requests • A Scalable Method for Generating HTTP Requests • Quantitative Evaluation • Conclusion

Introduction • Recent improvements in web server performance • Better Web caching • Application-level caching (L7 cluster) • HTTP protocol enhancements • Data compression • Better HTTP servers and proxies • Server OS implementation • Tuning OS parameters to get better performance

Introduction (continued) • Measuring a Web server (workload characteristics) • requested file types • transfer data sizes • locality of reference in URLs • Using a real workload directly

Introduction (continued) • Simple scheme for a Web client request generator • the client establishes a connection • sends an HTTP request • receives the response • waits for a certain time (think time) • repeats the cycle • Adding client processes = increasing the total client request rate
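The request cycle above can be sketched in a few lines of Python. This is only an illustration of the simple scheme, not the tool used in the paper; the function names, the HTTP/1.0 request line, and the injected `do_request` and `sleep` parameters are assumptions made for the example.

```python
import socket
import time


def http_get(host, port, path="/"):
    """One cycle step: connect, send a GET, read until the server closes."""
    with socket.create_connection((host, port)) as sock:
        request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n"
        sock.sendall(request.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:              # server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)


def run_simple_client(do_request, think_time_s, cycles, sleep=time.sleep):
    """Issue `cycles` requests one after another, sleeping between them."""
    responses = []
    for _ in range(cycles):
        responses.append(do_request())   # blocks until the response is complete
        sleep(think_time_s)              # think time before the next request
    return responses
```

A client process would call something like `run_simple_client(lambda: http_get("127.0.0.1", 8080), 0.5, 10)`, where the address is a placeholder. Because each request blocks until the previous one finishes, the client can never have more than one request outstanding.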

Introduction (continued) • Problems with the naive method • Hard to exceed the server’s capacity • Little resemblance in temporal characteristics to real-world Web traffic • Differences in delay and loss between WAN and LAN • Limited resources on the client machine

HTTP Connection Timeline 1. The web server listens for connections 2. It receives a client connection request (SYN packet) 3. The server TCP responds with a SYN-ACK packet, creates a socket for the new, incomplete connection, and places it in the SYN-RCVD queue 4. The client responds with an ACK; the server moves the socket created above from the SYN-RCVD queue to the accept queue 5. The server removes the first socket from the accept queue, sends back the appropriate response, then closes the connection

Limitation of TCP Implementation • Most systems impose a maximum backlog on the sum of the lengths of the SYN-RCVD queue and the accept queue • If sum > 1.5*backlog, the server drops incoming SYN packets • When the client TCP does not receive a SYN-ACK packet, it goes into exponential back-off. On BSD systems it retransmits the SYN at 6 seconds and 30 seconds after the first SYN was sent, then gives up at 75 seconds • Without a SYN-ACK, the client sends only 3 SYNs in 75 seconds! • The average length of the SYN-RCVD queue depends on the request rate and the round-trip delay between client and server • A long round-trip delay and a high request rate increase the SYN-RCVD queue length • The accept queue length depends on how fast the server CPU handles connections and on the request rate
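The two queues from the connection timeline and the drop rule above can be captured in a toy model. This is a sketch for intuition only, not a description of any real kernel: the class and function names are hypothetical, the drop test follows the slide literally (drop when the current sum already exceeds 1.5*backlog), and the queue-length estimate assumes Little's law (average occupancy = arrival rate × time spent in the queue), which the slide does not state explicitly.

```python
from collections import deque

# SYN transmission times (seconds) for a BSD client that never gets a SYN-ACK,
# as given on the slide; the connect attempt is abandoned at 75 seconds.
BSD_SYN_SEND_TIMES_S = (0, 6, 30)
BSD_CONNECT_GIVE_UP_S = 75


class ListenSocketModel:
    """Toy model of the SYN-RCVD and accept queues of a listening socket."""

    def __init__(self, backlog):
        self.limit = 1.5 * backlog
        self.syn_rcvd = deque()       # incomplete connections (SYN seen)
        self.accept_queue = deque()   # completed, waiting for accept()

    def on_syn(self, conn_id):
        """Return True if a SYN-ACK would be sent, False if the SYN is dropped."""
        if len(self.syn_rcvd) + len(self.accept_queue) > self.limit:
            return False
        self.syn_rcvd.append(conn_id)
        return True

    def on_ack(self, conn_id):
        """Handshake completed: move the socket to the accept queue."""
        self.syn_rcvd.remove(conn_id)
        self.accept_queue.append(conn_id)

    def accept(self):
        """Server application takes the first completed connection."""
        return self.accept_queue.popleft()


def expected_syn_rcvd_length(request_rate_per_s, round_trip_s):
    """Little's law estimate: each entry waits about one round trip for the ACK."""
    return request_rate_per_s * round_trip_s
```

For example, under these assumptions 200 requests/sec over a 0.5 second round trip keeps about 100 entries in the SYN-RCVD queue, which shows why WAN delays matter for the backlog.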

Problems in Generating Synthetic HTTP Requests • 1. Inability to generate excess load • In the real world, HTTP requests are generated by a huge number of clients • large mean and variance • requests are bursty (such as France 98) • the peak request rate can easily exceed the capacity of the server • In the simple model • small mean and variance • little burstiness

Why can’t the simple method generate excess load? • A new request can only be generated after the previous one has completed • As the number of clients increases, the queue grows, so the server takes longer to finish each connection; completion time increases, so each client generates requests more slowly • The net connection request rate of all clients stays equal to the throughput of the server • When the number of clients exceeds the server’s maximum backlog, the server begins to drop SYN packets • TCP exponential back-off then sets in, and further requests are generated at a very low rate
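The self-limiting behavior can be made concrete with the standard asymptotic bound for a closed system of blocking clients. This is a sketch under simplifying assumptions that are not in the slides: a single server with a fixed service time per request, and the SYN-drop and back-off effects ignored. The function name and the numbers in the usage note are hypothetical.

```python
def closed_loop_request_rate(clients, think_time_s, service_time_s):
    """Upper bound on the request rate of N clients that block on each request."""
    # Each client needs at least think time + service time per cycle.
    client_bound = clients / (think_time_s + service_time_s)
    # The server cannot complete more than 1/service_time requests per second,
    # and blocked clients cannot issue requests faster than they complete.
    server_bound = 1.0 / service_time_s
    return min(client_bound, server_bound)
```

With a service time of 0.25 s and a think time of 0.75 s, 2 clients generate at most 2 requests/sec, while 10 clients (or 1000) are capped at 4 requests/sec, the server's own throughput. Adding clients past that point adds queueing delay, not load.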

Request Rate vs. No. of Clients • An example: how many clients are required to generate 1100 requests/sec? With x = number of clients and y = request rate: y - 100 = (x - 1024) * 0.04 --> x = 1024 + (y - 100)/0.04. If y = 1100, then x = 1024 + 1000/0.04 = 1024 + 25000 (in the paper this is on the order of 15000) • If max connections = 32767: 1.5*32767 + 15000 = 64151 clients to generate 1100 requests/sec.
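The arithmetic on this slide can be checked directly. The function below just evaluates the slide's linear relation; its name and default parameters are labels chosen for this example, with the constants (1024 clients, 100 requests/sec, slope 0.04) taken from the slide.

```python
import math


def clients_needed(target_rate, base_clients=1024, base_rate=100.0, slope=0.04):
    """Solve y - base_rate = (x - base_clients) * slope for x."""
    return base_clients + (target_rate - base_rate) / slope


x = clients_needed(1100)                        # 1024 + 25000
total = math.ceil(1.5 * 32767 + 15000)          # second bullet, using the paper's 15000
print(round(x), total)
```

The first result reproduces the slide's 1024 + 25000; the second rounds 64150.5 up to the 64151 quoted on the slide.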

Additional Problems • The simple method does not model high and variable WAN delay • Resource constraints on the client machine • with too many processes, contention for CPU and memory increases • Potential bottleneck • the server is fine, but the clients are waiting for resources

A Scalable Method for Generating HTTP Requests • P client machines in total • Each machine runs several S-clients • A router can be used to simulate WAN delay

S-Client • A pair of processes connected by a UNIX domain socketpair • One is the connection establishment process, the other the connection handling process • Connection establishment process • purpose: generate HTTP requests at a certain rate and with a certain distribution • opens D connections using D sockets; the requests are spaced out over T ms • after each socket is created, a timer is associated with it • if the server responds within time T, the connection is handed off to the connection handling process; this process then closes the socket and initiates another connection to the server • if the server does not respond within time T, it closes the socket and initiates another connection to the server • either way, TCP exponential back-off is avoided

S-Client (continued) • Connection handling process • waits for data to arrive on any of the active connections • when new data arrives, reads it; when the response is complete, closes the socket • also waits for new connections to arrive on the UNIX domain socket connecting it to the other process • each new connection is simply added to the pool of active connections
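The connection establishment side can be sketched as a single loop over non-blocking sockets. This is a simplified illustration, not the authors' implementation. It assumes a POSIX system, replaces the second process with a hypothetical `hand_off` callback, bounds the run with a hypothetical `max_attempts` parameter, and omits the pacing of the initial D connection attempts over T.

```python
import errno
import selectors
import socket
import time


def run_establisher(addr, d, timeout_s, max_attempts, hand_off):
    """Keep up to `d` connection attempts in flight; return (connected, abandoned)."""
    sel = selectors.DefaultSelector()
    started = connected = abandoned = 0
    while True:
        # Top up to D sockets that are still connecting.
        while len(sel.get_map()) < d and started < max_attempts:
            sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            sock.setblocking(False)
            started += 1
            err = sock.connect_ex(addr)          # returns without waiting
            if err not in (0, errno.EINPROGRESS):
                sock.close()
                abandoned += 1
                continue
            deadline = time.monotonic() + timeout_s   # per-socket timer T
            sel.register(sock, selectors.EVENT_WRITE, deadline)
        if not sel.get_map():
            break
        # A socket becomes writable when its connect attempt finishes.
        for key, _events in sel.select(timeout=timeout_s / 10):
            sock = key.fileobj
            sel.unregister(sock)
            if sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR) == 0:
                connected += 1
                hand_off(sock)                   # the handler now owns the socket
            else:
                abandoned += 1
                sock.close()
        # Timer expired: close the socket rather than let TCP back off.
        now = time.monotonic()
        for key in list(sel.get_map().values()):
            if now >= key.data:
                sel.unregister(key.fileobj)
                key.fileobj.close()
                abandoned += 1
    sel.close()
    return connected, abandoned
```

In a two-process design like the one described above, `hand_off` would pass the connected descriptor to the handling process over the UNIX domain socketpair (in Python 3.9+ on Unix, `socket.send_fds` is one way to do this) and then close the local copy. The handling process would add the received descriptor to the set it polls for response data.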

An S-client • Two key ideas: 1. Shorten the TCP connection timeout: this allows generation of request rates beyond the capacity of the server (a rate of at least 1/T) 2. Maintain a constant number of unconnected sockets: this ensures that the generated request rate is independent of the rate at which the server handles requests
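A rough calculation shows why the short timeout matters. The sketch below is one reading of the slides rather than a formula taken from them: it assumes that each of an S-client's D sockets starts a new attempt at least once every T seconds, and it contrasts that with blocking BSD clients whose SYNs are all dropped (3 SYNs in 75 seconds). The function names are hypothetical.

```python
def s_client_min_rate(d, timeout_s):
    """Lower bound on connection attempts/sec: D sockets, each retried within T."""
    return d / timeout_s


def backed_off_client_rate(clients, syn_sends=3, give_up_s=75.0):
    """SYNs/sec from blocking clients whose SYNs are all being dropped."""
    return clients * syn_sends / give_up_s
```

With T = 500 ms, a single S-client with 100 sockets offers at least 200 attempts/sec no matter how slowly the server responds. A backed-off blocking client contributes only 3/75 = 0.04 SYNs/sec, which matches the 0.04 slope in the request-rate example.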

Request generating capacity of a client machine • Purpose • use as few S-clients as possible to generate as many requests as possible • Choose the largest allowable number of descriptors (N) • How to get it? • Choose the largest value of N for which the throughput vs. request rate curve measured with 1 client machine is unchanged from the same curve measured with 2 client machines.

Quantitative Evaluation • Each HTTP request is for a single file of size 1294 bytes • No more than 130 requests/sec with the simple method • With S-clients, up to 2065 requests/sec (limited by machine resources) • The T value for S-clients is 500 ms

Overload behavior • Why does throughput drop? • Because CPU resources are spent on protocol processing for incoming requests (SYN packets)

Bursty Condition • First parameter: the ratio between the maximum request rate and the average request rate • Second parameter: the fraction of time for which the request rate exceeds the average rate • In general, high burstiness in both of the above parameters degrades the throughput of the server substantially
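Both burstiness parameters can be computed from a series of request rates. This is a minimal sketch assuming the rates were sampled over equal-length intervals; the function name and the sample values in the usage note are made up for illustration.

```python
def burstiness(rates):
    """Return (max rate / average rate, fraction of intervals above average)."""
    mean = sum(rates) / len(rates)
    peak_ratio = max(rates) / mean
    fraction_above = sum(1 for r in rates if r > mean) / len(rates)
    return peak_ratio, fraction_above
```

For the hypothetical series `[100, 100, 100, 500]` the average is 200 requests/sec, so the first parameter is 2.5 and the second is 0.25: one interval in four carries a burst at two and a half times the average rate.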

Conclusion • This paper examines the pitfalls of generating synthetic web server workloads with a small number of client machines • It exposes the limitations of the simple method • A new method (S-client) is introduced, which can easily generate workloads exceeding the capacity of the server, as well as bursty workloads • It will help in studying server behavior under overload and thereby in improving server performance
