Web Servers: Implementation and Performance

Erich Nahum
IBM T.J. Watson Research Center
www.research.ibm.com/people/n/nahum
nahum@us.ibm.com

Contents and Timeline
• Introduction to the Web (30 min)
  • HTTP, Clients, Servers, Proxies, DNS, CDN’s
• Outline of a Web Server Transaction (25 min)
  • Receiving a request, generating a response
• Web Server Architectural Models (20 min)
  • Processes, threads, events
• Web Server Workload Characteristics (30 min)
  • File sizes, document popularity, embedded objects
• Web Server Workload Generation (20 min)
  • Webstone, SpecWeb, TPC-W

Things Not Covered in Tutorial
• Client-side issues: HTML rendering, Javascript interpretation
• TCP issues: implementation, interaction with HTTP
• Proxies: some similarities, many differences
• Dynamic Content: CGI, PHP, EJB, ASP, etc.
• QoS for Web Servers
• SSL/TLS and HTTPS
• Content Distribution Networks (CDN’s)
• Security and Denial of Service

Assumptions and Expectations
• Some familiarity with WWW as a user (Has anyone here not used a browser?)
• Some familiarity with networking concepts (e.g., unreliability, reordering, race conditions)
• Familiarity with systems programming (e.g., know what sockets, hashing, caching are)
• Examples will be based on C & Unix, taken from BSD, Linux, AIX, and real servers (sorry, Java and Windows fans)

Objectives and Takeaways
After this tutorial, hopefully we will all know:
• Basics of the Web, HTTP, clients, servers, DNS
• Basics of server implementation & performance
• Pros and cons of various server architectures
• Characteristics of web server workloads
• Difficulties in workload generation
• Design loop of implement, measure, debug, and fix
Many lessons should be applicable to any networked server, e.g., files, mail, news, DNS, LDAP, etc.

Acknowledgements
Many people contributed comments and suggestions to this tutorial, including: Abhishek Chandra, Mark Crovella, Suresh Chari, Peter Druschel, Jim Kurose, Balachander Krishnamurthy, Vivek Pai, Jennifer Rexford, Anees Shaikh, and Srinivasan Seshan.
Errors are all mine, of course.

Chapter 1: Introduction to the World-Wide Web (WWW)

Introduction to the WWW
• HTTP: Hypertext Transfer Protocol
  • Communication protocol between clients and servers
  • Application layer protocol for WWW
• Client/Server model:
  • Client: browser that requests, receives, displays object
  • Server: receives requests and responds to them
  • Proxy: intermediary that aggregates requests, responses
• Protocol consists of various operations
  • Few for HTTP 1.0 (RFC 1945, 1996)
  • Many more in HTTP 1.1 (RFC 2616, 1999)
[Figure: http requests flow from Client through Proxy to Server; http responses flow back from Server through Proxy to Client]

How are Requests Generated?
• User clicks on something
• Uniform Resource Locator (URL):
  • http://www.nytimes.com
  • https://www.paymybills.com
  • ftp://ftp.kernel.org
  • news://news.deja.com
  • telnet://gaia.cs.umass.edu
  • mailto:nahum@us.ibm.com
• Different URL schemes map to different services
• Hostname is converted from a name to a 32-bit IP address (DNS resolve)
• Connection is established to server
Most browser requests are HTTP requests.

How are DNS names resolved?
• Clients have a well-known IP address for a local DNS name server
• Clients ask local name server for IP address
• Local name server may not know it, however!
• Name server has, in turn, a parent to ask (the “DNS hierarchy”)
• The local name server’s job is to iteratively query servers until the name is found, and then return the IP address to the client
• Each name server can cache names, but:
  • Each name:IP mapping has a time-to-live field
  • After time expires, name server must discard mapping
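The time-to-live rule above can be sketched as a tiny cache in C. This is only an illustration under stated assumptions, not how any real name server is implemented: the names `dns_cache_put` and `dns_cache_get` are hypothetical, the table is direct-mapped (a colliding name simply overwrites the slot), and the current time is passed in explicitly so the expiry logic is easy to follow.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

#define CACHE_SLOTS 64

/* One cached name-to-address mapping with an absolute expiry time. */
struct dns_entry {
    char   name[256];
    char   addr[46];    /* textual IP address */
    time_t expires;     /* moment at which the mapping must be discarded */
    int    in_use;
};

static struct dns_entry dns_cache[CACHE_SLOTS];

/* Simple string hash (djb2) reduced to a slot index. */
static unsigned slot_for(const char *name)
{
    unsigned long h = 5381;
    while (*name)
        h = h * 33 + (unsigned char)*name++;
    return (unsigned)(h % CACHE_SLOTS);
}

/* Store a mapping that stays valid for ttl seconds, counted from now. */
void dns_cache_put(const char *name, const char *addr, time_t ttl, time_t now)
{
    struct dns_entry *e = &dns_cache[slot_for(name)];

    assert(ttl >= 0);
    snprintf(e->name, sizeof e->name, "%s", name);
    snprintf(e->addr, sizeof e->addr, "%s", addr);
    e->expires = now + ttl;
    e->in_use = 1;
}

/* Return the cached address, or NULL on a miss or an expired entry. */
const char *dns_cache_get(const char *name, time_t now)
{
    struct dns_entry *e = &dns_cache[slot_for(name)];

    if (!e->in_use || strcmp(e->name, name) != 0)
        return NULL;
    if (now >= e->expires) {    /* TTL elapsed: discard the mapping */
        e->in_use = 0;
        return NULL;
    }
    return e->addr;
}
```

On a NULL result the caller would fall back to querying the DNS hierarchy and then call `dns_cache_put()` with the TTL from the reply.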

DNS in Action
[Figure: myclient.watson.ibm.com asks its local name server, ns.watson.ibm.com, for www.ipam.ucla.edu. The local name server asks A.GTLD-SERVER.NET (name server for .edu), which answers ns.ucla.edu (TTL = 1d). It then asks ns.ucla.edu, which answers 12.100.104.5 (TTL = 10 min). The address 12.100.104.5 is returned to the client, which sends GET /index.html to www.ipam.ucla.edu and receives 200 OK with index.html.]

What Happens Then?
• Client downloads HTML document
  • Sometimes called “container page”
  • Typically in text format (ASCII)
  • Contains instructions for rendering (e.g., background color, frames)
  • Links to other pages
• Many have embedded objects:
  • Images: GIF, JPG (logos, banner ads)
  • Usually automatically retrieved
    • I.e., without user involvement
    • User can sometimes control this (e.g., browser options, junkbuster)
Sample HTML file:
    <html>
    <head>
    <meta name="Author" content="Erich Nahum">
    <title> Linux Web Server Performance </title>
    </head>
    <body text="#000000">
    <img width=31 height=11 src="ibmlogo.gif">
    <img src="images/new.gif">
    <h1>Hi There!</h1>
    Here's lots of cool linux stuff!
    <a href="more.html"> Click here</a> for more!
    </body>
    </html>

So What’s a Web Server Do?
• Respond to client requests, typically a browser
  • Can be a proxy, which aggregates client requests (e.g., AOL)
  • Could be search engine spider or custom (e.g., Keynote)
• May have work to do on client’s behalf:
  • Is the client’s cached copy still good?
  • Is client authorized to get this document?
  • Is client a proxy on someone else’s behalf?
  • Run an arbitrary program (e.g., stock trade)
• Hundreds or thousands of simultaneous clients
• Hard to predict how many will show up on some day
• Many requests are in progress concurrently
Server capacity planning is non-trivial.

What do HTTP Requests Look Like?
    GET /images/penguin.gif HTTP/1.0
    User-Agent: Mozilla/0.9.4 (Linux 2.2.19)
    Host: www.kernel.org
    Accept: text/html, image/gif, image/jpeg
    Accept-Encoding: gzip
    Accept-Language: en
    Accept-Charset: iso-8859-1,*,utf-8
    Cookie: B=xh203jfsf; Y=3sdkfjej
    <cr><lf>
• Messages are in ASCII (human-readable)
• Carriage-return and line-feed indicate end of headers
• Headers may communicate private information
  • (browser, OS, cookie information, etc.)
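Because the message is plain ASCII, the first line can be split with ordinary string functions. The following is a minimal sketch, assuming a well-formed HTTP/1.x request line; `struct http_request`, `parse_request_line()` and `end_of_headers()` are hypothetical names, and a real server would handle over-long tokens and HTTP/0.9 requests more carefully.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct http_request {
    char method[16];
    char path[1024];
    char version[16];
};

/* Parse "METHOD SP PATH SP VERSION" from the start of buf.
 * Returns 0 on success, -1 if the line is malformed. */
int parse_request_line(const char *buf, struct http_request *req)
{
    assert(buf != NULL && req != NULL);
    /* Field widths are one less than the array sizes to leave room for NUL. */
    if (sscanf(buf, "%15s %1023s %15s",
               req->method, req->path, req->version) != 3)
        return -1;
    if (strncmp(req->version, "HTTP/", 5) != 0)
        return -1;
    return 0;
}

/* Return a pointer just past the blank line (<cr><lf><cr><lf>) that ends
 * the headers, or NULL if the full header block has not arrived yet. */
const char *end_of_headers(const char *buf)
{
    const char *p = strstr(buf, "\r\n\r\n");
    return p ? p + 4 : NULL;
}
```

A NULL from `end_of_headers()` tells the server it has to read more bytes from the socket before it can act on the request.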

What Kind of Requests are There?
Called Methods:
• GET: retrieve a file (95% of requests)
• HEAD: just get meta-data (e.g., mod time)
• POST: submit a form to a server
• PUT: store enclosed document as URI
• DELETE: remove named resource
• LINK/UNLINK: in 1.0, gone in 1.1
• TRACE: http “echo” for debugging (added in 1.1)
• CONNECT: used by proxies for tunneling (1.1)
• OPTIONS: request for server/proxy options (1.1)

What Do Responses Look Like?
    HTTP/1.0 200 OK
    Server: Tux 2.0
    Content-Type: image/gif
    Content-Length: 43
    Last-Modified: Fri, 15 Apr 1994 02:36:21 GMT
    Expires: Wed, 20 Feb 2002 18:54:46 GMT
    Date: Mon, 12 Nov 2001 14:29:48 GMT
    Cache-Control: no-cache
    Pragma: no-cache
    Connection: close
    Set-Cookie: PA=wefj2we0-jfjf
    <cr><lf>
    <data follows…>
• Similar format to requests (i.e., ASCII)

What Responses are There?
• 1XX: Informational (defined in 1.0, used in 1.1)
  100 Continue, 101 Switching Protocols
• 2XX: Success
  200 OK, 206 Partial Content
• 3XX: Redirection
  301 Moved Permanently, 304 Not Modified
• 4XX: Client error
  400 Bad Request, 403 Forbidden, 404 Not Found
• 5XX: Server error
  500 Internal Server Error, 503 Service Unavailable, 505 HTTP Version Not Supported
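Since the first digit carries the class, a server or client can classify a status code with an integer division. A small sketch follows; the function names `status_class()` and `reason_phrase()` are made up for illustration, and the table holds only the codes listed above rather than the full set.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Map a status code to its class using the hundreds digit. */
const char *status_class(int code)
{
    assert(code >= 0);
    switch (code / 100) {
    case 1:  return "Informational";
    case 2:  return "Success";
    case 3:  return "Redirection";
    case 4:  return "Client error";
    case 5:  return "Server error";
    default: return "Unknown";
    }
}

/* Look up the reason phrase for the codes shown on the slide.
 * Returns NULL for any code not in this (deliberately partial) table. */
const char *reason_phrase(int code)
{
    static const struct { int code; const char *phrase; } table[] = {
        {100, "Continue"},              {101, "Switching Protocols"},
        {200, "OK"},                    {206, "Partial Content"},
        {301, "Moved Permanently"},     {304, "Not Modified"},
        {400, "Bad Request"},           {403, "Forbidden"},
        {404, "Not Found"},             {500, "Internal Server Error"},
        {503, "Service Unavailable"},   {505, "HTTP Version Not Supported"},
    };
    size_t i;

    for (i = 0; i < sizeof table / sizeof table[0]; i++)
        if (table[i].code == code)
            return table[i].phrase;
    return NULL;
}
```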

What are all these Headers?
Specify capabilities and properties:
• General: Connection, Date
• Request: Accept-Encoding, User-Agent
• Response: Location, Server type
• Entity: Content-Encoding, Last-Modified
• Hop-by-hop: Proxy-Authenticate, Transfer-Encoding
Server must pay attention to respond properly.

The Role of Proxies
• Clients send requests to local proxy
• Proxy sends requests to remote servers
• Proxy can cache responses and return them
[Figure: clients connect to a proxy, which reaches the servers across the Internet]

Why have a Proxy?
• For performance:
  • Many of the same web documents are requested by many different clients (“locality of reference”)
  • A copy of the document can be cached for later requests (typical document hit rate: ~ 50%)
  • Since proxy is closer to client, response times are smaller than from server
• For cost savings:
  • Organizations pay by ISP bandwidth used
  • Cached responses don’t consume ISP bandwidth
• For security/policy:
  • Typically located in “demilitarized zone” (DMZ)
  • Easier to protect a single point rather than all clients
  • Can enforce corporate/government policies (e.g., porn)

Proxy Placement in the Web
• Proxies can be placed at arbitrary points in the net:
  • Can be organized into hierarchies
  • Placed in front of a server: “reverse” proxy
  • Route requests to specific proxies: content distribution
[Figure: clients, proxies at several points in the Internet, and a “reverse” proxy in front of the servers]

Content Distribution Networks
• Push content out to proxies:
  • Route client requests to “closest” proxy
  • Reduce load on origin server
  • Reduce response time seen by client
[Figure: origin servers and several proxies spread across the Internet]

Mechanisms for CDN’s
• IP Anycast:
  • Route an IP packet to one-of-many IP addresses
  • Some research, but not deployed or supported by IPv4
• TCP Redirection:
  • Client TCP packets go to one machine, but responses come from a different one
  • Clunky, not clear it reduces load or response time
• HTTP Redirection:
  • When client connects, use 302 response (moved temporarily) to send client to a proxy close to the client
  • Server must be aware of CDN network
• DNS Redirection:
  • When client asks for server IP address, tell them based on where they are in the network
  • Used by most CDN providers (e.g., Akamai)

DNS Based Request-Routing
[Figure: the client asks its local nameserver for service.com; the local nameserver asks the request-routing DNS name server, which answers “cdn 3”, one of the CDN nodes (cdn 1 through cdn 5) that serve content for www.service.com]

Summary: Introduction to WWW
• The major application on the Internet
  • Majority of traffic is HTTP (or HTTP-related)
  • Messages mostly in ASCII text (helps debugging!)
• Client/server model:
  • Clients make requests, servers respond to them
  • Proxies act as servers to clients, clients to servers
• Content may be spread across network
  • Through either proxy caches or content distribution networks
  • DNS redirection is the common approach to CDNs
• Various HTTP headers and commands
  • Too many to go into detail here
  • We’ll focus on common server ones
  • Many web books/tutorials exist (e.g., Krishnamurthy & Rexford 2001)

Chapter 2: Outline of a Typical Web Server Transaction

Outline of an HTTP Transaction
• In this section we go over the basics of servicing an HTTP GET request from user space
• For this example, we'll assume a single process running in user space, similar to Apache 1.3
• At each stage see what the costs/problems can be
• Also try to think of where costs can be optimized
• We’ll describe relevant socket operations as we go
Server in a nutshell:
    initialize;
    forever do {
        get request;
        process;
        send response;
        log request;
    }

Readying a Server
    s = socket();             /* allocate listen socket */
    bind(s, 80);              /* bind to TCP port 80 */
    listen(s);                /* indicate willingness to accept */
    while (1) {
        newconn = accept(s);  /* accept new connection */
• First thing a server does is notify the OS it is interested in WWW server requests; these are typically on TCP port 80. Other services use different ports (e.g., HTTP over SSL is on 443)
• Allocates a socket and bind()s it to the address (port 80)
• Server calls listen() on the socket to indicate willingness to receive requests
• Calls accept() to wait for a request to come in (and blocks)
• When the accept() returns, we have a new socket which represents a new connection to a client
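The slide's calls are abbreviated; with the real BSD socket API the same steps look roughly like the sketch below. The helper names `make_listen_socket()` and `accept_client()` are hypothetical, IPv4 is assumed, and SO_REUSEADDR is an addition that the slide does not mention (it lets a restarted server rebind its port promptly).

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Create a TCP socket listening on the given port on all interfaces.
 * Returns the listening descriptor, or -1 on error. */
int make_listen_socket(unsigned short port, int backlog)
{
    struct sockaddr_in addr;
    int on = 1;
    int s;

    assert(backlog > 0);
    s = socket(AF_INET, SOCK_STREAM, 0);        /* allocate listen socket */
    if (s < 0)
        return -1;
    setsockopt(s, SOL_SOCKET, SO_REUSEADDR, &on, sizeof on);

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);                /* e.g., 80 for HTTP */

    if (bind(s, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(s, backlog) < 0) {               /* willingness to accept */
        close(s);
        return -1;
    }
    return s;
}

/* Block until a client connects; returns the new per-connection socket
 * and fills in the client's address. */
int accept_client(int listen_fd, struct sockaddr_in *peer)
{
    socklen_t len = sizeof *peer;
    return accept(listen_fd, (struct sockaddr *)peer, &len);
}
```

Binding to port 80 normally requires privileges, so a test run would pass an unprivileged port, or 0 to let the OS pick one.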

Processing a Request
    remoteIP = getpeername(newconn);
    remoteHost = gethostbyaddr(remoteIP);
    gettimeofday(currentTime);
    read(newconn, reqBuffer, sizeof(reqBuffer));
    reqInfo = serverParse(reqBuffer);
• getpeername() called to get the remote host address
  • for logging purposes (optional, but done by most)
• gethostbyaddr() called to get name of other end
  • again for logging purposes
• gettimeofday() is called to get time of request
  • both for Date header and for logging
• read() is called on new socket to retrieve request
• request is determined by parsing the data
  • “GET /images/jul4/flag.gif”
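In the sockets API the address of the remote end of a connection comes from getpeername(); getsockname() reports the local end. A minimal sketch of the logging step follows, assuming an IPv4 connection; `peer_ip()` is a hypothetical helper name.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Write the dotted-quad address of the remote end of conn into out.
 * Returns 0 on success, -1 on failure or if the peer is not IPv4. */
int peer_ip(int conn, char *out, size_t outlen)
{
    struct sockaddr_in peer;
    socklen_t len = sizeof peer;

    assert(out != NULL && outlen >= INET_ADDRSTRLEN);
    memset(&peer, 0, sizeof peer);
    if (getpeername(conn, (struct sockaddr *)&peer, &len) < 0)
        return -1;
    if (peer.sin_family != AF_INET)
        return -1;
    /* inet_ntop() returns NULL on error, the output buffer otherwise. */
    return inet_ntop(AF_INET, &peer.sin_addr, out, (socklen_t)outlen) ? 0 : -1;
}
```

Turning that address into a host name is a separate reverse DNS lookup, which can block for a long time; this is why many servers log only the IP address.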

Processing a Request (cont)
    fileName = parseOutFileName(reqBuffer);
    fileAttr = stat(fileName);
    serverCheckFileStuff(fileName, fileAttr);
    fd = open(fileName);
• stat() called to test file path
  • to see if file exists/is accessible
  • may not be there, may only be available to certain people
  • "/microsoft/top-secret/plans-for-world-domination.html"
• stat() also used for file meta-data
  • e.g., size of file, last modified time
  • "Have plans changed since last time I checked?"
• might have to stat() multiple files just to get to the end of the path
  • e.g., 4 stat() calls for the example path above
• assuming all is OK, open() called to open the file
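The existence and meta-data checks map directly onto stat(2). A hedged sketch follows: `struct file_info`, `check_file()` and `modified_since()` are hypothetical names, and real servers also check permissions and the path itself (for example, rejecting ".." components), which is omitted here.

```c
#include <assert.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <time.h>

/* The meta-data a server needs to build its response headers. */
struct file_info {
    off_t  size;    /* becomes Content-Length */
    time_t mtime;   /* becomes Last-Modified */
};

/* Returns 0 and fills info if path names an existing regular file,
 * -1 otherwise (the server would then answer 404 or 403). */
int check_file(const char *path, struct file_info *info)
{
    struct stat st;

    assert(path != NULL && info != NULL);
    if (stat(path, &st) < 0)
        return -1;              /* missing or not reachable */
    if (!S_ISREG(st.st_mode))
        return -1;              /* directory, device, etc. */
    info->size = st.st_size;
    info->mtime = st.st_mtime;
    return 0;
}

/* Conditional GET: nonzero if the file changed after the client's copy,
 * zero if the server can answer 304 Not Modified. */
int modified_since(const struct file_info *info, time_t client_copy_time)
{
    return info->mtime > client_copy_time;
}
```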

Responding to a Request
    read(fd, fileBuffer);
    headerBuffer = serverFigureHeaders(fileName, reqInfo);
    write(newconn, headerBuffer);
    write(newconn, fileBuffer);
    close(newconn);
    close(fd);
    write(logFile, reqInfo);
• read() called to read the file into user space
• write() is called to send HTTP headers on socket (early servers called write() for each header!)
• write() is called to write the file on the socket
• close() is called to close the socket
• close() is called to close the open file descriptor
• write() is called on the log file
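Building all headers in one buffer avoids the one-write()-per-header pattern, and a small loop copes with write() sending less than was asked. This is a sketch under assumptions: `build_headers()` and `write_all()` are hypothetical helpers, and the header set is a fixed minimal one, not what any particular server emits.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/* Format the status line and headers into buf in a single pass.
 * Returns the header length, or -1 if buf is too small. */
int build_headers(char *buf, size_t buflen, const char *content_type,
                  long content_length, time_t now)
{
    char date[64];
    int n;

    assert(buf != NULL && content_type != NULL);
    /* HTTP dates are always expressed in GMT. */
    strftime(date, sizeof date, "%a, %d %b %Y %H:%M:%S GMT", gmtime(&now));
    n = snprintf(buf, buflen,
                 "HTTP/1.0 200 OK\r\n"
                 "Date: %s\r\n"
                 "Content-Type: %s\r\n"
                 "Content-Length: %ld\r\n"
                 "Connection: close\r\n"
                 "\r\n",
                 date, content_type, content_length);
    return (n < 0 || (size_t)n >= buflen) ? -1 : n;
}

/* write() may accept fewer bytes than requested; loop until done.
 * Returns 0 on success, -1 on error. */
int write_all(int fd, const char *p, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}
```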

Optimizing the Basic Structure
• As we will see, a great deal of locality exists in web requests and web traffic.
• Much of the work described above doesn't really need to be performed each time.
• Optimizations fall under 2 categories: caching and custom OS primitives.

Optimizations: Caching
Idea is to exploit locality in client requests. Many files are requested over and over (e.g., index.html).
• Why open and close files over and over again? Instead, cache open file FD’s, manage them LRU.
• Why stat them again and again? Cache path name and access characteristics.
• Again, cache HTTP header info on a per-url basis, rather than re-generating info over and over.
    fileDescriptor = lookInFDCache(fileName);
    metaInfo = lookInMetaInfoCache(fileName);
    headerBuffer = lookInHTTPHeaderCache(fileName);
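One way to picture the descriptor cache is a small table with least-recently-used replacement. The sketch below is illustrative only: `cached_open()` is a hypothetical name, the table is scanned linearly where a real server would use a hash table, and callers must not close() the descriptors it returns. Because a cached descriptor is shared between requests, reads should use pread() so that requests do not disturb each other's file offset.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

#define FD_CACHE_SIZE 4     /* tiny on purpose, to make eviction visible */

struct fd_entry {
    char          path[256];
    int           fd;           /* -1 marks an empty slot */
    unsigned long last_used;    /* logical clock; 0 for empty slots */
};

static struct fd_entry fd_cache[FD_CACHE_SIZE];
static unsigned long   fd_clock;
static int             fd_cache_ready;

/* Return an open read-only descriptor for path, reusing a cached one
 * when possible. Returns -1 if the file cannot be opened. */
int cached_open(const char *path)
{
    int i, victim = 0;

    assert(path != NULL);
    if (strlen(path) >= sizeof fd_cache[0].path)
        return -1;                      /* too long for this sketch */
    if (!fd_cache_ready) {
        for (i = 0; i < FD_CACHE_SIZE; i++)
            fd_cache[i].fd = -1;
        fd_cache_ready = 1;
    }
    for (i = 0; i < FD_CACHE_SIZE; i++) {
        if (fd_cache[i].fd >= 0 && strcmp(fd_cache[i].path, path) == 0) {
            fd_cache[i].last_used = ++fd_clock;     /* hit: refresh */
            return fd_cache[i].fd;
        }
        /* Remember the least recently used (or an empty) slot. */
        if (fd_cache[i].last_used < fd_cache[victim].last_used)
            victim = i;
    }
    /* Miss: evict the victim and open the requested file in its place. */
    if (fd_cache[victim].fd >= 0)
        close(fd_cache[victim].fd);
    fd_cache[victim].fd = open(path, O_RDONLY);
    if (fd_cache[victim].fd < 0) {
        fd_cache[victim].last_used = 0;             /* slot stays empty */
        return -1;
    }
    strcpy(fd_cache[victim].path, path);
    fd_cache[victim].last_used = ++fd_clock;
    return fd_cache[victim].fd;
}
```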

Optimizations: Caching (cont)
• Instead of reading and writing the data, cache data, as well as meta-data, in user space
• Even better, mmap() the file so that two copies don't exist in both user and kernel space
• Since we see the same clients over and over, cache the reverse name lookups (or better yet, don't do resolves at all, log only IP addresses)
    fileData = lookInFileDataCache(fileName);
    fileData = lookInMMapCache(fileName);
    remoteHostName = lookRemoteHostCache(remoteIP);
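Mapping a file with mmap() gives the server a pointer to the kernel's cached pages, so no second copy is made in a user-space buffer. A minimal sketch, assuming a non-empty regular file; `struct mapped_file`, `map_file()` and `unmap_file()` are hypothetical names.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

struct mapped_file {
    void  *data;    /* start of the read-only mapping */
    size_t len;     /* file size in bytes */
};

/* Map the whole file read-only. Returns 0 on success, -1 on failure.
 * Empty files are rejected because a zero-length mmap() is an error. */
int map_file(const char *path, struct mapped_file *mf)
{
    struct stat st;
    void *p;
    int fd;

    assert(path != NULL && mf != NULL);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    if (fstat(fd, &st) < 0 || st.st_size == 0) {
        close(fd);
        return -1;
    }
    p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);                  /* the mapping remains valid after close */
    if (p == MAP_FAILED)
        return -1;
    mf->data = p;
    mf->len = (size_t)st.st_size;
    return 0;
}

/* Release the mapping when the cache entry is evicted. */
void unmap_file(struct mapped_file *mf)
{
    munmap(mf->data, mf->len);
    mf->data = NULL;
    mf->len = 0;
}
```

The server can then pass `mf.data` and `mf.len` straight to write() on the client socket.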

Optimizations: OS Primitives
• Rather than call accept(), getpeername() & read(), add a new primitive, acceptExtended(), which combines the 3 primitives
• Instead of calling gettimeofday(), use a memory-mapped counter that is cheap to access (a few instructions rather than a system call)
• Instead of calling write() many times, use writev()
    acceptExtended(listenSock, &newSock, readBuffer, &remoteInfo);
    currentTime = *mappedTimePointer;
    buffer[0] = firstHTTPHeader;
    buffer[1] = secondHTTPHeader;
    buffer[2] = fileDataBuffer;
    writev(newSock, buffer, 3);
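Of these primitives, writev() is a standard call and can be shown directly. The sketch below gathers a header buffer and a body buffer into one system call; `send_response()` is a hypothetical helper, and for brevity it does not retry when writev() sends only part of the data.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Send headers and body with one system call instead of two write()s.
 * Returns the number of bytes written, or -1 on error. */
ssize_t send_response(int fd, const char *headers,
                      const void *body, size_t body_len)
{
    struct iovec iov[2];

    assert(headers != NULL && (body != NULL || body_len == 0));
    iov[0].iov_base = (void *)headers;  /* writev() only reads the buffers */
    iov[0].iov_len  = strlen(headers);
    iov[1].iov_base = (void *)body;
    iov[1].iov_len  = body_len;
    return writev(fd, iov, 2);
}
```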

OS Primitives (cont)
• Rather than calling read() & write(), or write() with an mmap()'ed file, use a new primitive called sendfile() (or transmitfile()). Bytes stay in the kernel.
• While we're at it, add a header option to sendfile() so that we don't have to call write() at all.
• Also add an option to close the connection so that we don't have to call close() explicitly.
    httpInfo = cacheLookup(reqBuffer);
    sendfile(newConn, httpInfo->headers, httpInfo->fileDescriptor, OPT_CLOSE_WHEN_DONE);
All this assumes proper OS support. Most have it these days.
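The sendfile() on the slide is an idealized interface; actual signatures differ between systems. As one concrete instance, the sketch below uses the Linux form, `sendfile(out_fd, in_fd, &offset, count)`, which has no header or close-when-done options, so headers would still be sent separately. It is Linux-specific, and `send_file_by_path()` is a hypothetical helper name.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/sendfile.h>   /* Linux-specific header */
#include <sys/socket.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Send the entire file at path over sock without copying its bytes
 * through user space. Returns 0 on success, -1 on error. */
int send_file_by_path(int sock, const char *path)
{
    struct stat st;
    off_t off = 0;
    int fd;

    assert(path != NULL);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }
    while (off < st.st_size) {
        /* sendfile() advances off by the number of bytes it sent. */
        ssize_t n = sendfile(sock, fd, &off, (size_t)(st.st_size - off));
        if (n <= 0) {           /* error or unexpected end of file */
            close(fd);
            return -1;
        }
    }
    close(fd);
    return 0;
}
```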

An Accelerated Server Example
    acceptex(socket, newConn, reqBuffer, remoteHostInfo);
    httpInfo = cacheLookup(reqBuffer);
    sendfile(newConn, httpInfo->headers, httpInfo->fileDescriptor, OPT_CLOSE_WHEN_DONE);
    write(logFile, requestInfo);
• acceptex() is called
  • gets new socket, request, remote host IP address
• string match in hash table is done to parse request
  • hash table entry contains relevant meta-data, including modification times, file descriptors, permissions, etc.
• sendfile() is called
  • pre-computed header, file descriptor, and close option
• log written back asynchronously (buffered write()).
That’s it!

Complications
• Much of this assumes sharing is easy:
  • but this is dependent on the server architectural model
  • if multiple processes are being used, as in Apache, it is difficult to share data structures.
• Take, for example, mmap():
  • mmap() maps a file into the address space of a process.
  • a file mmap'ed in one address space can’t be re-used for a request for the same file served by another process.
  • Apache 1.3 does use mmap() instead of read().
  • in this case, mmap() eliminates one data copy versus a separate read() & write() combination, but process will still need to open() and close() the file.

Complications (cont)
• Similarly, meta-data info needs to be shared:
  • e.g., file size, access permissions, last modified time, etc.
• While locality is high, cache misses can and do happen sometimes:
  • if a previously unseen file is requested, the process can block waiting for disk.
• OS can impose other restrictions:
  • e.g., limits on number of open file descriptors.
  • e.g., sockets typically allow buffering about 64 KB of data. If a process tries to write() a 1 MB file, it will block until the other end has received enough of the data for the remainder to fit in the buffer.
• Need to be able to cope with the misses without slowing down the hits

Summary: Outline of a Typical HTTP Transaction
• A server can perform many steps in the process of servicing a request
• Different actions depending on many factors:
  • e.g., 304 not modified if client's cached copy is good
  • e.g., 404 not found, 401 unauthorized
• Most requests are for small subset of data:
  • we’ll see more about this in the Workload section
  • we can leverage that fact for performance
• Architectural model affects possible optimizations
  • we’ll go into this in more detail in the next section

Chapter 3: Server Architectural Models

Server Architectural Models
Several approaches to server structure:
• Process-based: Apache, NCSA
• Thread-based: JAWS, IIS
• Event-based: Flash, Zeus
• Kernel-based: Tux, AFPA, ExoKernel
We will describe the advantages and disadvantages of each. Fundamental tradeoffs exist between performance, protection, sharing, robustness, extensibility, etc.

Process Model (Ex: Apache)
• Process created to handle each new request:
  • Process can block on appropriate actions (e.g., socket read, file read, socket write)
  • Concurrency handled via multiple processes
• Quickly becomes unwieldy:
  • Process creation is expensive.
  • Instead, pre-forked pool is created.
  • Upper limit on # of processes is enforced
    • First by the server, eventually by the operating system.
    • Concurrency is limited by upper bound
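A pre-forked pool can be sketched with fork() and a shared listening socket: every child blocks in accept(), and the kernel hands each new connection to one of them. This is an illustration, not Apache's actual code. The names `worker_main()`, `prefork()` and `reap_workers()` are hypothetical, the per-worker request limit is a simplification so that workers terminate, and each worker sends a fixed empty reply where a real one would parse the request.

```c
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Body of one worker process: serve max_requests connections, then exit. */
static int worker_main(int listen_fd, int max_requests)
{
    static const char reply[] =
        "HTTP/1.0 200 OK\r\nContent-Length: 0\r\n\r\n";
    int served;

    for (served = 0; served < max_requests; served++) {
        int conn = accept(listen_fd, NULL, NULL);   /* blocks this worker only */
        if (conn < 0)
            return 1;
        if (write(conn, reply, sizeof reply - 1) < 0) {
            close(conn);
            return 1;
        }
        close(conn);
    }
    return 0;
}

/* Fork nworkers children sharing listen_fd. Stores their pids in pids
 * and returns how many were started. */
int prefork(int listen_fd, int nworkers, int max_requests, pid_t *pids)
{
    int i;

    assert(nworkers > 0 && pids != NULL);
    for (i = 0; i < nworkers; i++) {
        pid_t pid = fork();
        if (pid < 0)
            break;                      /* OS-imposed process limit reached */
        if (pid == 0)
            _exit(worker_main(listen_fd, max_requests));    /* child */
        pids[i] = pid;                  /* parent records the child */
    }
    return i;
}

/* Wait for the workers; returns how many exited with status 0. */
int reap_workers(const pid_t *pids, int n)
{
    int i, clean = 0;

    for (i = 0; i < n; i++) {
        int status;
        if (waitpid(pids[i], &status, 0) == pids[i] &&
            WIFEXITED(status) && WEXITSTATUS(status) == 0)
            clean++;
    }
    return clean;
}
```

The size of the pool is the concurrency limit: with N workers, at most N requests are in progress at once.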

Process Model: Pros and Cons
• Advantages:
  • Most importantly, consistent with programmer's way of thinking. Most programmers think in terms of linear series of steps to accomplish task.
  • Processes are protected from one another; can't nuke data in some other address space. Similarly, if one crashes, others unaffected.
• Disadvantages:
  • Slow. Forking is expensive; allocating stack and VM data structures for each process adds up and puts pressure on the memory system.
  • Difficulty in sharing info across processes.
  • Have to use locking.
  • No control over scheduling decisions.

Thread Model (Ex: JAWS)
• Use threads instead of processes. Threads consume fewer resources than processes (e.g., stack, VM allocation).
• Creating and destroying threads is cheaper than for processes.
• Similarly, pre-forked thread pool is created. There may be limits on the number of threads, but hopefully less of an issue than with processes since fewer resources are required.

Thread Model: Pros and Cons
• Advantages:
  • Faster than processes. Creating/destroying cheaper.
  • Maintains programmer's way of thinking.
  • Sharing is enabled by default.
• Disadvantages:
  • Less robust. Threads not protected from each other.
  • Requires proper OS support; otherwise, if one thread blocks on a file read, it will block the whole address space.
  • Can still run out of threads if servicing many clients concurrently.
  • Can exhaust certain per-process limits not encountered with processes (e.g., number of open file descriptors).
  • Limited or no control over scheduling decisions.

Event Model (Ex: Flash)
    while (1) {
        accept new connections until none remaining;
        call select() on all active file descriptors;
        for each FD:
            if (fd ready for reading) call read();
            if (fd ready for writing) call write();
    }
• Use a single process and deal with requests in an event-driven manner, like a giant switchboard.
• Use non-blocking option (O_NDELAY) on sockets, do everything asynchronously, never block on anything, and have OS notify us when something is ready.
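The two building blocks of the loop above, marking a descriptor non-blocking and asking select() which descriptors are ready, can be sketched as follows. The helper names `set_nonblocking()` and `wait_readable()` are hypothetical, and the code uses O_NONBLOCK, the POSIX name for the non-blocking option. Only readability is checked here; a full server would also pass a write set to select().

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Make read()/write() on fd return immediately with EAGAIN instead of
 * blocking. Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Wait up to timeout_ms for any of fds[0..n-1] to become readable.
 * Sets ready[i] to 1 for each readable descriptor and 0 otherwise.
 * Returns the number of ready descriptors, or -1 on error. */
int wait_readable(const int *fds, int n, int *ready, int timeout_ms)
{
    fd_set rset;
    struct timeval tv;
    int i, count, maxfd = -1;

    FD_ZERO(&rset);
    for (i = 0; i < n; i++) {
        assert(fds[i] >= 0 && fds[i] < FD_SETSIZE);  /* select() limit */
        FD_SET(fds[i], &rset);
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    count = select(maxfd + 1, &rset, NULL, NULL, &tv);
    if (count < 0)
        return -1;
    for (i = 0; i < n; i++)
        ready[i] = FD_ISSET(fds[i], &rset) ? 1 : 0;
    return count;
}
```

The main loop would call `wait_readable()` on the listening socket plus all client sockets, then accept() or read() only on the descriptors flagged ready.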

Event-Driven: Pros and Cons
• Advantages:
  • Very fast.
  • Sharing is inherent, since there’s only one process.
  • Don't even need locks as in thread models.
  • Can maximize concurrency in request stream easily.
  • No context-switch costs or extra memory consumption.
  • Complete control over scheduling decisions.
• Disadvantages:
  • Less robust. Failure can halt whole server.
  • Pushes per-process resource limits (like file descriptors).
  • Not every OS has full asynchronous I/O, so can still block on a file read. Flash uses helper processes to deal with this (AMPED architecture).

In-Kernel Model (Ex: Tux)
• Dedicated kernel thread for HTTP requests:
  • One option: put whole server in kernel.
  • More likely, just deal with static GET requests in kernel to capture majority of requests.
  • Punt dynamic requests to full-scale server in user space, such as Apache.
[Figure: protocol stacks for a user-space server (HTTP, SOCK, TCP, IP, ETH) and a kernel-space server (HTTP, TCP, IP, ETH), showing where the user/kernel boundary falls in each]

In-Kernel Model: Pros and Cons
• In-kernel event model:
  • Avoids transitions to user space, copies across u-k boundary, etc.
  • Leverages already existing asynchronous primitives in the kernel (kernel doesn't block on a file read, etc.)
• Advantages:
  • Extremely fast. Tight integration with kernel.
  • Small component without full server optimizes common case.
• Disadvantages:
  • Less robust. Bugs can crash whole machine, not just server.
  • Harder to debug and extend, since kernel programming required, which is not as well-known as sockets.
  • Similarly, harder to deploy. APIs are OS-specific (Linux, BSD, NT), whereas sockets & threads are (mostly) standardized.
  • HTTP evolving over time, have to modify kernel code in response.
