Deconstructing SPECweb99 Erich Nahum IBM T.J. Watson Research Center www.research.ibm.com/people/n/nahum nahum@us.ibm.com Erich Nahum
Talk Overview • Workload Generators • SPECweb99 • Methodology • Results • Summary and Conclusions Erich Nahum
Why Workload Generators? • Allows stress-testing and bug-finding • Gives us some idea of server capacity • Gives us a scientific process for comparing approaches • e.g., server models, gigabit adaptors, OS implementations • Assumption is that a difference in the testbed translates to some difference in the real world • Enables the performance debugging cycle (figure: find problem → reproduce → measure → fix and/or improve) Erich Nahum
How Does Workload Generation Work? • Many clients, one server • matches the asymmetry of the Internet • Server is populated with some kind of synthetic content • Simulated clients produce requests for the server • Master process controls clients and aggregates results • Goal is to measure the server • not the client or network • Must be robust to error conditions • e.g., if the server keeps sending 404 Not Found, will the clients notice? (figure: clients send requests, server sends responses) Erich Nahum
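A minimal sketch of this structure, assuming a simple closed-loop design: several client processes issue GET requests in a loop and report success/error counts to a master that aggregates them. The host name, request path, process count, and run length are placeholders, not parameters of any particular benchmark.

```python
# Minimal closed-loop workload-generator sketch (illustrative only):
# many client processes hammer one server; a master aggregates the counts.
import multiprocessing
import socket
import time

def client(host, port, duration, results):
    ok = errors = 0
    end = time.time() + duration
    while time.time() < end:
        try:
            s = socket.create_connection((host, port), timeout=5)
            s.sendall(b"GET /index.html HTTP/1.0\r\n\r\n")
            parts = s.recv(64).split(b" ")       # status line, e.g. "HTTP/1.0 200 OK"
            if len(parts) > 1 and parts[1] == b"200":
                ok += 1
            else:
                errors += 1                      # be robust: notice 404s etc.
            s.close()
        except OSError:
            errors += 1
    results.put((ok, errors))

if __name__ == "__main__":
    results = multiprocessing.Queue()            # master aggregates results here
    procs = [multiprocessing.Process(target=client,
                                     args=("server.example.com", 80, 30, results))
             for _ in range(8)]
    for p in procs:
        p.start()
    totals = [results.get() for _ in procs]      # one (ok, errors) tuple per client
    for p in procs:
        p.join()
    print("OK:", sum(t[0] for t in totals), "errors:", sum(t[1] for t in totals))
```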
Problems with Workload Generators • Only as good as our understanding of the traffic • Traffic may change over time • generators must too • May not be representative • e.g., are file size distributions from IBM.com similar to mine? • May be ignoring important factors • e.g., browser behavior, WAN conditions, modem connectivity • Still, useful for diagnosing and treating problems Erich Nahum
What Server Workload Generators Exist? • Many. In order of publication: • WebStone (SGI) • SPECweb96 (SPEC) • Scalable Client (Rice Univ.) • SURGE (Boston Univ.) • httperf (HP Labs) • SPECweb99 (SPEC) • TPC-W (TPC) • WaspClient (IBM) • WAGON (IBM) • Not to mention those for proxies (e.g. polygraph) • Focus of this talk: SPECweb99 Erich Nahum
Why SPECweb99? • Has become the de-facto standard used in Industry: • 141 submissions in 3 years on the SPEC web site • Hardware: Compaq, Dell, Fujitsu, HP, IBM, Sun • OS’es: AIX, HPUX, Linux, Solaris, Windows NT • Servers: Apache, IIS, Netscape, Tux, Zeus • Used within corporations for performance, testing, and marketing • E.g., within IBM, used by AIX, Linux, and 390 groups • Begs the question: how realistic is it? Erich Nahum
Server Workload Characterization • Over the years, many observations have been made about Web server behavior: • Request methods • Response codes • Document Popularity • Document Sizes • Transfer Sizes • Protocol use • Inter-arrival times How well does SPECweb99 capture these characteristics? Erich Nahum
History: SPECweb96 • SPEC: the Standard Performance Evaluation Corporation • Non-profit group with many benchmarks (CPU, FS) • Pay for membership, get source code • First attempt to be somewhat representative • Based on logs from NCSA, HP, Hal Computers • 4 classes of file sizes, with a Poisson distribution within each class Erich Nahum
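A sketch of this class-based size model: the class boundaries and access mix below are the commonly quoted SPECweb96 values and should be treated as assumptions here, and the uniform draw within a class is a stand-in for SPEC's actual per-class file set.

```python
# Illustrative SPECweb96-style file-size model: pick one of four size classes,
# then a size within it. Boundaries and mix are assumptions (commonly quoted
# values), and uniform-within-class is a simplification.
import random

CLASSES = [                                    # (low bytes, high bytes, weight)
    (100,               1 * 1024,       0.35),  # class 0: up to ~1 KB
    (1 * 1024,         10 * 1024,       0.50),  # class 1: 1-10 KB
    (10 * 1024,       100 * 1024,       0.14),  # class 2: 10-100 KB
    (100 * 1024,   1 * 1024 * 1024,     0.01),  # class 3: 100 KB - 1 MB
]

def sample_file_size():
    lo, hi, _ = random.choices(CLASSES, weights=[c[2] for c in CLASSES])[0]
    return random.randint(lo, hi)

print([sample_file_size() for _ in range(5)])
```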
SPECweb96 (cont) • Notion of scaling versus load: • number of directories in the data set doubles as expected throughput quadruples: directories = sqrt(throughput / 5) * 10 • requests spread evenly across all application directories • Process-based workload generator • Clients talk to master via RPCs • Does only GETs, no keep-alive www.spec.org/osg/web96 Erich Nahum
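The scaling rule quoted above is easy to check numerically; a small worked example (the target throughput values are arbitrary):

```python
# SPECweb96 scaling rule from the slide: directories = sqrt(throughput / 5) * 10,
# so quadrupling the target load only doubles the data-set size.
from math import sqrt

def specweb96_directories(target_ops_per_sec):
    return int(sqrt(target_ops_per_sec / 5) * 10)

print(specweb96_directories(500))    # 100 directories
print(specweb96_directories(2000))   # 200 directories: 4x the load, 2x the size
```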
Evolution: SPECweb99 • In response to people "gaming" the benchmark, now includes rules: • IP maximum segment lifetime (MSL) must be at least 60 seconds • Link-layer maximum transmission unit (MTU) must not be larger than 1460 bytes (Ethernet frame size) • Dynamic content may not be cached • not clear that this is followed • Servers must log requests • W3C common log format is sufficient but not mandatory • Resulting workload must be within 10% of the target • Error rate must be below 1% • Metric has changed: • now "number of simultaneous conforming connections": the rate of a connection must be greater than 320 Kbps Erich Nahum
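A tiny sketch of the conformance rule above: a connection counts as conforming only if its average transfer rate exceeds 320 Kbps. The function and constant names are illustrative, not SPEC's reporting format.

```python
# Conforming-connection check sketched from the rule above (illustrative names).
CONFORMANCE_BITS_PER_SEC = 320_000

def is_conforming(bytes_transferred, elapsed_seconds):
    return (bytes_transferred * 8) / elapsed_seconds > CONFORMANCE_BITS_PER_SEC

print(is_conforming(500_000, 10.0))   # 400 Kbps -> True (conforming)
print(is_conforming(200_000, 10.0))   # 160 Kbps -> False
```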
SPECweb99 (cont) • Directory size has changed: directories = (25 + (400000 / 122000) * simultaneous connections) / 5.0 • Improved HTTP 1.0/1.1 support: • Keep-alive requests (client closes after N requests) • Cookies • Back-end notion of user demographics • Used for ad rotation • Request includes user_id and last_ad • Request breakdown: • 70.00% static GET • 12.45% dynamic GET • 12.60% dynamic GET with custom ad rotation • 4.80% dynamic POST • 0.15% dynamic GET calling CGI code Erich Nahum
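A small sampler for the request mix listed above, using the slide's percentages; the category labels are shorthand, not SPEC terminology.

```python
# Sample a request type according to the SPECweb99 mix on the slide.
import random

REQUEST_MIX = [
    ("static GET",                    70.00),
    ("dynamic GET",                   12.45),
    ("dynamic GET with ad rotation",  12.60),
    ("dynamic POST",                   4.80),
    ("dynamic GET calling CGI code",   0.15),
]

def next_request_type():
    kinds, weights = zip(*REQUEST_MIX)
    return random.choices(kinds, weights=weights)[0]

print(next_request_type())
```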
SPECweb99 (cont) • Other breakdowns: • 30 % HTTP 1.0 with no keep-alive or persistence • 70 % HTTP 1.1 with keep-alive to "model" persistence • still has 4 classes of file size with Poisson distribution • supports Zipf popularity • Client implementation details: • Master-client communication uses sockets • Code includes sample Perl code for CGI • Client configurable to use threads or processes • Much more info on setup, debugging, tuning • All results posted to web page, • including configuration & back end code www.spec.org/osg/web99 Erich Nahum
Methodology • Take a log from a large-scale SPECweb99 run • Take a number of available server logs • For each characteristic discussed in the literature: • Show what SPECweb99 does • Compare to results from the literature • Compare to results from a set of sample server logs • Render judgment on how well SPECweb99 does Erich Nahum
Sample Logs for Illustration We’ll use statistics generated from these logs as examples. Erich Nahum
Talk Overview • Workload Generators • SPECweb99 • Methodology • Results • Summary and Conclusions Erich Nahum
Request Methods • AW96, AW00, PQ00, KR01: the majority are GETs, few POSTs • SPECweb99: no HEAD requests, too many POSTs Erich Nahum
Response Codes • AW96, AW00, PQ00, KR01: Most are 200s, next 304’s • SPECweb99 doesn’t capture anything but 200 OK Erich Nahum
Resource Popularity • p(r) = C / r^alpha (alpha = 1 is true Zipf; other values are "Zipf-like") • Consistent with CBC95, AW96, CB96, PQ00, KR01 • SPECweb99 does a good job here with alpha = 1 Erich Nahum
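For reference, a minimal sketch of sampling documents from the Zipf-like popularity law above; the catalog size is arbitrary.

```python
# Zipf-like popularity, p(r) = C / r**alpha: rank 1 is the most popular file.
import random

def zipf_weights(num_files, alpha=1.0):
    raw = [1.0 / rank ** alpha for rank in range(1, num_files + 1)]
    c = 1.0 / sum(raw)                    # normalizing constant C
    return [c * w for w in raw]

weights = zipf_weights(10_000, alpha=1.0)            # alpha = 1: "true" Zipf
rank = random.choices(range(1, 10_001), weights=weights)[0]
print("request file of popularity rank", rank)
```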
Resource (File) Sizes • Lognormal body, consistent with results from AW96, CB96, KR01. • SPECweb99 curve is sparse, 4 distinct regions Erich Nahum
Tails of the File Size • AW96, CB96: sizes have Pareto tail; Downey01: Sizes are lognormal. • SPECweb99 tail only goes to 900 KB (vs 10 MB for others) Erich Nahum
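One way to reconcile the two observations above is a lognormal body with a Pareto tail; a sketch with made-up parameters (not fitted to any of the cited logs):

```python
# Hybrid file-size sampler: lognormal body, Pareto tail. All parameters are
# invented for illustration; they are not fitted to the logs discussed here.
import random

def sample_file_size(p_tail=0.01, mu=8.5, sigma=1.5,
                     tail_scale=100_000, tail_shape=1.2):
    if random.random() < p_tail:
        return int(random.paretovariate(tail_shape) * tail_scale)  # heavy tail
    return int(random.lognormvariate(mu, sigma))                   # lognormal body

print(sorted(sample_file_size() for _ in range(10)))
```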
Response (Transfer) Sizes • Lognormal body, consistent with CBC95, AW96, CB96, KR01 • SPECweb99 doesn’t capture zero-byte transfers (304s) Erich Nahum
Transfer Sizes w/o 304’s • When 304’s removed, SPECweb99 much closer Erich Nahum
Tails of the Transfer Size • SPECweb99 tail is neither lognormal nor pareto • Again, max transfer is only 900 KB Erich Nahum
Inter-Arrival Times • Literature gives an exponential distribution for session arrivals • KR01: request inter-arrivals are Pareto • Here we look at request inter-arrivals Erich Nahum
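As a point of comparison, exponentially distributed inter-arrivals (a Poisson arrival process) are straightforward to generate; the rate below is an arbitrary example.

```python
# Generate request arrival times with exponential inter-arrival gaps
# (i.e., a Poisson arrival process), for contrast with Pareto inter-arrivals.
import random

def arrival_times(rate_per_sec, count):
    t, times = 0.0, []
    for _ in range(count):
        t += random.expovariate(rate_per_sec)   # exponential gap
        times.append(t)
    return times

print(arrival_times(100.0, 5))   # timestamps of the first five requests
```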
Tails of Inter-Arrival Times • SPECweb99 has a Pareto tail • Not all others do, but that may be due to truncation • (e.g. a log duration of only one day) Erich Nahum
HTTP Version • Over time, more and more requests are served using 1.1 • But SPECweb99 is much higher than any other log • Literature doesn’t look at this, so no judgments Erich Nahum
Summary and Conclusions • SPECweb99 has a mixed record depending on characteristic: • Methods: OK • Response codes: bad • Document popularity: good • File sizes: OK to bad • Transfer sizes: bad • Inter-arrival times: good • Main problems are: • Needs to capture conditional GETs with IMS for 304’s • Better file size distribution (smoother, larger) Erich Nahum
Future Work • Several possibilities for future work: • Compare logs with SURGE • More detail on HTTP 1.1 (requires better workload characterization, e.g. packet traces) • Dynamic content (e.g., TPC-W) (again, requires workload characterization) • The latter two will not be easy due to privacy and competitive concerns Erich Nahum
Probability • Graph shows 3 distributions with average = 2 • Note: average ≠ median in some cases! • Different distributions have different "weight" in the tail Erich Nahum
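The point can be checked numerically: three distributions that all have mean 2 can have quite different medians. The specific distributions and parameters below are chosen only for this demonstration (exponential with rate 1/2, uniform on [0, 4], Pareto with shape 3 and scale 4/3), not taken from the original graph.

```python
# Three distributions with mean 2 but different medians and tail weight.
import random
import statistics

def median_of(sampler, n=200_000):
    return statistics.median(sampler() for _ in range(n))

print(median_of(lambda: random.expovariate(1 / 2)))          # ~1.39
print(median_of(lambda: random.uniform(0, 4)))               # ~2.0
print(median_of(lambda: random.paretovariate(3) * (4 / 3)))  # ~1.68
```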
Important Distributions Some frequently-seen distributions: • Normal: • (mean mu, variance sigma^2) • Lognormal: • (x >= 0; sigma > 0) • Exponential: • (x >= 0) • Pareto: • (x >= k, shape a, scale k) Erich Nahum
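The original slide showed the density formulas as images, which are not reproduced above; these are the standard textbook forms for the parameterizations just listed.

```latex
% Standard probability density functions for the distributions named above.
\begin{align*}
\text{Normal:}      \quad & f(x) = \frac{1}{\sigma\sqrt{2\pi}}\,
                             e^{-(x-\mu)^2 / (2\sigma^2)} \\
\text{Lognormal:}   \quad & f(x) = \frac{1}{x\,\sigma\sqrt{2\pi}}\,
                             e^{-(\ln x-\mu)^2 / (2\sigma^2)}, \quad x \ge 0 \\
\text{Exponential:} \quad & f(x) = \lambda e^{-\lambda x}, \quad x \ge 0 \\
\text{Pareto:}      \quad & f(x) = a\,k^{a}\,x^{-(a+1)}, \quad x \ge k
\end{align*}
```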
Probability Refresher • Lots of variability in workloads • Use probability distributions to express it • Want to consider many factors • Some terminology/jargon: • Mean: average of the samples • Median: half are bigger, half are smaller • Percentiles: dump samples into N bins (the median is the 50th-percentile number) • Heavy-tailed: P[X > x] ~ x^(-alpha) as x -> infinity, with 0 < alpha < 2 Erich Nahum
Session Inter-Arrivals • Inter-arrival time between successive requests • "Think time" • difference between user requests vs. ALL requests • partly depends on definition of the boundary • CB96: variability across multiple timescales, "self-similarity"; average load very different from peak or heavy load • SCJO01: log-normal, 90% less than 1 minute • AW96: independent and exponentially distributed • KR01: session arrivals follow a Poisson distribution, but requests follow a Pareto with a = 1.5 Erich Nahum
Protocol Support • IBM.com 2001 logs: • Show roughly 53% of client requests are 1.1 • KA01 study: • 92% of servers claim to support 1.1 (as of Sep 00) • Only 31% actually do; most fail to comply with spec • SCJO01 show: • Avg 6.5 requests per persistent connection • 65% have 2 connections per page, rest more. • 40-50% of objects downloaded by persistent connections Appears that we are in the middle of a slow transition to 1.1 Erich Nahum
WebStone • The original workload generator, from SGI in 1995 • Process-based workload generator, implemented in C • Clients talk to master via sockets • Configurable: # client machines, # client processes, run time • Measured several metrics: avg + max connect time, response time, throughput rate (bits/sec), # pages, # files • 1.0 only does GETs; CGI support added in 2.0 • Static requests, 5 different file sizes www.mindcraft.com/webstone Erich Nahum
SURGE • Scalable URL Reference GEnerator • Barford & Crovella at Boston University CS Dept. • Much more worried about representativeness; captures: • server file size distributions • request size distribution • relative file popularity • embedded file references • temporal locality of reference • idle periods ("think times") of users • Process/thread-based workload generator Erich Nahum
SURGE (cont) • Notion of "user-equivalent": • statistical model of a user • active "off" time (between URLs) • inactive "off" time (between pages) • Captures various levels of burstiness • Not validated; shows that the load generated is different from SPECweb96 and has more burstiness in terms of CPU and # active connections www.cs.wisc.edu/~pb Erich Nahum
S-Client • Almost all workload generators are closed-loop: • client submits a request, waits for server, maybe thinks for some time, repeat as necessary • Problem with the closed-loop approach: • client can't generate requests faster than the server can respond • limits the generated load to the capacity of the server • in the real world, arrivals don’t depend on server state • i.e., real users have no idea about load on the server when they click on a site, although successive clicks may have this property • in particular, can't overload the server • s-client tries to be open-loop: • by generating connections at a particular rate • independent of server load/capacity Erich Nahum
S-Client (cont) • How is the s-client open-loop? • connects asynchronously at a particular rate • using a non-blocking connect() socket call • Did the connect complete within a particular time? • if yes, continue normally • if not, the socket is closed and a new connect initiated • Other details: • uses a single-address-space, event-driven model like Flash • calls select() on large numbers of file descriptors • can generate large loads • Problems: • client capacity is still limited by active FDs • an "arrival" is a TCP connect, not an HTTP request www.cs.rice.edu/CS/Systems/Web-measurement Erich Nahum
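A rough sketch of the open-loop idea described above: initiate connects at a fixed rate with non-blocking sockets, and abandon any connect that does not complete within a deadline, so the offered load does not depend on how fast the server responds. This is a simplified illustration, not the actual S-Client code (it omits the HTTP request itself and proper SO_ERROR checking on completed connects).

```python
# Open-loop connection generator in the spirit of S-Client (simplified sketch).
import errno
import select
import socket
import time

def open_loop_connects(host, port, rate_per_sec, duration, connect_timeout=0.5):
    pending = {}                                      # socket -> connect deadline
    completed = aborted = 0
    run_end = time.time() + duration
    while time.time() < run_end:
        s = socket.socket()
        s.setblocking(False)
        err = s.connect_ex((host, port))              # non-blocking connect
        if err in (0, errno.EINPROGRESS, errno.EWOULDBLOCK):
            pending[s] = time.time() + connect_timeout
        else:
            s.close()
            aborted += 1
        _, writable, _ = select.select([], list(pending), [], 0)
        for sock in writable:                         # connect finished
            completed += 1
            del pending[sock]
            sock.close()
        for sock, deadline in list(pending.items()):
            if time.time() > deadline:                # too slow: abandon connect
                aborted += 1
                del pending[sock]
                sock.close()
        time.sleep(1.0 / rate_per_sec)                # fixed arrival rate, open loop
    return completed, aborted
```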
TPC-W • From the Transaction Processing Performance Council (TPC) • Better known for database workloads like TPC-D • Metrics include dollars/transaction (unlike SPEC) • Provides a specification, not source code • Meant to capture a large e-commerce site • Models an online bookstore • web serving, searching, browsing, shopping carts • online transaction processing (OLTP) • decision support (DSS) • secure purchasing (SSL), best sellers, new products • customer registration, administrative updates • Has a notion of scaling per user • 5 MB of DB tables per user • 1 KB per shopping item, 25 KB per item in static images Erich Nahum
TPC-W (cont) • Remote browser emulator (RBE) • emulates a single user • sends HTTP request, parses response, waits ("think time"), repeats • Metrics: • WIPS: shopping • WIPSb: browsing • WIPSo: ordering • Setups tend to be very large: • multiple image servers, application servers, load balancer • DB back end (typically SMP) • Example: IBM 12-way SMP w/DB2, 9 PCs w/IIS: $1M www.tpc.org/tpcw Erich Nahum