Measurement, Modeling, and Analysis of the Internet: Part II

Presentation Transcript

  1. Measurement, Modeling, and Analysis of the Internet: Part II • Vishal Misra, Dept. of Computer Science, Columbia University in the City of New York

  2. Overview • Traffic Modeling • TCP Modeling and Congestion Control • Topology Modeling

  3. Part II.a: Traffic modeling

  4. Traffic Modeling • Early modeling efforts: legacy of telephony • Packet arrivals: Call arrivals (Poisson) • Exponential holding times • Big Bang in 1993 • “On the Self-Similar Nature of Ethernet Traffic”, Will E. Leland, Walter Willinger, Daniel V. Wilson, Murad S. Taqqu

  5. Self-Similarity in Traffic Measurement (II) [Figure: network traffic]

  6. Extract from abstract • “We demonstrate that Ethernet local area network (LAN) traffic is statistically self-similar, that none of the commonly used traffic models is able to capture this fractal behavior, that such behavior has serious implications for the design, control, and analysis of high-speed…” • That changed everything…

  7. Properties of Self-Similarity • $\mathrm{Var}(X^{(m)}) = \sigma^2 m^{-\beta}$ decreases more slowly than $m^{-1}$ • $r(k)$ decreases hyperbolically (not exponentially), so that $\sum_k r(k) = \infty$ (long range dependence) • The spectral density [discrete time Fourier transform of $r(k)$] satisfies $f(\lambda) \sim c\,\lambda^{-(1-\beta)}$ as $\lambda \to 0$ (not bounded)
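
The first property suggests a simple check, often called a variance-time plot: average the series over blocks of size m, compute the variance of the block means, and fit the slope of log variance against log m. The sketch below is an illustration only; the function names and the choice of block sizes are assumptions, and the test series is i.i.d. (so the estimate should land near β = 1, the short range dependent case) rather than a measured trace.

```python
import math
import random

def aggregated_variance(x, m):
    """Variance of the block means of x over non-overlapping blocks of size m."""
    n_blocks = len(x) // m
    means = [sum(x[i * m:(i + 1) * m]) / m for i in range(n_blocks)]
    mu = sum(means) / n_blocks
    return sum((v - mu) ** 2 for v in means) / n_blocks

def estimate_beta(x, block_sizes):
    """Least-squares slope of log Var(X^(m)) versus log m; returns beta = -slope."""
    pts = [(math.log(m), math.log(aggregated_variance(x, m))) for m in block_sizes]
    mx = sum(a for a, _ in pts) / len(pts)
    my = sum(b for _, b in pts) / len(pts)
    num = sum((a - mx) * (b - my) for a, b in pts)
    den = sum((a - mx) ** 2 for a, _ in pts)
    return -num / den

# Hypothetical input: i.i.d. exponential samples standing in for a traffic trace.
random.seed(1)
iid = [random.expovariate(1.0) for _ in range(100_000)]
beta_iid = estimate_beta(iid, [1, 2, 4, 8, 16, 32, 64, 128])
print(f"estimated beta for i.i.d. samples: {beta_iid:.2f}")
```

A self-similar trace would be expected to give a β noticeably below 1 under the same procedure.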

  8. What went wrong? What next? • Modelers realized the Calls->Packets mapping was inherently wrong • Self-similarity, or more accurately LRD, is evidenced by the burstiness of traffic • Explanations for LRD were sought and modeled • [LWWT] postulated heavy tails somewhere as the likely cause of LRD

  9. Explanations of LRD • Open loop models • Closed loop models • Mixed or structural models

  10. Open loop models

  11. Cox’s construction • Aggregate traffic is made up of many connections • Connections arrive at random • Each connection has a “size” (number of packets) • Each connection transmits packets at some “rate” • Heavy tailed distribution of size can cause LRD traffic

  12. M/G/∞ traffic model • Poisson customer arrivals • Heavy tailed service times • Pareto: typical distribution • Traffic corresponds to the number of busy servers
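
A minimal sketch of this construction, assuming Poisson session arrivals and Pareto holding times with minimum 1: the traffic sample at each integer time is the number of sessions still in service. The function name, the parameter values, and the empty-system start are assumptions made for illustration.

```python
import math
import random

def mg_infinity_trace(rate, alpha, horizon, seed=0):
    """Busy-server count at t = 0, 1, ..., horizon-1 for Poisson arrivals of
    intensity `rate` and Pareto(alpha) holding times (minimum holding time 1)."""
    rng = random.Random(seed)
    delta = [0] * (horizon + 1)          # difference array of busy-count changes
    t = rng.expovariate(rate)            # first arrival; the system starts empty
    while t < horizon:
        end = t + rng.paretovariate(alpha)
        first = math.ceil(t)                  # first integer time the session covers
        last = min(math.ceil(end), horizon)   # exclusive upper bound
        if first < last:
            delta[first] += 1
            delta[last] -= 1
        t += rng.expovariate(rate)
    busy, trace = 0, []
    for k in range(horizon):
        busy += delta[k]
        trace.append(busy)
    return trace

# Hypothetical parameters: alpha = 1.5 gives finite mean but infinite variance.
trace = mg_infinity_trace(rate=10.0, alpha=1.5, horizon=10_000)
tail_mean = sum(trace[5_000:]) / 5_000
print(f"mean busy servers over the second half: {tail_mean:.1f}")
```

With these parameters the long-run mean should sit near rate × E[holding time] = 10 × 3 = 30, while the trace itself stays bursty over long windows.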

  13. Where are the heavy tails though… • Construction provided generative model for traffic • Still didn’t explain where the heavy tails were coming from.. • …until 1997 • “Self-similarity in World Wide Web traffic. Evidence and possible causes.” Mark E. Crovella and Azer Bestavros. • Postulated that web file sizes follow Pareto distribution

  14. Crovella dataset

  15. Picture seemed complete.. • Generative model existed • Heavy tails were found • Performance analysts got to work • Simulations based on generative model • Analysis of multiplexers fed with traffic model • Grave predictions of buffer overflow sprang up • Conservative buffer dimensioning was advocated • …but real world systems performed much better

  16. Problems with open loop models • Upwards of 90% of network traffic is closed loop • Transmission of future packets depends on what happened to prior packets • Buffer overflows cause senders to back off/reduce rate, thereby affecting generation of packets • Open loop models ignored the network effects • Simulation/analysis results are misleading with open loop models

  17. Closed loop models

  18. Why is closed loop important? • Recall.. “Transmission of future packets depends on what happened to prior packets” • Suggests closed loop behavior induces correlations independently of file size distribution

  19. Chaos? • “The chaotic nature of TCP congestion control” A. Veres and M. Boda, Infocom 2000 (winner best paper award) • Paper simulated TCP sources sharing a link and observed chaotic dynamics

  20. Chaotic dynamics • Onset of “chaos” depended on B/N ratio (B = buffer size, N = number of flows)

  21. Chaos continued.. • Paper generated traffic, and preliminary analysis demonstrated presence of LRD • LRD completely determined by TCP, no role for variability of file sizes • Do the claims hold up?

  22. Verification of TCP induced LRD

  23. Another TCP based model • “On the Propagation of Long-Range Dependence in the Internet” A. Veres, Zs. Kenesi, S. Molnár, G. Vattay Sigcomm 2000 • Proposed the theory that TCP can get “infected” by long range dependence and then “spread” the infection

  24. Model • Let F* be an LRD flow, sharing a link C1 with a TCP flow T1 • Since TCP adapts to available capacity • T1 = C1 - F* • Implies T1 becomes LRD (linearity, and C1 is a constant) • Now T1 shares link C2 with TCP flow T2 • T2 = C2 - T1 • Since T1 has been established LRD, T2 now becomes LRD • And so on… • Model has too many technical flaws to point out..

  25. Combined (structural) models

  26. Recent (and not so) thoughts on traffic modeling • Observation: Internet protocol hierarchy is layered • Different layers act at different timescales • Layering can lead to multiple timescale (and hence LRD) behavior • Short time scale (multi-fractal) behavior can be quite different from long time scale (mono-fractal)

  27. From traces to traffic models • Implicit assumptions behind application modeling techniques: • Identify the application corresponding to a given flow recorded during a measurement period • Identify traffic generated by (instances) of the same application • Operation of the application-level protocol

  28. Example of web traffic modeling • Primary random variables: • Request sizes/Reply sizes • User think time • Persistent connection usage • Number of objects per persistent connection • Number of embedded images/page • Number of parallel connections • Consecutive documents per server • Number of servers per page

  29. Consider independent Markov on-off processes

  30. Spectrum Indistinguishable! [Figure: wavelet plot (PSD) of LRD vs Markovian; curves: LRD, Markovian On-Off, Product of 2 Markovian On-Off, Product of 3 Markovian On-Off]

  31. Relating layers to traffic generation • Session layer behavior • Application layer behavior • Transport layer behavior • Packet generated when all layers are “on”, i.e. resultant process is product of component layers
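
The product construction can be sketched as follows: each layer is an independent two-state Markov chain, and a packet is emitted in a slot only when every layer is on. The three (on-to-off, off-to-on) probability pairs below are hypothetical timescales chosen for illustration, not values from the talk.

```python
import random

def markov_on_off(n, p_on_to_off, p_off_to_on, rng):
    """Discrete-time two-state Markov chain; returns a list of 0/1 states."""
    state, out = 1, []
    for _ in range(n):
        out.append(state)
        flip = p_on_to_off if state == 1 else p_off_to_on
        if rng.random() < flip:
            state = 1 - state
    return out

def product_process(n, layers, seed=0):
    """Slot-wise product of independent on-off layers (1 only if all are on)."""
    rng = random.Random(seed)
    paths = [markov_on_off(n, a, b, rng) for a, b in layers]
    return [int(all(states)) for states in zip(*paths)]

# Hypothetical layers: session (slow), application (medium), transport (fast).
layers = [(0.001, 0.001), (0.01, 0.01), (0.1, 0.1)]
packets = product_process(50_000, layers)
print(f"fraction of slots carrying a packet: {sum(packets) / len(packets):.3f}")
```

Each layer alone is Markovian, yet the product mixes three well-separated timescales, which is the effect the spectrum comparison on the previous slide points at.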

  32. The thousand word picture

  33. Part II.b: Fluid modeling of TCP

  34. Outline • Background • Stochastic Fluid Model • Deterministic Fluid Models • Control theoretic analysis • Delay, stability • Some limiting fluid models

  35. TCP Congestion Control: window algorithm • Window: can send W packets at a time • increase window by one per RTT if no loss, W <- W+1 each RTT • decrease window by half on detection of loss W <- W/2

  36. TCP Congestion Control: window algorithm [Figure: sender and receiver with window W] • Window: can send W packets • increase window by one per RTT if no loss, W <- W+1 each RTT • decrease window by half on detection of loss W <- W/2

  37. TCP Congestion Control: window algorithm [Figure: sender and receiver with window W] • Window: can send W packets • increase window by one per RTT if no loss, W <- W+1 each RTT • decrease window by half on detection of loss W <- W/2
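
The window rule on these slides can be sketched per round trip as below. The independent per-RTT loss draw and the floor of one packet on the window are assumptions added to make the example run; they are not part of the slides.

```python
import random

def aimd_trace(rounds, loss_prob, w0=1.0, seed=0):
    """Window size at the start of each RTT under additive increase,
    multiplicative decrease. Loss is drawn independently per RTT (assumption)."""
    rng = random.Random(seed)
    w, trace = w0, []
    for _ in range(rounds):
        trace.append(w)
        if rng.random() < loss_prob:
            w = max(w / 2.0, 1.0)   # W <- W/2 on loss (floored at 1, an assumption)
        else:
            w += 1.0                # W <- W+1 each RTT without loss
    return trace

# Hypothetical loss probability of 5% per RTT.
windows = aimd_trace(rounds=200, loss_prob=0.05)
print(f"mean window over 200 RTTs: {sum(windows) / len(windows):.1f}")
```

Plotting `windows` against the round index gives the familiar sawtooth.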

  38. Background: • TCP throughput modeling: hot research topic in the late 90s • Earliest work by Teunis Ott (Bellcore) • Steady state analysis of TCP throughput using time rescaling • Padhye et al. (UMass, Sigcomm98) obtained accurate throughput formula for TCP • Formula validated with real Internet traces • Traces contained loss events

  39. Loss modeling • What do losses in a wide area experiment look like? • First guess: is the loss process Poisson? • Analyze traces: several independent experiments, duration 100 seconds each.

  40. Trace analysis • Loss inter-arrival events tested for: • Independence: Lewis and Robinson test for renewal hypothesis • Exponentiality: Anderson-Darling test
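
As an illustration of the exponentiality check only (the Lewis and Robinson test is not sketched here), the Anderson-Darling statistic against an exponential law with mean estimated from the data can be computed in a few lines. The function name, the small-sample factor (1 + 0.6/n), and the synthetic inputs are assumptions; the talk's traces are not reproduced. Standard tables place the 5% critical value for this case near 1.34, a figure worth checking against a reference before relying on it.

```python
import math
import random

def anderson_darling_expon(samples):
    """A^2 statistic for exponentiality with the mean estimated from the data,
    scaled by the small-sample factor (1 + 0.6/n)."""
    x = sorted(samples)
    n = len(x)
    mean = sum(x) / n
    eps = 1e-12  # clamp the fitted CDF so both logarithms stay finite
    cdf = [min(max(1.0 - math.exp(-v / mean), eps), 1.0 - eps) for v in x]
    s = sum((2 * i + 1) * (math.log(cdf[i]) + math.log(1.0 - cdf[n - 1 - i]))
            for i in range(n))
    return (-n - s / n) * (1.0 + 0.6 / n)

# Hypothetical loss inter-arrival times: one exponential set, one heavy tailed set.
rng = random.Random(7)
expo_gaps = [rng.expovariate(2.0) for _ in range(500)]
pareto_gaps = [rng.paretovariate(1.2) for _ in range(500)]
stat_expo = anderson_darling_expon(expo_gaps)
stat_pareto = anderson_darling_expon(pareto_gaps)
print(f"A^2 exponential sample: {stat_expo:.2f}")
print(f"A^2 Pareto sample:      {stat_pareto:.2f}")
```

A small statistic is consistent with exponential inter-arrivals; a large one argues against the Poisson guess.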

  41. Scatter plot of statistic

  42. Experiment 1

  43. Experiment 2

  44. Experiment 3

  45. Experiment 4

  46. SDE based model • Traditional, source centric loss model: sender sees loss probability $p_i$ • New, network centric loss model: loss indications arrive at the sender at rate $\lambda$ • New loss model proposed in “Stochastic Differential Equation Modeling and Analysis of TCP Window size behavior”, Misra et al., Performance 99 • Loss model enabled casting of TCP behavior as a Stochastic Differential Equation, roughly [equation not preserved in transcript]
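
One form of the stochastic differential equation that is consistent with the window algorithm on the earlier slides (increase by one per round trip, halve on each loss indication) is sketched below. The symbols are assumptions for illustration: $W(t)$ is the window size, $R$ the round trip time, and $N(t)$ a counting process of loss indications arriving at the sender.

```latex
dW(t) = \frac{dt}{R} \;-\; \frac{W(t)}{2}\, dN(t)
```

The first term is the additive increase spread over one round trip; the second removes half the window each time $N(t)$ jumps.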

  47. Refinement of SDE model • Window size is a function of loss rate ($\lambda$) and round trip time ($R$): $W(t) = f(\lambda, R)$ • Network is a (black box) source of $R$ and $\lambda$ • Solution: express $R$ and $\lambda$ as functions of $W$ (and $N$, number of flows)

  48. Active Queue Management: RED • RED: Random Early Detection, proposed in 1993 • Proactively mark/drop packets in a router queue probabilistically to • Prevent onset of congestion by reacting early • Remove synchronization between flows

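  49. The RED mechanism • RED: marking/dropping based on average queue length $x(t)$ (EWMA algorithm used for averaging) • $x(t)$: smoothed, time averaged $q(t)$ • [Figure: $q(t)$ and $x(t)$ versus time $t$] • [Figure: marking probability $p$ versus average queue length $x$; marks at $p_{max}$ and 1 on the $p$ axis and at $t_{min}$, $t_{max}$, $2t_{max}$ on the $x$ axis]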
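
A minimal sketch of the two pieces named on this slide, the EWMA average and the marking profile. The piecewise-linear shape (zero below t_min, rising to p_max at t_max, then rising to 1 at 2·t_max) is a reading of the axis marks on the figure and should be treated as an assumption, as should the function names and the averaging weight.

```python
def ewma(avg, sample, weight=0.002):
    """One EWMA update of the average queue length x from an instantaneous sample q.
    The default weight is an illustrative value, not one taken from the slides."""
    return (1.0 - weight) * avg + weight * sample

def red_mark_probability(x, t_min, t_max, p_max):
    """Marking probability as a function of average queue length x (assumed profile)."""
    if x < t_min:
        return 0.0
    if x < t_max:
        return p_max * (x - t_min) / (t_max - t_min)
    if x < 2 * t_max:
        return p_max + (1.0 - p_max) * (x - t_max) / t_max
    return 1.0

# Hypothetical thresholds in packets.
for x in (5, 20, 30, 45, 60):
    print(x, red_mark_probability(x, t_min=10, t_max=30, p_max=0.1))
```

Because marking acts on the smoothed x(t) rather than on q(t), short bursts pass through while sustained queue growth raises the marking probability.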

  50. Loss Model • [Figure: Sender, AQM Router, Receiver; labels $B(t)$, $p(t)$, Packet Drop/Mark, Round Trip Delay ($\tau$)] • Loss rate as seen by sender: $\lambda(t) = B(t-\tau)\,p(t-\tau)$ • $\lambda(t)\,dt = E[dN(t)]$ -> deterministic fluid model
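
How the loss model leads to a deterministic fluid model can be sketched as follows. The starting point is an assumption consistent with the window algorithm slides: window dynamics $dW = dt/R - (W/2)\,dN$, with $R$ the round trip time, $\tau$ the round trip delay, and $\lambda(t)$ the loss rate seen by the sender. Taking expectations and treating $W(t)$ and $dN(t)$ as approximately independent gives the first line; substituting the loss rate from this slide gives the second. The third line further assumes that $B(t)$ is the sender's throughput, roughly $W(t)/R$.

```latex
\begin{align*}
\frac{dW(t)}{dt} &= \frac{1}{R} - \frac{W(t)}{2}\,\lambda(t) \\
                 &= \frac{1}{R} - \frac{W(t)}{2}\,B(t-\tau)\,p(t-\tau) \\
                 &\approx \frac{1}{R} - \frac{W(t)\,W(t-\tau)}{2R}\,p(t-\tau)
\end{align*}
```

Here $W(t)$ stands for the expected window size, so the equation is an ordinary delay differential equation rather than a stochastic one.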