
Engineering for QoS and the limits of service differentiation






Presentation Transcript


  1. IWQoS June 2000 Engineering for QoS and the limits of service differentiation Jim Roberts (james.roberts@francetelecom.fr)

  2. The central role of QoS • feasible technology • quality of service: transparency, response time, accessibility • service model: resource sharing, priorities, ... • network engineering: provisioning, routing, ... • a viable business model

  3. Engineering for QoS: a probabilistic point of view • statistical characterization of traffic • notions of expected demand and random processes • for packets, bursts, flows, aggregates • QoS in statistical terms • transparency: Pr [packet loss], mean delay, Pr [delay > x], ... • response time: E [response time], ... • accessibility: Pr [blocking], ... • QoS engineering, based on a three-way relationship: demand ↔ performance ↔ capacity

  4. Outline • traffic characteristics • QoS engineering for streaming flows • QoS engineering for elastic traffic • service differentiation

  5. Internet traffic is self-similar • a self-similar process • variability at all time scales • due to: • infinite variance of flow size • TCP induced burstiness • a practical consequence: • difficult to characterize a traffic aggregate [figure: Ethernet traffic, Bellcore 1989]

  6. Traffic on a US backbone link (Thomson et al, 1997) • traffic intensity is predictable ... • ... and stationary in the busy hour

  7. Traffic on a French backbone link • traffic intensity is predictable ... • ... and stationary in the busy hour [figure: one week of traffic, Tuesday through Monday, with time-of-day labels]

  8. IP flows • a flow = one instance of a given application • a "continuous flow" of packets • basically two kinds of flow, streaming and elastic • streaming flows • audio and video, real time and playback • rate and duration are intrinsic characteristics • not rate adaptive (an assumption) • QoS ⇒ negligible loss, delay, jitter • elastic flows • digital documents (Web pages, files, ...) • rate and duration are measures of performance • QoS ⇒ adequate throughput (response time)

  9. Flow traffic characteristics • streaming flows • constant or variable rate • compressed audio (O(10³ bit/s)) • compressed video (O(10⁶ bit/s)) • highly variable duration • a Poisson flow arrival process (?) • elastic flows • infinite variance size distribution • rate adaptive • a Poisson flow arrival process (??) [figure: variable rate video trace]

  10. Modelling traffic demand • stream traffic demand • arrival rate × bit rate × duration • elastic traffic demand • arrival rate × size • a stationary process in the "busy hour" • eg, Poisson flow arrivals, independent flow sizes [figure: traffic demand in Mbit/s over the time of day, with a busy hour]

  11. Outline • traffic characteristics • QoS engineering for streaming flows • QoS engineering for elastic traffic • service differentiation

  12. Open loop control for streaming traffic • open loop control: a "traffic contract" • QoS guarantees rely on • traffic descriptors + admission control + policing • time scale decomposition for performance analysis • packet scale • burst scale • flow scale [figure: network with user-network and network-network interfaces]

  13. Packet scale: a superposition of constant rate flows • constant rate flows • packet size / inter-packet interval = flow rate • maximum packet size = MTU • buffer size for negligible overflow? • over all phase alignments ... • assuming independence between flows • worst case assumptions: • many low rate flows • MTU-sized packets • ⇒ buffer sizing for an M/D/1 queue with MTU-sized (deterministic) packets • Pr [queue > x] ≈ C·e^(−rx) [figure: log Pr [saturation] vs buffer size for the M/D/1 queue, for an increasing number of flows and increasing packet sizes]
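
The M/D/1 buffer sizing rule can be sketched numerically. Under the standard large-deviations result for the M/G/1 workload (an assumption not spelled out on the slide), the decay rate r in Pr [queue > x] ≈ C·e^(−rx) is the positive root of the Lundberg equation λ(e^(rD) − 1) = r, with λ the packet arrival rate and D the (MTU) service time; the function names below are illustrative, and the prefactor C is ignored, which keeps the sizing conservative only up to an O(1) constant:

```python
import math

def md1_decay_rate(load, service_time=1.0):
    """Asymptotic decay rate r of the M/D/1 queue tail,
    Pr[queue > x] ~ C * exp(-r * x), for load < 1: the positive
    root of lam * (exp(r * D) - 1) = r, with lam = load / D."""
    lam = load / service_time
    # f(r) = lam*(e^{rD} - 1) - r has f(0) = 0, f'(0) = load - 1 < 0,
    # and grows without bound, so a unique positive root exists.
    lo, hi = 1e-12, 1.0
    while lam * math.expm1(hi * service_time) - hi < 0:
        hi *= 2.0  # expand until f(hi) > 0
    for _ in range(200):  # bisection
        mid = (lo + hi) / 2.0
        if lam * math.expm1(mid * service_time) - mid < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def buffer_for_overflow(load, eps, service_time=1.0):
    """Buffer size (in MTU service-time units) for an overflow
    probability of about eps, ignoring the prefactor."""
    return -math.log(eps) / md1_decay_rate(load, service_time)
```

As the load approaches 1 the decay rate shrinks and the required buffer grows, matching the curves sketched on the slide.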

  14. The "negligible jitter conjecture" • constant rate flows acquire jitter • notably in multiplexer queues • conjecture: • if all flows are initially CBR and, in all queues, Σ flow rates < service rate • they never acquire sufficient jitter to become worse for performance than a Poisson stream of MTU packets • M/D/1 buffer sizing remains conservative

  15. Burst scale: fluid queueing models • assume flows have an instantaneous rate • eg, the rate of on/off sources • bufferless multiplexing: Pr [arrival rate > service rate] < ε • buffered multiplexing: E [arrival rate] < service rate [figure: arrival rate process at the packet and burst time scales]

  16. Buffered multiplexing performance: impact of burst parameters [figure: log Pr [saturation] vs buffer size; the intercept at buffer size 0 is Pr [rate overload]]

  17. Buffered multiplexing performance: impact of burst parameters [figure: log Pr [saturation] vs buffer size; the tail decays more slowly for longer bursts, faster for shorter bursts]

  18. Buffered multiplexing performance: impact of burst parameters [figure: log Pr [saturation] vs buffer size; the tail decays more slowly for more variable burst lengths, faster for less variable ones]

  19. Buffered multiplexing performance: impact of burst parameters [figure: log Pr [saturation] vs buffer size; long range dependent burst lengths give a heavier tail than short range dependence]

  20. Choice of token bucket parameters? • the token bucket is a virtual queue • service rate r • buffer size b • non-conformance depends on • burst size and variability • and long range dependence • a difficult choice for conformance: • r >> mean rate ... • ... or b very large [figure: non-conformance probability vs bucket size b, for token rate r]

  21. Bufferless multiplexing, alias "rate envelope multiplexing" • provisioning and admission control to ensure Pr [Λt > C] < ε • performance depends only on the stationary rate distribution • loss rate ≈ E [(Λt − C)⁺] / E [Λt] • insensitivity to self-similarity [figure: combined input rate Λt fluctuating over time around the output rate C]
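
The rate envelope loss estimate can be illustrated for the simplest case of n independent on/off sources, where the combined rate Λ is a scaled binomial; a minimal sketch (the homogeneous on/off model and the function name are illustrative assumptions, not from the talk):

```python
from math import comb

def loss_rate(n, peak, p, capacity):
    """Bufferless loss estimate E[(L - C)+] / E[L] for n independent
    on/off sources of peak rate `peak`, each active with probability p,
    so the combined rate L is peak * Binomial(n, p)."""
    mean_rate = n * p * peak
    # expected excess rate, summing over active-source counts k
    # whose combined rate k * peak exceeds the capacity C
    excess = sum(
        comb(n, k) * p**k * (1 - p)**(n - k) * (k * peak - capacity)
        for k in range(n + 1)
        if k * peak > capacity
    )
    return excess / mean_rate
```

With peak rate well below the link rate the loss rate falls off very quickly as capacity grows past the mean rate, which is the efficiency argument of the next slide.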

  22. Efficiency of bufferless multiplexing • small amplitude of rate variations ... • peak rate << link rate (eg, 1%) • ... or low utilisation • overall mean rate << link rate • we may have both in an integrated network • priority to streaming traffic • residue shared by elastic flows

  23. Flow scale: admission control • accept new flow only if transparency preserved • given flow traffic descriptor • current link status • no satisfactory solution for buffered multiplexing • (we do not consider deterministic guarantees) • unpredictable statistical performance • measurement-based control for bufferless multiplexing • given flow peak rate • current measured rate (instantaneous rate, mean, variance,...) • uncritical decision threshold if streaming traffic is light • in an integrated network

  24. Provisioning for negligible blocking • "classical" teletraffic theory; assume M/M/m/m • Poisson arrivals, rate λ • constant rate per flow r • mean duration 1/μ • mean demand, A = (λ/μ)·r bit/s • blocking probability for capacity C • B = E(C/r, A/r) • E(m,a) is Erlang's formula: • E(m,a) = (aᵐ/m!) / Σ_{k=0..m} aᵏ/k! • m = C/r, a = A/r • scale economies • generalizations exist: • for different rates • for variable rates [figure: utilization ρ = a/m achieved for E(m,a) = 0.01, rising towards 1 as m grows from 0 to 100]
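
Erlang's formula is normally evaluated with the standard stable recursion E(k,a) = a·E(k−1,a) / (k + a·E(k−1,a)) rather than the ratio of large factorials; a minimal sketch:

```python
def erlang_b(m, a):
    """Erlang's blocking formula E(m, a) = (a^m/m!) / sum_{k=0}^m a^k/k!,
    computed with the stable recursion to avoid huge factorials."""
    e = 1.0  # E(0, a) = 1
    for k in range(1, m + 1):
        e = a * e / (k + a * e)
    return e
```

The scale economies of the slide's figure show up directly: at the same offered load per circuit, a larger system blocks far less, eg E(100, 80) << E(10, 8).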

  25. Outline • traffic characteristics • QoS engineering for streaming flows • QoS engineering for elastic traffic • service differentiation

  26. Closed loop control for elastic traffic • reactive control • end-to-end protocols (eg, TCP) • queue management • time scale decomposition for performance analysis • packet scale • flow scale

  27. Packet scale: bandwidth and loss rate • a multi-fractal arrival process • but loss and bandwidth are related by TCP congestion avoidance (cf. Padhye et al.): B(p) ~ p^(−1/2) • thus p = B⁻¹(b): ie, the loss rate depends on the bandwidth share b [figure: congestion avoidance throughput B(p) as a function of the loss rate p]
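
The square-root relation and its inverse can be sketched with the simplified Mathis-style formula B(p) ≈ (MSS/RTT)·√(3/(2p)); the full Padhye et al. model cited on the slide also accounts for timeouts, so the constants here are only indicative:

```python
from math import sqrt

def tcp_throughput(p, mss=1460, rtt=0.1):
    """Square-root law: congestion-avoidance throughput in bytes/s,
    B(p) ~ (MSS/RTT) * sqrt(3/(2p)), for loss rate p."""
    return (mss / rtt) * sqrt(3.0 / (2.0 * p))

def loss_for_throughput(b, mss=1460, rtt=0.1):
    """Inverse p = B^{-1}(b): the loss rate implied by a
    bandwidth share of b bytes/s."""
    return 1.5 * (mss / (rtt * b)) ** 2
```

This makes the slide's point concrete: once TCP controls the flows, the loss rate is not an independent parameter but a function of the bandwidth share each flow obtains.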

  28. Packet scale: bandwidth sharing • reactive control (TCP, scheduling) shares bottleneck bandwidth unequally • depending on RTT, protocol implementation, etc. • and differentiated services parameters • optimal sharing in a network: objectives and algorithms ... • max-min fairness, proportional fairness, max utility, ... • ... but response time depends more on the traffic process than on the static sharing algorithm! [figure: example, a linear network with routes 0, 1, ..., L]

  29. Flow scale: performance of a bottleneck link • assume perfect fair shares • link rate C, n elastic flows ⇒ each flow served at rate C/n • assume Poisson flow arrivals • fair shares ⇒ an M/G/1 processor sharing queue • load ρ = arrival rate × mean size / C • performance insensitive to the size distribution • Pr [n transfers] = ρⁿ(1−ρ) • E [response time] = size / C(1−ρ) • instability if ρ > 1 • ie, unbounded response time • stabilized by aborted transfers ... • ... or by admission control [figure: per-flow throughput falling from C towards 0 as the load ρ goes from 0 to 1]
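
The processor sharing results above translate directly into code; a minimal sketch (function names are illustrative):

```python
def ps_occupancy_dist(rho, nmax=50):
    """M/G/1 processor sharing occupancy for load rho < 1:
    Pr[n transfers in progress] = rho^n * (1 - rho), a geometric
    distribution, insensitive to the flow size distribution."""
    return [(1 - rho) * rho**n for n in range(nmax + 1)]

def ps_response_time(size, capacity, rho):
    """E[response time] = size / (C * (1 - rho)); unbounded
    (unstable) when rho >= 1."""
    if rho >= 1:
        return float("inf")
    return size / (capacity * (1 - rho))
```

For example, at 50% load a transfer takes on average only twice its unloaded transmission time, but the expectation diverges as ρ approaches 1, which is the instability the slide stabilizes by admission control.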

  30. Generalizations of the PS model • non-Poisson arrivals • Poisson sessions • Bernoulli feedback • discriminatory processor sharing • weight φi for class i flows • service rate ∝ φi • rate limitations (same for all flows) • maximum rate per flow (eg, the access rate) • minimum rate per flow (by admission control) [figure: Poisson session arrivals feeding a processor sharing transfer stage and an infinite server think-time stage, with feedback probability p and departures with probability 1−p]

  31. Admission control can be useful ... to prevent disasters at sea!

  32. Admission control can also be useful for IP flows • improve the efficiency of TCP • reduce retransmission overhead ... • ... by maintaining throughput • prevent instability • due to overload (ρ > 1) ... • ... and retransmissions • avoid aborted transfers • user impatience • "broken connections" • a means for service differentiation ...

  33. Choosing an admission control threshold • N = the maximum number of flows admitted • negligible blocking when ρ < 1, maintained quality when ρ > 1 • an M/G/1/N processor sharing system • minimum bandwidth = C/N • Pr [blocking] = ρᴺ(1−ρ)/(1−ρᴺ⁺¹) ≈ (1 − 1/ρ), for ρ > 1 • an uncritical choice of threshold • eg, 1% of link capacity (N = 100) [figure: blocking probability and E [response time]/size vs N, for ρ = 0.9 and ρ = 1.5]
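
The threshold formula is easy to evaluate, and doing so shows why the choice of N is uncritical: blocking is negligible in underload and close to the fluid limit 1 − 1/ρ in overload, almost independently of N near 100. A minimal sketch:

```python
def ps_blocking(rho, n):
    """Blocking probability of an M/G/1/N processor sharing link:
    Pr[blocking] = rho^N (1 - rho) / (1 - rho^(N+1)).
    Tends to 1 - 1/rho for rho > 1 and large N."""
    if rho == 1.0:
        return 1.0 / (n + 1)  # limit of the formula as rho -> 1
    return rho**n * (1 - rho) / (1 - rho**(n + 1))
```

At ρ = 0.9 and N = 100 blocking is of order 10⁻⁶, while at ρ = 1.5 it is essentially 1 − 1/1.5 = 1/3 whatever the exact threshold.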

  34. Impact of access rate on backbone sharing • TCP throughput is limited by the access rate ... • modem, DSL, cable • ... and by server performance • ⇒ the backbone link is a bottleneck only if saturated! • ie, if ρ > 1 [figure: per-flow throughput on a backbone link of rate C fed by access links of rate << C, staying near the access rate until ρ approaches 1]

  35. Provisioning for negligible blocking for elastic flows • "elastic" teletraffic theory; assume M/G/1/m processor sharing • Poisson arrivals, rate λ • mean size s • utilization ρ = λs/C • m = admission control limit • blocking probability for capacity C: B(ρ,m) = ρᵐ(1−ρ)/(1−ρᵐ⁺¹) • impact of access rate • C / access rate = m • compare B(ρ,m) with Erlang's E(m, ρm) [figure: utilization ρ achieved for B = 0.01 vs m, together with the Erlang curve E(m, ρm)]
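
The comparison between elastic blocking and the Erlang curve can be tabulated directly; this sketch restates the two formulas (the loop values for m = C / access rate are illustrative):

```python
def erlang_b(m, a):
    """Erlang's formula E(m, a), via the stable recursion."""
    e = 1.0
    for k in range(1, m + 1):
        e = a * e / (k + a * e)
    return e

def ps_blocking(rho, m):
    """Elastic blocking B(rho, m) = rho^m (1 - rho) / (1 - rho^(m+1))."""
    if rho == 1.0:
        return 1.0 / (m + 1)
    return rho**m * (1 - rho) / (1 - rho**(m + 1))

# tabulate both curves at utilization rho = 0.8
for m in (10, 20, 50):
    print(m, ps_blocking(0.8, m), erlang_b(m, 0.8 * m))
```

Both blocking probabilities fall as m grows, ie as the access rate shrinks relative to the backbone capacity, which is the provisioning message of the slide.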

  36. Outline • traffic characteristics • QoS engineering for streaming flows • QoS engineering for elastic traffic • service differentiation

  37. Service differentiation • discriminating between streaming and elastic flows • transparency for streaming flows • response time for elastic flows • discriminating between streaming flows • different delay and loss requirements • ... or the best quality for all? • discriminating between elastic flows • different response time requirements • ... but how?

  38. Integrating streaming and elastic traffic • priority to packets of streaming flows • low utilization ⇒ negligible loss and delay • elastic flows use all remaining capacity • better response times • per flow fair queueing (?) • to prevent overload • flow based admission control ... • ... and adaptive routing • an identical admission criterion for streaming and elastic flows: available rate > R

  39. Different accessibility • block class 1 when N1 = 100 flows are in progress; block class 2 when N2 flows are in progress • class 1 gets higher priority than class 2 [figure: blocking probability vs N, for ρ = 0.9 and ρ = 1.5]

  40. Different accessibility • block class 1 when N1 = 100 flows are in progress; block class 2 when N2 = 50 flows are in progress • in underload, both classes have negligible blocking (B1 ≈ B2 ≈ 0) • in overload, discrimination is effective • if ρ1 < 1 < ρ1 + ρ2: B1 ≈ 0, B2 ≈ (ρ1 + ρ2 − 1)/ρ2 • if 1 < ρ1, ρ2: B1 ≈ (ρ1 − 1)/ρ1, B2 ≈ 1 [figure: B1 and B2 vs N2 for ρ1 = ρ2 = 0.4, 0.6 and 1.2; eg, B2 ≈ .33 for ρ1 = ρ2 = 0.6, and B1 ≈ .17, B2 ≈ 1 for ρ1 = ρ2 = 1.2]
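
The overload regimes above reduce to a small case analysis; a sketch (the function name is an assumption) that reproduces the figures on the slide, eg B2 ≈ .33 for ρ1 = ρ2 = 0.6 and B1 ≈ .17 for ρ1 = ρ2 = 1.2:

```python
def overload_blocking(rho1, rho2):
    """Asymptotic per-class blocking (B1, B2) when class 1 is
    admitted up to a higher threshold than class 2, so class 2
    absorbs the excess load first."""
    if rho1 + rho2 <= 1:
        return 0.0, 0.0  # underload: negligible blocking for both
    if rho1 < 1:
        # class 2 alone absorbs the excess load
        return 0.0, (rho1 + rho2 - 1) / rho2
    # both classes overloaded: class 2 is shut out entirely
    return (rho1 - 1) / rho1, 1.0
```

The case analysis makes the slide's point explicit: differentiation costs nothing in underload and only bites in overload, where the low-priority class pays for the excess.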

  41. Service differentiation and pricing • different QoS requires different prices... • or users will always choose the best • ...but streaming and elastic applications are qualitatively different • choose streaming class for transparency • choose elastic class for throughput •  no need for streaming/elastic price differentiation ? • different prices exploit different "willingness to pay"... • bringing greater economic efficiency • ...but QoS is not stable or predictable • depends on route, time of day,.. • and on factors outside network control: access, server, other networks,... •  network QoS is not a sound basis for price discrimination

  42. Pricing to pay for the network • fix a price per byte • to cover the cost of infrastructure and operation • estimate demand at that price • provision the network to handle that demand • with excellent quality of service • optimal price ⇒ revenue = cost [figure: demand vs price per byte, and the capacity provisioned to match demand over the time of day]

  43. Outline • traffic characteristics • QoS engineering for streaming flows • QoS engineering for elastic traffic • service differentiation • conclusions

  44. Conclusions • a statistical characterization of demand • a stationary random process in the busy period • a flow level characterization (streaming and elastic flows) • transparency for streaming flows • rate envelope ("bufferless") multiplexing • the "negligible jitter conjecture" • response time for elastic flows • a "processor sharing" flow scale model • instability in overload (ie, E [demand] > capacity) • service differentiation • distinguish streaming and elastic classes • limited scope for within-class differentiation • flow admission control in case of overload
