Griffin Update: Towards an Agile, Predictive Infrastructure

Griffin Update: Towards an Agile, Predictive Infrastructure Anthony D. Joseph UC Berkeley http://www.cs.berkeley.edu/~adj/ Sahara Retreat, January 2003

Outline • Griffin • Motivation • Goals • Components • Tapas Update • Motivation • Data preconditioning-based network modeling • Model accuracy issues and validation • Domain analysis

Near-Continuous, Highly-Variable Internet Connectivity • Connectivity everywhere: campus, in-building, satellite… • Projects: Sahara (01-), Iceberg (98-01), Rover (95-97) • Most applications support limited variability (1% to 2x) • Design environment for legacy apps is static desktop LAN • Strong abstraction boundaries (APIs) hide the # of RPCs • But, today’s apps see a wider range of variability • 3 to 5 orders of magnitude of bandwidth, from 10's Kb/s to 1 Gb/s • 4 to 6 orders of magnitude of latency, from 1 µsec to 1,000's ms • 5 to 9 orders of magnitude of loss rates, from 10^-3 to 10^-12 BER • Neither best-effort nor unbounded retransmission may be ideal • Also, overloaded servers / limited resources on mobile devices • Result: Poor/variable performance from legacy apps

Griffin Goals • Users always see excellent (local, lightly loaded) application behavior and performance • Independent of the current infrastructure conditions • Move away from “reactive to change” model • Agility: key metric is time to react and adapt • Help legacy applications handle changing conditions • Analyze, classify, and predict behavior • Pre-stage dynamic/static code/data (activate on demand) • Architecture for developing new applications • Input/control mechanisms for new applications • Application developer tools • Leverage Sahara policies and control mechanisms

Griffin: An Adaptive, Predictive Approach • Continuous, cross-layer, multi-timescale introspection • Collect & cluster link, network, and application protocol events • Broader-scale: Correlate AND communicate short-/long-term events and effects at multiple levels (breaks abstractions) • Challenge: Building accurate models of correlated events • Convey app reqs/network info to/from lower-levels • Break abstraction boundaries in a controlled way • Challenge: Extensible interfaces to avoid existing least common denominator problems • Overlay more powerful network model on top of IP • Avoid standardization delays/inertia • Enables dynamic service placement • Challenge: Efficient interoperation with IP routing policies

Some Enabling Infrastructure Components • Tapas network characteristics toolkit • Measuring/modeling/emulating/predicting delay, loss, … • Provides Sahara with micro-scale network weather information • Mechanism for monitoring/predicting available QoS • REAP protocol-modifying / application-building toolkit • Introspective mobile code/data support for legacy / new apps • Provides dynamic placement of data and service components • MINO e-mail application on OceanStore / PlanetLab • Brocade, Mobile Tapestry, and Fault-Tolerant Tapestry • Overlay routing layer providing Sahara with efficient application-level object location and routing • Mobility support, fault-tolerance, varying delivery semantics

Outline • Griffin • Motivation • Goals • Components • Tapas Update • Motivation • Data preconditioning-based network modeling • Model accuracy issues and validation • Domain analysis

Tapas Motivation • Accurate modeling and emulation for protocol design • Very difficult to gain access to new or experimental networks • Delay, error, congestion in IP, GSM, GPRS, 1xRTT, 802.11a/b • Study interactions between protocols at different levels • Creating models/artificial traces that are statistically indistinguishable from traces from real networks • Such models have both predictive and descriptive power • Better understanding of network characteristics • Can be used to optimize new and existing protocols

Tapas • Novel data preconditioning-based analysis approach • More accurately models/emulates long-/short-term dependence effects than classic approaches (Gilbert, Markov, HMM, Bernoulli) • Analysis, simulation, modeling, prediction tools: • MultiTracer: Multi-layer trace collection and analysis (download) • Trace analysis and synthetic trace generator tools • Markov-based Trace Analysis, Modified hidden Markov Model • WSim: Wireless link simulator (currently trace-driven) • Simple feedback algorithm and API • Domain analysis tool: chooses most accurate model for a metric • Error-tolerant radio / link layer protocols: RLPLite, PPPLite • Collected >5,000 minutes of TCP, UDP, RLP traces in good/bad, stationary/mobile environments (download)

MultiTracer Measurement Testbed • Multi-layer trace collection • RLP, UDP/TCP, App • Easy trace collection • Rapid, graphical analysis [Figure: testbed diagram. Labels: Fixed Host and Mobile Host (both Unix BSDi 3.0), each with an Application / Packetization / RTP / Socket Interface / TCP/UDP (Lite) / IP / PPP/PPP Lite stack; GSM Base Station; GSM Network; PSTN; RLP / non RLP; 300 B/s; trace tools SocketDUMP, TCPdump, TCPstats, RLPDUMP; MultiTracer Plotting & Analysis]

Choosing the Right Network Model • Collect empirical packet trace: T = {1,0}* • 1: corrupted/delayed packet, 0: correct/non-delayed packet • Create mathematical models based on T • T may be non-stationary (statistics vary over time) • Classic models don’t always work well (can’t capture variations) • MTA, M3 – Trace data preconditioning algorithms • Decompose T into stationary sub-traces & model transitions • Stationary sub-traces can be modeled with high-order DTMC • Markov-based Trace Analysis (MTA) and Modified hidden Markov Model (M3) tools accurately model time varying links [Figure: real network metric trace → trace analysis algorithm → network model → artificial network metric trace]
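As a rough illustration of modeling a stationary 0/1 sub-trace with a high-order DTMC, the sketch below estimates the probability of a 1 given the previous `order` symbols and then samples an artificial trace. The function names, the fallback for unseen histories, and the fixed random seed are assumptions made for this example; they are not taken from the Tapas tools.

```python
import random
from collections import defaultdict

def fit_dtmc(trace, order=2):
    """Estimate P(next = 1 | previous `order` symbols) from a 0/1 trace (order >= 1)."""
    counts = defaultdict(lambda: [0, 0])  # history -> [times next was 0, times next was 1]
    for i in range(order, len(trace)):
        history = tuple(trace[i - order:i])
        counts[history][trace[i]] += 1
    return {h: c[1] / (c[0] + c[1]) for h, c in counts.items()}

def generate(model, seed_history, length, rng=None):
    """Sample an artificial 0/1 trace from the fitted model.

    Histories never seen during fitting emit 0 (an assumption of this sketch).
    """
    rng = rng or random.Random(0)
    history = list(seed_history)
    out = []
    for _ in range(length):
        p_one = model.get(tuple(history), 0.0)
        symbol = 1 if rng.random() < p_one else 0
        out.append(symbol)
        history = history[1:] + [symbol]  # slide the history window forward
    return out
```

A higher `order` captures longer-range dependence but needs more trace data to estimate each history's probability reliably.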

Creating Stationarity in Traces • Our idea for MTA and M3: decompose T into stationary sub-traces • Bad sub-traces B1..n = 1{1,0}*0^C, Good sub-traces G1..n = 0* • C is a change-of-state constant: mean + std dev of the lengths of the 1* runs • MTA: Model B with a DTMC, model state lengths with exponential distribution, and compute transitions between states • M3: Similar, but models multiple states, using an HMM to transition between them; B is again modeled with a DTMC [Figure: an error trace (… 10001110011100…0 0000…0000 11001100…00 00000…000 …) split into alternating bad and good sub-traces, each bad sub-trace ending in C zeros; the bad sub-traces are concatenated into a bad trace (… 10001110011100…0 11001100…00 …) and the good sub-traces into a good trace (… 0000…0000 00000…000 …)]
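The decomposition on this slide can be sketched as follows: compute C from the lengths of the runs of 1s, then cut the trace so that each bad sub-trace starts at a 1 and ends after C consecutive 0s, with everything else going to good sub-traces. Rounding C up to an integer, using the population standard deviation, and the function names are assumptions of this sketch rather than details stated in the slides.

```python
import math
from itertools import groupby
from statistics import mean, pstdev

def change_of_state_constant(trace):
    """C = mean + std dev of the lengths of runs of 1s, rounded up (assumption)."""
    bursts = [len(list(run)) for symbol, run in groupby(trace) if symbol == 1]
    if not bursts:
        return 1
    return max(1, math.ceil(mean(bursts) + pstdev(bursts)))

def decompose(trace, c):
    """Split a 0/1 trace into (good, bad) lists of sub-traces.

    A bad sub-trace starts at a 1 and ends once c consecutive 0s have been seen;
    good sub-traces are the all-zero stretches in between.
    """
    good, bad = [], []
    current, in_bad, zeros = [], False, 0
    for symbol in trace:
        if not in_bad:
            if symbol == 1:
                if current:
                    good.append(current)
                current, in_bad, zeros = [1], True, 0
            else:
                current.append(0)
        else:
            current.append(symbol)
            zeros = zeros + 1 if symbol == 0 else 0
            if zeros == c:
                bad.append(current)
                current, in_bad = [], False
    if current:  # flush whatever state the trace ended in
        (bad if in_bad else good).append(current)
    return good, bad
```

Concatenating the returned `bad` lists gives the bad trace to be modeled with a DTMC, and the lengths of the `good` and `bad` lists give the state-length samples.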

Issues in Modeling • Evaluating the accuracy of a particular model • How closely does it model a network characteristic? • How much trace data do we need to collect to accurately model a network characteristic? • How much work? • Can a model be used to accurately model a network scenario? • I.e., can we model a case like poor fixed indoor coverage and use the model to model conditions at a later time?

Evaluating Model Accuracy • Determine CDFs of burst lengths in Lossy and Error Free subtraces of a collected trace • Create Model • Use model to generate an artificial trace and determine CDFs of Lossy and Error Free subtraces • Calculate correlation coefficient (cc) between Lossy and Error-Free CDFs of collected and artificial traces • Observation: Accurate models have cc > 0.96
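A minimal sketch of the comparison step, under simplifying assumptions: burst lengths are taken as runs of a single symbol in the raw trace (rather than within separately decomposed lossy and error-free sub-traces), both empirical CDFs are evaluated on a shared grid of burst lengths, and the correlation coefficient is the Pearson coefficient of the two CDF vectors. The function names are hypothetical.

```python
from itertools import groupby
from math import sqrt

def burst_lengths(trace, symbol):
    """Lengths of consecutive runs of `symbol` in a 0/1 trace."""
    return [len(list(run)) for s, run in groupby(trace) if s == symbol]

def empirical_cdf(lengths, grid):
    """Fraction of bursts with length <= g, for each g in grid."""
    n = len(lengths)
    return [sum(1 for x in lengths if x <= g) / n for g in grid]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)  # undefined if either CDF vector is constant

def cdf_correlation(collected, artificial, symbol):
    """cc between the burst-length CDFs of a collected and an artificial trace."""
    a = burst_lengths(collected, symbol)
    b = burst_lengths(artificial, symbol)
    grid = range(1, max(a + b) + 1)
    return pearson(empirical_cdf(a, grid), empirical_cdf(b, grid))
```

Calling `cdf_correlation` once with `symbol=1` and once with `symbol=0` gives one value per burst type, each of which can be checked against the 0.96 threshold from the slide.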

Model Evaluation Methodology • What size collected trace is needed for accurate model? • Sub-divide trace into subtraces of length len/2^j • Compare cc values between subtraces and collected trace • Trace lengths > max(EF burst size) yield cc > 0.96 • How representative is a model? • Collect large trace, AB, and sub-divide into A and B • Create model from A • Use cc to compare model A with A, B, and AB • Representative models have all cc values > 0.96
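The bookkeeping behind this methodology can be sketched as below: split a trace into 2^j equal pieces, check whether a piece is longer than the largest error-free burst, and test a set of cc values against the 0.96 threshold. The helper names are hypothetical, and dropping any remainder symbols when the length is not divisible by 2^j is an assumption of this sketch.

```python
from itertools import groupby

def subdivide(trace, j):
    """Split a trace into 2**j consecutive subtraces of length len(trace) // 2**j.

    Trailing symbols that do not fill a whole subtrace are dropped (assumption).
    """
    size = len(trace) // 2 ** j
    if size == 0:
        raise ValueError("trace too short for this many subdivisions")
    return [trace[i * size:(i + 1) * size] for i in range(2 ** j)]

def max_error_free_burst(trace):
    """Length of the longest run of 0s in a 0/1 trace (0 if there is none)."""
    return max((len(list(run)) for s, run in groupby(trace) if s == 0), default=0)

def long_enough(subtrace, collected):
    """True if the subtrace is longer than the largest error-free burst."""
    return len(subtrace) > max_error_free_burst(collected)

def is_representative(cc_values, threshold=0.96):
    """True if every cc value (e.g. model A vs. A, B, and AB) exceeds the threshold."""
    return all(cc > threshold for cc in cc_values)
```

With A and B taken as `subdivide(AB, 1)`, a model built from A would be compared against A, B, and AB, and the three resulting cc values passed to `is_representative`.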

Challenge: Domain Analysis • Which model to use? • Gilbert, HMM, MTA, M3 have different properties • Algorithm (applied to Gilbert, HMM, MTA, M3): • Collect traces, compute exponential functions for lengths of good and bad state and compute 1’s density of bad state • For a given density, determine model parameters and optimal model (best cc) • Experiment: • Apply to artificial network environment with varying bad state densities • Plot optimal model as a function of the good and bad state exponential values: Domain of Applicability Plot
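The parameter-extraction and selection steps of the algorithm above might look like the following sketch. It assumes the exponential fit is the maximum-likelihood rate (one over the mean state length), that the 1's density is the fraction of 1s across all bad sub-traces, and that each candidate model has already been scored with a cc value; the function names and the dictionary layout are hypothetical.

```python
def exponential_rate(lengths):
    """Maximum-likelihood rate of an exponential fitted to state lengths (1 / mean)."""
    return len(lengths) / sum(lengths)

def ones_density(bad_subtraces):
    """Fraction of 1s over all symbols in the bad-state sub-traces."""
    total = sum(len(b) for b in bad_subtraces)
    return sum(sum(b) for b in bad_subtraces) / total

def best_model(cc_by_model):
    """Name of the candidate model with the highest correlation coefficient."""
    return max(cc_by_model, key=cc_by_model.get)
```

Repeating `best_model` over a grid of good-state and bad-state exponential values, at a fixed 1's density, yields the points of a Domain of Applicability Plot.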

Domain of Applicability Plot, Lden = 0.2

Domain of Applicability Plot, Lden = 0.7

Griffin Summary • On-going Tapas work: • Sigmetrics 2003 submission on domain analysis • Trace collection: CDMA 1xRTT, GPRS, & IEEE 802.11a, PlanetLab IP • Release of WSim • Dissertation (Almudena Konrad) • Tapestry and MINO talks at retreat • In joint and ROC/OS sessions
