620 likes | 949 Vues
A Finger in the Wind: Forecasting Techniques for Capacity Planning. Linwood Merritt Bank of America May, 2005 lin.merritt@bankofamerica.com. Purpose of Presentation.
 
                
                E N D
A Finger in the Wind:Forecasting Techniques for Capacity Planning Linwood Merritt Bank of America May, 2005 lin.merritt@bankofamerica.com
Purpose of Presentation Robust Capacity Planning entails the analysis of data to project future demand. This analysis can be as simple as a linear trend of historical demand data, and can be as complex as multivariate regression using business drivers combined with analysis of changing business functionality. This presentation is intended as a “Forecasting 101” introduction to trending techniques, and discusses the use of Excel and SAS to perform trend analysis of computer resources and produce Capacity Planning forecasts.
Categories of Forecasting Approaches • Date-based trending • Linear • Exponential • Business driver trending • Single variable • Multiple variables
Comparison of Tools • Basic analysis of a relatively small amount of data: spreadsheet product such as Excel. • Automation of projections over a large amount of data and instances: programming language such as SAS.
Date-Based Trending • “Steering by your wake:” using the past to predict the future. • Linear: use of “Least Squares” algorithm; single answer to set of differential equations. • Exponential: “Least Squares” trending of exponents.
Types of Growth • Linear: straight line (same delta each interval) • Exponential: compound growth (same percentage each interval)
Linear Calculations • Linear: straight line (y = b + mx) Linear annual growth: take difference of projections one-year-apart. Linear monthly growth = (annual growth difference) / 12
Linear Growth Example • Jan 2004 = 100, Jan 2005 = 120 • Linear Annual Growth = 120-100 = 20 • Monthly Growth = 20/12 = 1.67 • Monthly projections = 100, 101.7, 103.3, … • Annual Projections = 100, 120, 140, …
Exponential Calculations • Exponential: compound growth (y = b * m x) log (y) = log b + x * log m y’ = b’ + m’ x Take linear trend of log(y), convert back with exp(y). Compound annual growth: take ratio of projections one-year-apart. Compound monthly growth = (compound annual growth) (1/12)
Exponential Growth Example • Jan 2004 = 100, Jan 2005 = 120 • Compound Annual Growth = 120/100 = 1.2 • Jan 2005 is 120% of Jan 2004, a 20% compound annual growth. • Monthly Growth = 1.2(1/12) = 1.015 • Monthly projections = 100, 101.5, 103.1, … • Annual Projections = 100, 120, 144, …
Linear Regression Basics • Formula: y = b + mx • x = independent variable (Date or Business Driver) • y = dependent variable (Result) • m = “slope” of straight line • Excel: “X Variable 1” • SAS: date or business driver coefficient • b = Intercept • Excel (if date-based): # days since 12/31/1899 • SAS (if date-based): # days since 1/1/1960
Linear Date-Based Trending Using Excel “Trend” Function =TREND($B$2:$B$32,$A$2:$A$32,A2)
Excel Data Analysis Regression Tool =+$B$17+$B$18*B20 Intercept + (X Variable 1) * Month
Linear Date-Based Trending in SAS proc reg noprint data=cmg outest=reglin tableout; model hist = date / selection=rsquare; data proj; set reglin (in=r) cmg (in=c); retain il dl 0; if r then do; if hist = -1; il = intercept; dl = date; delete; end; projl = il + date*dl;
SAS “Proc Reg” Output Obs _MODEL_ _TYPE_ _DEPVAR_ _RMSE_ Intercept date hist _IN_ _P_ _EDF_ _RSQ_ 1 MODEL1 PARMS hist 33085.82 -2693937.76 201.897 -1 1 2 29 0.74648 Obs date hist projl 1 01JAN01 350000 329665.30 2 01FEB01 340000 335924.09 3 01MAR01 410000 341577.19 4 01APR01 320000 347835.99 37 01JAN04 . 550742.04 38 01FEB04 . 557000.84 39 01MAR04 . 562855.84 40 01APR04 . 569114.63 41 01MAY04 . 575171.53
Exponential Date-Based Trending Using Excel “Growth” Function =GROWTH($B$2:$B$32,$A$2:$A$32,A2)
Exponential Date-Based Trending Using Logarithms =EXP(TREND($C$2:$C$32,$A$2:$A$32,A2))
Exponential Date-Based Trending in SAS histl = log(hist); proc reg noprint data=cmg outest=regexp tableout; model histl = date / selection=rsquare; data proj; set regexp (in=r) cmg (in=c); retain il dl 0; if r then do; if hist = -1; ie = intercept; de = date; delete; end; proje = exp (ie + date*de);
SAS Exponential Regression Output Obs _MODEL_ _TYPE_ _DEPVAR_ _RMSE_ Intercept date histl _IN_ _P_ _EDF_ _RSQ_ 1 MODEL1 PARMS histl 0.081097 5.49908 .000482211 -1 1 2 29 0.73655 Obs date hist proje 1 01JAN01 350000 334592.56 2 01FEB01 340000 339631.79 3 01MAR01 410000 344248.56 4 01APR01 320000 349433.23 37 01JAN04 . 567326.04 38 01FEB04 . 575870.44 39 01MAR04 . 583980.04 40 01APR04 . 592775.26 41 01MAY04 . 601412.86
Footnote: SAS/Graph Linear Trending symbol1 i=rl color=blue v=w h=.5; proc gplot data=cmg; plot hist * date / regeqn;
Business Driver Based Forecasting • Linear “least squares” regression using business driver instead of date as independent variable • Formula: y = b + mx • x = independent variable (Business Driver) • y = dependent variable (Result) • m = “slope” of straight line • Excel: “X Variable 1” • SAS: business driver coefficient • b = Intercept
Business Driver Based Forecasting Steps • Map business drivers to servers. • Use historical data to correlate. • Use business driver projections to build forecast.
Single Business Driver Trending Using Excel “Trend” Function =TREND($B$2:$B$32,$C$2:$C$32,C2)
Single Business Driver Trending Using Excel Data Analysis =+Business1!$B$17+Business1!$B$18*H2 Intercept + (X Variable 1) * BusDriverA
Single Business Driver Trending in SAS proc reg noprint data=cmg outest=regbusa tableout ; model hist = busa / selection=rsquare; data proj; set regbusa (in=a) cmg (in=c); retain ia da 0; if a then do; if hist = -1; ia = intercept; da = busa; delete; end; proja = ia + busa*da;
SAS Business Driver Regression Output Obs _MODEL_ _TYPE_ _DEPVAR_ _RMSE_ Intercept busa hist _IN_ _P_ _EDF_ _RSQ_ 1 MODEL1 PARMS hist 33330.20 128231.62 59.0956 -1 1 2 29 0.74272 Obs date hist busa proja 1 01JAN01 350000 3500 335066.38 2 01FEB01 340000 3600 340975.94 3 01MAR01 410000 3700 346885.51 4 01APR01 320000 3900 358704.64 37 01JAN04 . 7900 595087.21 38 01FEB04 . 7800 589177.65 39 01MAR04 . 7900 595087.21 40 01APR04 . 8000 600996.78 41 01MAY04 . 8200 612815.91
Multiple Business Drivers • Linear “least squares” multivariate regression using multiple business drivers as independent variables • Formula: y = b + ma + nb + … • a, b, … = independent variables (Business Drivers) • y = dependent variable (Result) • m, n, … = coefficients of each Business Driver
Multiple Business Driver Trending Using Excel “Trend” Function =TREND($B$2:$B$32,$C$2:$D$32,C2:D2)
Multiple Business Drivers Using Excel Data Analysis Regression
Data Analysis Projections for Multiple Business Drivers = Business2!$B$17 +Business2!$B$18*'Regression Data'!H2 +Business2!$B$19*'Regression Data'!I2 Intercept + (X Variable 1) * BusDriverA + (X Variable 2) * BusDriverB
Multiple Business Driver Trending in SAS proc reg noprint data=cmg outest=regbusab tableout ; model hist = busa busb / selection=rsquare; data proj; set regbusab (in=b) cmg (in=c); retain ib da db 0; if b then do; if hist = -1 and _IN_ = 2; ib = intercept; da = busa; db = busb; delete; end; projb = ib + busa*da + busb*db;
SAS Multivariate Regression Output Obs _MODEL_ _TYPE_ _DEPVAR_ _RMSE_ Intercept busa busb hist _IN_ _P_ _EDF_ _RSQ_ 1 MODEL1 PARMS hist 33330.20 128231.62 59.0956 . -1 1 2 29 0.74272 2 MODEL1 PARMS hist 54624.94 205255.71 . 253.480 -1 1 2 29 0.30895 3 MODEL1 PARMS hist 33901.16 131950.59 60.1175 -10.300 -1 2 3 28 0.74301 Obs date hist busa busb projb 1 01JAN01 350000 3500 500 337211.58 2 01FEB01 340000 3600 600 342193.28 3 01MAR01 410000 3700 640 347793.01 4 01APR01 320000 3900 800 358168.43 37 01JAN04 . 7900 1000 596578.32 38 01FEB04 . 7800 1020 590360.56 39 01MAR04 . 7900 950 597093.34 40 01APR04 . 8000 990 602693.07 41 01MAY04 . 8200 980 614819.58
Negative Coefficient • A negative coefficient should send a warning flag: Intercept = 131950.59 X Variable 1 = 60.1175 X Variable 2 = -10.300 For 01JAN04, Projection = 131950.6 + 60.1*7900 - 10.3*1000 = 596578 If Business Driver B decreases, the projection increases: Projection = 131950.6 + 60.1*7900 - 10.3*500 = 601729 If the business driver increases enough, the projection may be negative!
Negative CoefficientsNonlinear regression of a linear equation (w/ bounds on coefficients) proc nlin noprint data=cmg outest=regnlin maxiter=10000 method=NEWTON; parms intercept=0 Ren1 = .00000003 Ren2 = .00000003; bounds 0<=Intercept,0<=Ren1, 0<=Ren2; model hist = intercept + busa*Ren1 + busb*Ren2; data proj; set regnlin (in=n) cmg (in=c); retain ni na nb 0; if n then do; if _TYPE_ =: 'FINAL'; ni = intercept; na = ren1; nb = ren2; delete; end; projn = ni + busa*na + busb*nb;
SAS Non-Linear Regression Output Obs _TYPE_ _STATUS_ _NAME_ _ITER_ _SSE_ intercept Ren1 Ren2 1 ITER 2 Iteration 0 5.6357E12 0.00 0.00 3E-8 2 ITER 2 Iteration 1 5.6357E12 0.00 0.00 1.081384E-8 3 ITER 2 Iteration 2 5.6357E12 0.00 0.00 4.818163E-9 4 ITER 2 Iteration 3 5.6357E12 0.00 0.00 1.070866E-9 5 ITER 2 Iteration 4 5.6357E12 0.00 0.00 4.8535E-10 6 ITER 2 Iteration 5 5.6357E12 0.00 0.00 1.19403E-10 7 ITER 2 Iteration 6 5.6357E12 0.00 0.00 5.04498E-12 8 ITER 2 Iteration 7 5.6357E12 0.00 0.00 -6.6429E-11 9 ITER 2 Iteration 8 5.6357E12 0.00 0.00 -8.8765E-11 10 ITER 2 Iteration 9 5.6357E12 0.00 0.00 -9.5745E-11 11 ITER 2 Iteration 10 5.6357E12 0.00 0.00 -9.7926E-11 12 ITER 2 Iteration 11 5.6357E12 0.00 0.00 -9.9289E-11 13 ITER 2 Iteration 12 5.6357E12 0.00 0.00 -9.9715E-11 14 ITER 2 Iteration 13 5.6357E12 0.00 0.00 -9.9981E-11 15 ITER 2 Iteration 14 5.6357E12 0.00 0.00 -9.9992E-11 16 ITER 2 Iteration 15 5.6357E12 0.00 0.00 -9.9998E-11 17 ITER 2 Iteration 16 5.6357E12 0.00 0.00 -9.9998E-11 18 ITER 2 Iteration 17 32216155095 128231.62 59.10 0 19 FINAL 0 Converged . 32216155095 128231.6259.100 20 COVB 0 Converged intercept . 32216155095 1063952717.10 -207093.13 0 21 COVB 0 Converged Ren1 . 32216155095 -207093.13 41.71 0 22 COVB 0 Converged Ren2 . 32216155095 0.00 0.00 0
SAS Non-Linear Regression Results Obs date hist busa busb projn 1 01JAN01 350000 3500 500 335066.38 2 01FEB01 340000 3600 600 340975.94 3 01MAR01 410000 3700 640 346885.51 4 01APR01 320000 3900 800 358704.64 37 01JAN04 . 7900 1000 595087.21 38 01FEB04 . 7800 1020 589177.65 39 01MAR04 . 7900 950 595087.21 40 01APR04 . 8000 990 600996.78 41 01MAY04 . 8200 980 612815.91
Business Driver Scenario Analysis • Example: Business initiative with incremental daily projections • Approach: • Create new monthly totals for affected business driver(s). • Run Business Driver based projections with new business driver forecast.
Business Driver Scenario Example Sum daily projections for each month New Business Driver projections (add Scenario to baseline, create new graphical projections) Create document with text and graphical analysis
Factoring Functionality Growth • Single business driver • Compare business driver growth with resource metric growth • Calculation: (Business Driver input) / (Business Driver Trended) * (Resource Trended)
Functionality Trending Using Growth Comparisons =+D2/E2*C2 BusA/(Trended BusA) * (Trended Hist)
Functionality Regression Using Date and Business Drivers =TREND($C$2:$C$32, $A$2:$B$32,A2:B2)
Using Growth Rates =IF($C3>0,+D3*$C3^(1/12),D3*$B3^(1/12))
Growth Rate Calculations (Bulk Capacity Planning) (#Years) = ( Log(Threshold) - Log(Base) ) / Log(1+AnnualGrowth) (#Years) = (UpgradeDate - BaseDate) / 365.25 UpgradeDate = 365.25 * (Log(Threshold)-Log(Base)) / Log(1+AnnualGrowth%) + BaseDate Examples: • Base Date = September 2003, Base Utilization = 60%, Threshold = 80%, Annual Growth =20% • Base Date = September 2003, Base Utilization = 50%, Threshold = 80%, Annual Growth =15% • Base Date = September 2003, Base Utilization = 40%, Threshold = 70%, Annual Growth =25%
Bulk Capacity Planning Example =365.25 * (LOG(D14)-LOG(C14)) / LOG(1+E14) + B14 365 * (LOG(Upgrade%)-LOG(Base CPU%))/LOG(1+CAGR%) + Base Date
Mainframe Workload Projections Mainframe Average Dayshift Projected MIPS by Month System Group AUG04 SEP04 OCT04 NOV04 DEC04 JAN05 CAGR Av/Pk _______ __________________________ ______ ______ ______ ______ ______ ______ ______ ______ Node 1 425.0 431.5 438.1 444.8 451.6 458.5 20.0% .7500 Application A 11.1 11.5 11.9 12.3 12.7 13.2 50.0% .4400 Application B 2.2 2.3 2.4 2.5 2.6 2.7 50.0% .8298 Application C 28.7 25.2 22.2 19.5 17.2 15.1 -78.2% .6461 Application D 14.1 14.3 14.6 14.8 15.1 15.3 22.1% .6258 Application E 1.4 1.0 0.7 0.5 0.4 0.3 -98.1% .6518 Application F 6.8 5.9 5.1 4.4 3.8 3.3 -81.7% .8257 Application G 15.0 15.5 16.0 16.6 17.1 17.7 50.0% .5037 Small 163.6 163.5 163.3 163.2 163.1 162.9 -1.0% .7539