Statistical Modeling in Stochastic Dynamic Programming for a Decision-Making Framework

Presentation Transcript

  Dr. Julia C. Tsai, Krannert School of Management, Purdue University, December 15, 2003

  2. Outline • Decision-Making Framework • Stochastic Dynamic Programming • Statistical Modeling within the DMF • Multivariate Adaptive Regression Splines • Parallel MARS • Flexible Implementations of MARS • DMF Results • Conclusions

  3. A Modular Decision-Making Framework For time period/level/stage t: x_t = state of system; u_t = decision/control

  4. Stochastic Dynamic Programming • Solves a decision problem that unfolds over multiple periods/levels/stages • Applications: • Inventory Forecasting: up to 9 dimensions (Chen 1999) • Airline Revenue: 31 flight legs (Chen, Günther, Johnson 2000) • Wastewater Treatment System: 20 dimensions (Tsai et al. 2002)

  5. Inventory Forecasting Modeled by Heath and Jackson (1991) using the Martingale Model of Forecast Evolution. Objective: Minimize inventory holding and backorder costs. Time Periods/Levels/Stages: Months, weeks. State x_t at the beginning of Stage t: Inventory levels and product forecasts. Decision u_t in Stage t: Amount ordered. Constraints: Capacities on order quantities. Random Variables: Errors in the forecasts. Transition: For inventory, x_{t+1} = x_t − demand + order quantity.
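
The inventory transition on this slide can be sketched in a few lines. This is only an illustration: the function name, the capacity argument, and the numbers are hypothetical, and the forecast part of the state is left out.

```python
def inventory_transition(inventory, demand, order, capacity):
    """Next-stage inventory: x_{t+1} = x_t - demand + order quantity.

    Negative inventory represents backorders. `capacity` is a hypothetical
    upper bound on the order quantity (the capacity constraint).
    """
    if not 0 <= order <= capacity:
        raise ValueError("order quantity must lie within [0, capacity]")
    return inventory - demand + order


# Hypothetical numbers: 20 units on hand, demand of 35, order of 10.
next_inventory = inventory_transition(20, 35, 10, capacity=50)
print(next_inventory)  # -5, i.e. 5 units are backordered
```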

  6. Airline Revenue Management Research with Ellis Johnson (Georgia Tech), Dirk Günther (Sabre), and Jay Rosenberger (UTA). Objective: Maximize revenue before a specified departure date. Time Periods/Levels/Stages: Weeks, days. State x_t at the beginning of Stage t: Remaining capacities on the flight legs in the network. Decision u_t in Stage t: Accept or reject a customer's airfare request for a specified origin-destination itinerary. Constraints: Capacities on flight legs. Random Variables: Customer demand. Transition: x_{t+1} = x_t − # seats sold in stage t.
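
A minimal sketch of the capacity constraint and the state transition for one request follows. It only checks feasibility and updates the remaining seats; the accept/reject policy itself would come from the SDP. The leg names and the function name are hypothetical.

```python
def sell_itinerary(remaining, itinerary_legs, seats=1):
    """Apply x_{t+1} = x_t - seats sold on every leg of the itinerary.

    `remaining` maps each flight leg to its remaining capacity.
    Returns the updated capacities, or None if some leg lacks capacity.
    """
    if any(remaining[leg] < seats for leg in itinerary_legs):
        return None  # the capacity constraint forces a rejection
    updated = dict(remaining)
    for leg in itinerary_legs:
        updated[leg] -= seats
    return updated


# Hypothetical two-leg network.
state = {"ATL-DFW": 2, "DFW-LAX": 1}
state_after = sell_itinerary(state, ["ATL-DFW", "DFW-LAX"])
second_try = sell_itinerary(state_after, ["DFW-LAX"])
print(state_after, second_try)
```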

  7. Wastewater Treatment System [1] • 11-level liquid line and 6-level solid line • At each level, select one of several unit processes to complete the treatment system • Objectives: • Evaluate various technologies in different levels • Identify which technologies should be explored more in the future • [1] Developed by Dr. Bruce Beck and Dr. Jining Chen

  8. State Variables: To measure the quality of water

  9. Technology Units Liquid Line:

  10. Solid Line:

  11. Objectives of the SDP: • Minimize: • Economic Cost (Capital & Operating) • Odor Emissions • Size of treatment system (land area or volume) • Or maximize: • Robustness against extreme conditions • Desirability of the global environment • Constraints: • 1. Cleanliness of the influent entering each level • 2. Stringent clean water targets exiting the final level of the system

  12. Stochastic Dynamic Programming (SDP) Objective: Minimize expected cost over T stages. Optimal Value Function F_t(x_t) in Stage t: Minimum expected cost to operate the system over stages t through T.

  13. Algorithm for Continuous-State SDP • Choose S discretization points in the state space. • In each stage t = T, …, 1: • At each discretization point x_j, j = 1, …, S: minimize the expected cost to obtain the value of F_t(x_j) • Approximate F_t(x) with a statistical model fitted to these values (Chen, Ruppert, Shoemaker 1999)
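
The backward recursion can be sketched on a toy one-dimensional inventory problem. Everything specific here is an assumption made for illustration: the grid, the costs, the demand distribution, and above all the use of linear interpolation (`np.interp`) as a stand-in for the statistical model of the future value function.

```python
import numpy as np

T = 3                                   # number of stages (hypothetical)
grid = np.linspace(-10.0, 10.0, 21)     # S = 21 discretization points x_j
orders = np.arange(0.0, 6.0)            # feasible decisions u: order 0..5 units
demands = np.array([1.0, 2.0, 3.0])     # possible demand values
probs = np.array([0.3, 0.4, 0.3])       # their probabilities


def stage_cost(x_next):
    """Holding cost 1 per unit in stock, backorder cost 4 per unit short."""
    return np.where(x_next >= 0, 1.0 * x_next, -4.0 * x_next)


def solve_sdp():
    # future[j] approximates F_{t+1} at grid[j]; beyond the last stage it is 0.
    future = np.zeros_like(grid)
    policy = []
    for t in range(T, 0, -1):                       # stages t = T, ..., 1
        values = np.empty_like(grid)
        best_u = np.empty_like(grid)
        for j, x in enumerate(grid):                # each discretization point
            expected = []
            for u in orders:
                x_next = np.clip(x - demands + u, grid[0], grid[-1])
                # np.interp stands in for the fitted model of F_{t+1}.
                cost = stage_cost(x_next) + np.interp(x_next, grid, future)
                expected.append(float(probs @ cost))
            k = int(np.argmin(expected))
            values[j], best_u[j] = expected[k], orders[k]
        future = values                             # data for the next fit
        policy.append(best_u)
    return future, policy[::-1]                     # policy[0] is stage 1


F1, policy = solve_sdp()
print(F1[10], policy[-1][10])  # value at x = 0 in stage 1, order at x = 0 in stage T
```

In a higher-dimensional problem the interpolation step is where a flexible regression model would be fitted to the pairs (x_j, F_t(x_j)).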

  14. Statistical Modeling Process [Flow diagram] SDP Period/Stage/Level t+1 → SDP Period/Stage/Level t: Experimental Design → State Vector Values → Optimization → Data for the Future Value Function → Statistical Model → Estimated Future Value Function → SDP Period/Stage/Level t-1

  15. Design of Experiments Each experimental run • sets each factor at a specific level • corresponds to a point in the n-dimensional space

  16. Design of Experiments Options • FF: Full factorial or complete grid designs • OA: Orthogonal array designs (Bose and Bush 1952, Chen 2001) • LH: Latin hypercube designs (McKay et al. 1979) • OA-LH: Hybrid (Tang 1993)

  17. Orthogonal Array Designs • OA Parameters: • n factors • strength d (d < n) • p levels • frequency λ • When projected down onto any d dimensions, it produces a FF grid of p^d points replicated λ times. • A LH design is equivalent to an OA of strength 1.
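
The projection property can be checked numerically on small designs. The sketch below builds a full factorial grid and a Latin hypercube on integer levels, then counts how often each level combination appears after projection. The function names are hypothetical, and the Latin hypercube is generated by simple random permutations rather than by any particular published construction.

```python
import itertools
import numpy as np


def full_factorial(n_factors, n_levels):
    """FF grid: all n_levels ** n_factors combinations of levels."""
    return np.array(list(itertools.product(range(n_levels), repeat=n_factors)))


def latin_hypercube(n_factors, n_points, seed=0):
    """LH design on integer levels: each column is a permutation of 0..n_points-1."""
    rng = np.random.default_rng(seed)
    return np.column_stack([rng.permutation(n_points) for _ in range(n_factors)])


def projection_counts(design, columns, n_levels):
    """Count how often each level combination occurs in the chosen columns."""
    counts = np.zeros((n_levels,) * len(columns), dtype=int)
    for row in design[:, columns]:
        counts[tuple(row)] += 1
    return counts


ff = full_factorial(n_factors=3, n_levels=3)    # 27 runs
lh = latin_hypercube(n_factors=3, n_points=9)   # 9 runs on 9 levels

# Projecting the FF design onto 2 of its factors gives a 3 x 3 grid
# in which every point is replicated the same number of times.
print(projection_counts(ff, [0, 2], 3))
# Projecting the LH design onto any single factor hits every level once,
# which is the strength-1 property.
print(projection_counts(lh, [1], 9))
```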

  18. Cubic Regression Splines Univariate cubic regression splines commonly have the form:
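
The formula from this slide is not reproduced in the transcript. As an illustration only, the sketch below uses one common textbook form, the truncated power basis 1, x, x^2, x^3, (x − k)_+^3 for each knot k, which may differ from the form shown in the talk. The knot locations and the test function are arbitrary.

```python
import numpy as np


def cubic_spline_basis(x, knots):
    """Truncated power basis: 1, x, x^2, x^3 and (x - k)_+^3 per knot k."""
    x = np.asarray(x, dtype=float)
    columns = [np.ones_like(x), x, x**2, x**3]
    columns += [np.maximum(x - k, 0.0) ** 3 for k in knots]
    return np.column_stack(columns)


# Fit the spline to an arbitrary smooth function by least squares.
x = np.linspace(0.0, 1.0, 50)
y = np.sin(2 * np.pi * x)
B = cubic_spline_basis(x, knots=[0.25, 0.5, 0.75])
coef, *_ = np.linalg.lstsq(B, y, rcond=None)
fitted = B @ coef
print(B.shape, float(np.max(np.abs(y - fitted))))
```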

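  19. Multivariate Adaptive Regression Splines [Diagram: basis-function tree. A split on variable v_a at knot k_a gives B1 (−) and B2 (+); a further split of B1 on variable v_b at knot k_b gives B3 (−) and B4 (+).] B1 = H[−(X_va − k_a)], B2 = H[+(X_va − k_a)], B3 = H[−(X_va − k_a)] H[−(X_vb − k_b)], B4 = H[−(X_va − k_a)] H[+(X_vb − k_b)]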
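
The four basis functions can be evaluated directly. The sketch assumes H is the truncated linear function H(z) = max(0, z), which the slide does not state; the data points and knot values are made up.

```python
import numpy as np


def H(z):
    """Truncated linear function, assumed here to be max(0, z)."""
    return np.maximum(z, 0.0)


def example_basis(X, va, ka, vb, kb):
    """Evaluate B1..B4 for variables va, vb and knots ka, kb."""
    B1 = H(-(X[:, va] - ka))
    B2 = H(+(X[:, va] - ka))
    B3 = B1 * H(-(X[:, vb] - kb))   # interaction: B1 times a new factor
    B4 = B1 * H(+(X[:, vb] - kb))
    return np.column_stack([B1, B2, B3, B4])


# Two hypothetical observations of two variables.
X = np.array([[0.2, 0.9],
              [0.7, 0.1]])
B = example_basis(X, va=0, ka=0.5, vb=1, kb=0.5)
print(B)
```

B3 and B4 are nonzero only where their parent B1 is nonzero, which is how the interaction terms stay local.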

  20. MARS Forward Stepwise • 1. Loop through potential new basis functions: • Select parent basis function m • Select variable v • Select knot k • 2. For each m, v, k: • Compute lack-of-fit • Compare to current best based on lack-of-fit • 3. For the best m, v, k: Create two new basis functions • 4. Continue searching for new basis functions until the stopping rule (e.g. Mmax) is met
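
The loop structure above can be sketched compactly. This is a simplified illustration, not the implementation used in the talk: lack-of-fit is taken to be the residual sum of squares of a least-squares fit, knots are tried at every observed data value, and the names `mars_forward`, `m_max`, and `max_order` are hypothetical.

```python
import numpy as np


def sse(B, y):
    """Lack-of-fit, assumed here to be the residual sum of squares."""
    coef, *_ = np.linalg.lstsq(B, y, rcond=None)
    resid = y - B @ coef
    return float(resid @ resid)


def mars_forward(X, y, m_max, max_order=2):
    n, p = X.shape
    columns = [np.ones(n)]        # start with the constant basis function
    used_vars = [frozenset()]     # variables already in each basis function
    terms = [()]                  # (variable, knot, sign) factors per function
    while len(columns) + 2 <= m_max:              # stopping rule: Mmax
        best = None
        for m, parent in enumerate(columns):      # parent basis function m
            if len(used_vars[m]) >= max_order:
                continue
            for v in range(p):                    # variable v
                if v in used_vars[m]:
                    continue
                for k in np.unique(X[parent > 0, v]):   # knot k
                    plus = parent * np.maximum(X[:, v] - k, 0.0)
                    minus = parent * np.maximum(k - X[:, v], 0.0)
                    lof = sse(np.column_stack(columns + [plus, minus]), y)
                    if best is None or lof < best[0]:
                        best = (lof, m, v, k, plus, minus)
        if best is None:
            break
        _, m, v, k, plus, minus = best
        columns += [plus, minus]                  # two new basis functions
        used_vars += [used_vars[m] | {v}] * 2
        terms += [terms[m] + ((v, k, +1),), terms[m] + ((v, k, -1),)]
    return np.column_stack(columns), terms


# Toy data with a single kink at x = 0.5.
X = np.linspace(0.0, 1.0, 21).reshape(-1, 1)
y = 1.0 + 2.0 * np.maximum(X[:, 0] - 0.5, 0.0)
B, terms = mars_forward(X, y, m_max=3)
print(terms[1], sse(B, y))
```

The triple loop over m, v, and k is the expensive part of the search, since each candidate requires its own fit.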

  21. Parallel MARS • Master-Slave paradigm • Software: MPI (Message-Passing Interface)

  22. Parallel MARS Algorithm [Flowchart] C0: Initialization/Data Processing → C1, C2, …, CP-1 → C0: Select the overall best knot and update b.f. → Meet Mmax? NO: continue the search; YES: STOP

  23. Parallel Performance Measure: • t_P: Time using Parallel MARS with P processors • t_1: Time using Parallel MARS with 1 processor • Speedup (S_P) = t_1 / t_P • Computing Facility: • Processor: 550 MHz Pentium III Xeon • Storage: 4 GB RAM, 18 GB SCSI disk • OS: RedHat Linux 7.1
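
The speedup measure is a simple ratio. The timings below are invented purely to show the calculation and are not results from the talk.

```python
def speedup(t1, tp):
    """Speedup S_P = t_1 / t_P for run times measured in the same units."""
    return t1 / tp


# Hypothetical run times in seconds, keyed by number of processors.
run_times = {1: 120.0, 2: 64.0, 4: 40.0}
for processors, seconds in run_times.items():
    print(processors, speedup(run_times[1], seconds))
```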

  24. Results Speedup vs. No. of Processors [ N = 289, K = 35]

  25. The Drawbacks of MARS • Mmax is difficult to select • Different SDP time periods may require different Mmax for a good approximation • Computational effort required to identify the best Mmax for each time period is impractical • Multiple basis functions can be “equivalently” good based on lack-of-fit • MARS is a greedy algorithm • Final approximation may involve more higher-order interaction terms than necessary

  26. ASR-MARS (Automatic Stopping Rule) • Uses R² and R²_a, the coefficient of determination and its adjusted version • ASR-I: Stop the MARS approximation search process when ΔR² < ε or ΔR²_a < ε, for a small threshold ε • ASR-II: Stop the MARS approximation search process when ΔR² / R² < ε or ΔR²_a / R²_a < ε
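
A stopping rule of this kind can be sketched as follows, assuming ASR-I compares the absolute change in R² (or adjusted R²) between successive iterations against a small threshold and ASR-II compares the relative change. Dividing by the current value in the relative rule is an assumption, as are the function names.

```python
def adjusted_r2(r2, n_obs, n_terms):
    """Adjusted R^2 for n_obs observations and n_terms non-constant terms."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_terms - 1)


def asr_stop(previous, current, threshold, relative=False):
    """Return True when the search should stop.

    relative=False: absolute-change rule (ASR-I style).
    relative=True:  relative-change rule (ASR-II style), dividing by the
                    current value, which is an assumption.
    """
    change = current - previous
    if relative:
        if current == 0:
            return False  # no fit yet, keep searching
        return change / current < threshold
    return change < threshold


print(asr_stop(0.9500, 0.9501, threshold=0.0002))   # small gain: stop
print(asr_stop(0.90, 0.95, threshold=0.0002))       # large gain: continue
```

The same function applies to either R² or adjusted R²; the caller decides which sequence of values to pass in.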

  27. Results Mmax Relaxation: Slow vs. ASR-I (threshold = 0.0002) Run Time: MAD (mean absolute deviation) & M (number of basis functions):

  28. Robust MARS • Choose lower-order interaction terms. For example: if the highest allowable interaction order is 3, then three I(i, Bi) are used to store the best basis function (Bi) of each order: I(1, B1) = best among univariate options; I(2, B2) = best among two-way interaction options; I(3, B3) = best among three-way interaction options

  29. Assume I(3, B3) > I(2, B2) > I(1, B1) [Flowchart] Start → decision 1: NO → The best b.f. is B3; YES → decision 2: NO → The best b.f. is B2; YES → The best b.f. is B1
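
The slide does not show the tests used at the two decision points, so the sketch below fills them with an assumed rule: step down to the next lower order whenever its value I is within a relative tolerance of the current choice. Both the rule and the `tolerance` parameter are hypothetical and only illustrate the idea of preferring lower-order terms.

```python
def choose_order(values, tolerance):
    """Pick an interaction order, preferring lower orders.

    values[i - 1] holds I(i, Bi), assumed to increase with the order i.
    Assumed rule: move from order i to i - 1 while the loss in I is at
    most tolerance * I(i, Bi).
    """
    chosen = len(values)                      # start at the highest order
    while chosen > 1:
        higher = values[chosen - 1]
        lower = values[chosen - 2]
        if higher - lower > tolerance * higher:
            break                             # lower order is clearly worse
        chosen -= 1
    return chosen


# Hypothetical values with I(3, B3) > I(2, B2) > I(1, B1).
values = [0.50, 0.58, 0.60]
print(choose_order(values, tolerance=0.05))   # picks order 2 under this rule
```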

  30. Results Robust MARS Results

  31. DMF Evaluation Measures • Count = # times chosen as best • MOD = mean overall deviation • MLD = mean local deviation • MLRD = mean local relative deviation • A promising technology has higher Count and lower MOD, MLD, MLRD

  32. Results DMF Solution (Count): Slow vs. ASR

  33. Conclusions • Parallel-MARS: Speedup becomes more significant as Mmax increases • ASR-MARS: Tremendously reduced runtime for the statistical modeling process, and selected the same promising technologies as “Slow” Mmax relaxation • Robust MARS: Reduced the mean absolute deviation of the test data set, which suggested a better statistical model

  34. Thank You

