MDP and Queues: Optimization, Probability, and Review
Previously • Optimization • Probability Review • Inventory Models • Markov Decision Processes
Agenda • Homework • Projects • Markov Decision Processes • Queues
Markov Decision Processes (MDP) • States i = 1, …, n • Possible actions k in each state • Reward R(i,k) of doing action k in state i • Law of motion: P(j | i,k) = probability of moving from state i to state j after doing action k
MDP
f(i) = largest expected current + future profit if currently in state i
f(i,k) = largest expected current + future profit if currently in state i and will do action k
f(i) = max_k f(i,k)
f(i,k) = R(i,k) + ∑_j P(j|i,k) f(j)
f(i) = max_k [R(i,k) + ∑_j P(j|i,k) f(j)]
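The recursion above can be solved numerically by repeatedly applying the right-hand side until f stops changing (value iteration). A minimal sketch on a hypothetical two-state, two-action MDP — the rewards and transition probabilities below are invented for illustration, and a discount factor beta < 1 is assumed so the infinite-horizon recursion converges (the slide omits discounting):

```python
# Value iteration for f(i) = max_k [R(i,k) + beta * sum_j P(j|i,k) f(j)].
# The 2-state, 2-action MDP below is hypothetical; beta is an assumption.

R = {  # R[i][k]: reward of doing action k in state i
    0: {0: 5.0, 1: 10.0},
    1: {0: -1.0, 1: 2.0},
}
P = {  # P[i][k][j]: probability of moving from state i to j under action k
    0: {0: [0.9, 0.1], 1: [0.2, 0.8]},
    1: {0: [0.5, 0.5], 1: [0.1, 0.9]},
}
beta = 0.9  # discount factor (assumption, not on the slide)

f = [0.0, 0.0]
for _ in range(1000):
    # Apply the Bellman recursion to every state.
    f_new = [
        max(R[i][k] + beta * sum(P[i][k][j] * f[j] for j in range(2))
            for k in range(2))
        for i in range(2)
    ]
    if max(abs(a - b) for a, b in zip(f, f_new)) < 1e-10:
        f = f_new
        break
    f = f_new

# The optimal action in each state is the k attaining the max.
best_action = [
    max(range(2),
        key=lambda k: R[i][k] + beta * sum(P[i][k][j] * f[j] for j in range(2)))
    for i in range(2)
]
print(f, best_action)
```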
MDP as LP
f(i) = max_k [R(i,k) + ∑_j P(j|i,k) f(j)]
Idea: treat the f(i) as decision variables; the max of linear functions is piecewise linear
min ∑_i f(i)
s.t. f(i) ≥ R(i,k) + ∑_j P(j|i,k) f(j) for all i, k
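This LP can be handed to an off-the-shelf solver. A sketch using `scipy.optimize.linprog` on a hypothetical two-state, two-action MDP (the numbers are invented, and a discount factor beta is again assumed so the LP is bounded):

```python
# Solve  min sum_i f(i)  s.t.  f(i) >= R(i,k) + beta * sum_j P(j|i,k) f(j)
# for a hypothetical 2-state, 2-action MDP via scipy's LP solver.
import numpy as np
from scipy.optimize import linprog

n, m = 2, 2          # number of states, number of actions
beta = 0.9           # discount factor (assumption, not on the slide)
R = np.array([[5.0, 10.0],
              [-1.0, 2.0]])               # R[i, k]
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.1, 0.9]]])  # P[i, k, j]

# Rewrite f(i) - beta * sum_j P(j|i,k) f(j) >= R(i,k) in linprog's
# "A_ub @ f <= b_ub" form: (beta * P(.|i,k) - e_i) @ f <= -R(i,k).
A_ub, b_ub = [], []
for i in range(n):
    for k in range(m):
        A_ub.append(beta * P[i, k] - np.eye(n)[i])
        b_ub.append(-R[i, k])

res = linprog(c=np.ones(n), A_ub=np.array(A_ub), b_ub=np.array(b_ub),
              bounds=[(None, None)] * n)
f = res.x
print(f)
```

At the optimum, each f(i) meets its best constraint with equality, so the LP recovers exactly the f of the max-recursion on the previous slide.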
MDP Examples • Breast cancer screening • Stock options • Airline ticket pricing • Elevator scheduling • Reservoir management
Queues (Ch 14) • Queue = waiting line • Arrivals, queue, and servers together form the “system” [image from http://staff.um.edu.mt/jskl1/simweb/intro.htm]
Examples • Airport security • Customer service line • Checkout • Doctor’s office • ER • Canada: scheduling operations • Elevators
Performance Measures
• T = time in system; Tq = waiting time (time in queue)
• N = # customers in system; Nq = # customers in queue
• W = E[T], Wq = E[Tq], L = E[N], Lq = E[Nq]
• Utilization = fraction of time servers are busy
[Diagram: arrivals → queue → servers → departures; queue + servers = system]
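For the special case of one server with Poisson arrivals and exponential service times (an M/M/1 queue — a model the slide has not introduced, so treat the closed forms below as an assumption), all of these measures can be computed directly, and they are linked by Little's law L = λW:

```python
# Performance measures for an assumed M/M/1 queue with arrival rate lam
# and service rate mu (lam < mu required for the queue to be stable).
lam, mu = 4.0, 5.0          # e.g. 4 arrivals/hour, 5 services/hour

rho = lam / mu              # utilization: fraction of time the server is busy
L = rho / (1 - rho)         # L  = E[N],  mean # customers in system
W = 1 / (mu - lam)          # W  = E[T],  mean time in system
Wq = W - 1 / mu             # Wq = E[Tq], waiting time = total time - service
Lq = lam * Wq               # Lq = E[Nq], via Little's law Lq = lam * Wq

print(rho, L, W, Wq, Lq)    # rho ≈ 0.8, L ≈ 4.0, W = 1.0, Wq ≈ 0.8, Lq ≈ 3.2
```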
Randomness is Key
[Chart: waiting time vs. customer #]
• Arrivals every 15 min (not random)
• Processing times random with mean of 13 min (exponential random variable)
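The chart's scenario can be reproduced with a short simulation (a sketch: one server, arrivals exactly every 15 minutes, service times exponential with mean 13, and successive waiting times computed with the Lindley recursion, which the slide does not name):

```python
import random

random.seed(42)

a = 15.0             # fixed interarrival time in minutes (not random)
mean_service = 13.0  # mean service time; exponential, so highly variable

# Lindley recursion: wait of customer n+1 = max(0, wait_n + service_n - a)
waits, services, w = [], [], 0.0
for _ in range(10_000):
    waits.append(w)
    s = random.expovariate(1.0 / mean_service)  # exponential, mean 13
    services.append(s)
    w = max(0.0, w + s - a)

utilization = sum(services) / (a * len(services))  # roughly 13/15 ≈ 0.87
avg_wait = sum(waits) / len(waits)
print(utilization, avg_wait, max(waits))
```

Even though the server is idle about 13% of the time on average, randomness in the service times makes waits build up; with deterministic 13-minute services, every wait would be zero.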