Sometimes it Pays to be Greedy: Greedy Algorithms in Economic Epidemiology. Fred Roberts, DIMACS
Optimization Problems in Economic Epidemiology
Many problems in Economic Epi can be formulated as optimization problems: find a solution that maximizes or minimizes some value.
• Find the optimal location for a hospital.
• Find the optimal assignment of health care workers to jobs.
• Optimize investment in health care supplies.
• Minimize the total cost of a series of medical tests or public health interventions.
• Control an outbreak with as small an investment in vaccines as possible.
Greedy Algorithms
Often, the simplest approach to an optimization problem is a greedy algorithm: choose the best (cheapest, highest-rated, …) available alternative at each step. In general, greedy algorithms find locally optimal solutions, but these are not necessarily globally optimal.
[Figure: a curve with a global optimum and a local optimum marked.]
Greedy Algorithms
We give examples from Economic Epi:
• Some where a greedy solution achieves a global optimum.
• Others where it doesn't, but we can either make modifications or get a bound on how far from optimal we are.
Outline
• Four Applications of Classical Operations Research Methods
• Vaccination Strategies for Control of a Highly Infectious Disease Spreading through a Social Network
• Algorithms for Sequential Public Health or Medical Decision Making
Classic Example I: Assigning Health Care Workers to Jobs
• n workers $W_1, W_2, \dots, W_n$
• m jobs $J_1, J_2, \dots, J_m$
We know which workers are qualified to do which jobs and the cost of using each worker.
Goal: assign workers to jobs they are qualified for, each to at most one job, filling as many jobs as possible; among all ways of filling as many jobs as possible, find the one with minimum total cost.
This is known as the Minimum Cost Assignment Problem.
Assigning Health Care Workers to Jobs Greedy Algorithm: At each stage, add the least expensive worker to those getting job assignments if there is an acceptable (feasible) assignment using that worker and all those who have previously been assigned jobs, switching job assignments if necessary. The greedy algorithm always gives an optimal job assignment.
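The greedy rule above can be sketched in Python. This is a minimal illustration, not the presenter's code: the function name `greedy_assignment`, the input dictionaries, and the worker and job labels are all assumptions. Feasibility of adding a worker is checked with an augmenting-path search, which is what allows earlier assignments to be switched.

```python
def greedy_assignment(costs, qualified):
    """Greedy minimum-cost assignment (illustrative sketch).

    costs:     dict worker -> cost of using that worker (hypothetical input)
    qualified: dict worker -> list of jobs the worker can do (hypothetical input)
    Returns a dict job -> worker.
    """
    job_to_worker = {}

    def try_assign(worker, seen):
        # Try to give `worker` a job, moving already-assigned workers
        # to other jobs if necessary (augmenting-path search).
        for job in qualified[worker]:
            if job in seen:
                continue
            seen.add(job)
            holder = job_to_worker.get(job)
            if holder is None or try_assign(holder, seen):
                job_to_worker[job] = worker
                return True
        return False

    # Consider workers from least to most expensive; keep a worker only
    # if a feasible assignment exists with everyone kept so far.
    for worker in sorted(costs, key=costs.get):
        try_assign(worker, set())
    return job_to_worker


costs = {"W1": 1, "W2": 2, "W3": 3}
qualified = {"W1": ["J1", "J2"], "W2": ["J1"], "W3": ["J1", "J2"]}
assignment = greedy_assignment(costs, qualified)
print(assignment)
```

In this small example W1 is first given J1, then moved to J2 so that W2 can take J1; W3 cannot be added because both jobs are already filled by cheaper workers.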
Classic Example II: Investing in Health Care Options
Suppose we are faced with a selection of health care options in which to invest. Option i has an estimated cost $c_i$ and an estimated value $v_i$.
• Alternative health care facilities
• Alternative supplies for a clinic
• Alternative research programs
Example: AIDS Prevention Options
• Option 1: Condoms
• Option 2: Educational Posters
• Option 3: Clean Needles to Distribute
• Option 4: Testing
• Option 5: Funded Researchers
Problem: Determine which ones to invest in so that the total cost is within budget and the total value is as large as possible.
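Investing in Health Care Options
Knapsack Problem:
Maximize $\sum_i v_i x_i$ subject to $\sum_i c_i x_i \le B$, where $x_i$ = number of copies of item $i$ chosen and $B$ is the budget.
Variants:
• $x_i \in \{0, 1\}$: 0-1 Knapsack Problem
• $x_i \in \{0, 1, \dots, b_i\}$: Bounded Knapsack Problem
• $x_i$ is any nonnegative integer: Unbounded Knapsack Problem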
Investing in Health Care Options
Greedy Algorithm (due to George Dantzig, 1957):
• Sort items in decreasing order of value per unit cost, $v_i/c_i$.
• Take as many copies of the first item as possible, stopping when no more are available or when one more would violate $\sum_i c_i x_i \le B$.
• Continue in the same way with the second item, then the third, etc.
For the unbounded knapsack problem, this algorithm always achieves at least half of the value obtained by the optimal solution. Is this acceptable? It depends on the application: do you need a fast decision?
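A minimal Python sketch of this greedy rule for the unbounded variant follows. The function name `greedy_unbounded_knapsack` and the sample numbers are illustrative assumptions, and costs are assumed to be positive integers.

```python
def greedy_unbounded_knapsack(values, costs, budget):
    """Greedy by value per unit cost for the unbounded knapsack (sketch).

    values, costs: lists of v_i and c_i (costs assumed positive integers)
    budget:        the bound B
    Returns (counts, total_value), where counts[i] is x_i.
    """
    # Sort item indices in decreasing order of v_i / c_i.
    order = sorted(range(len(values)),
                   key=lambda i: values[i] / costs[i],
                   reverse=True)
    counts = [0] * len(values)
    remaining = budget
    for i in order:
        # Take as many copies of item i as the remaining budget allows.
        counts[i] = remaining // costs[i]
        remaining -= counts[i] * costs[i]
    total_value = sum(v * x for v, x in zip(values, counts))
    return counts, total_value


# Hypothetical data: item 0 has the better ratio (11/6 > 7/4).
counts, total = greedy_unbounded_knapsack([11, 7], [6, 4], 8)
print(counts, total)
```

With these numbers greedy buys one copy of item 0 for a value of 11, while two copies of item 1 would give 14. Greedy is not optimal here, but it stays above half of the optimum, as the bound on the slide promises.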
Classic Example III: Locating Health Care Facilities We have a number of users of a planned set of health care facilities. Where do we put the facilities and how do we assign a user to a facility?
Locating Health Care Facilities
There are two costs:
• $f_i$ = cost of opening a facility at $i$
• $c_{ij}$ = cost of sending the user at $j$ to the facility at $i$
Let $F$ = sum of $f_i$ over all opened facilities. Let $C$ = sum of the costs $c_{ij}$ over all users $j$, where $i$ is the facility to which user $j$ is assigned. We want to minimize $F + C$.
Assume that there is no limit to the number of facilities we might open. However, there is a tradeoff between the increased cost of more facilities and the decreased cost of getting to a nearby facility.
This is the Uncapacitated Facility Location Problem. Uncapacitated since we have no limit on the number of facilities.
Locating Health Care Facilities
[Figure: network on nodes a, b, c, d, e, f in which every edge has cost 1; the nodes carry the cost labels 5, 0.5, 4, 3, 6, and 1.]
Numbers on edges are costs of moving along the edge. Given users at the red circled locations, where do we locate facilities to minimize F + C?
Locating Health Care Facilities
Greedy Algorithm (due to Charikar and Guha, 2004):
• First find a preliminary solution S: order the nodes of the network in order of increasing cost of locating a facility at the node, then choose p so that if S is the set of the first p facilities, the cost F + C associated with S is as small as possible.
• Modify the preliminary solution in a series of steps by randomly selecting nodes to add to S and subsets of nodes to remove from S.
Locating Health Care Facilities
Charikar and Guha show that, given $\varepsilon > 0$, the algorithm is guaranteed to achieve a cost $F + C$ that is at most $2F^* + 3C^* + \varepsilon(F^* + C^*)$ in at most $O(n \log(n/\varepsilon))$ steps, where $F^*$ and $C^*$ are the costs associated with an arbitrary optimal solution.
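Only the preliminary step of the algorithm described above (sort by opening cost, then pick the best prefix) is sketched here in Python; the randomized add/remove phase is not implemented. The function name `preliminary_solution`, the matrix layout `assign_cost[i][j]`, and the sample data are assumptions made for illustration.

```python
def preliminary_solution(facility_cost, assign_cost):
    """Best prefix of facilities sorted by opening cost (sketch).

    facility_cost[i]:  f_i, cost of opening a facility at i
    assign_cost[i][j]: c_ij, cost of sending user j to facility i
    Returns (open_facilities, total_cost), with total_cost = F + C.
    """
    order = sorted(range(len(facility_cost)), key=lambda i: facility_cost[i])
    num_users = len(assign_cost[0])

    best_cost, best_open = float("inf"), []
    opened_cost = 0.0                        # F for the current prefix
    cheapest = [float("inf")] * num_users    # per-user cost to nearest open facility
    for p, i in enumerate(order, start=1):
        opened_cost += facility_cost[i]
        cheapest = [min(c, assign_cost[i][j]) for j, c in enumerate(cheapest)]
        total = opened_cost + sum(cheapest)  # F + C for the first p facilities
        if total < best_cost:
            best_cost, best_open = total, order[:p]
    return best_open, best_cost


# Hypothetical instance: three candidate sites, two users.
facility_cost = [3, 1, 4]
assign_cost = [[0, 5],
               [2, 2],
               [5, 0]]
print(preliminary_solution(facility_cost, assign_cost))
```

Each user is charged the cost of the cheapest open facility, so the sketch evaluates F + C for every prefix in a single pass over the sorted sites.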
Classic Example IV: Rerouting Emergency Vehicles in Case of Floods New initiative in Climate and Health at DIMACS.
Extreme Events due to Global Warming We anticipate an increase in number and severity of extreme events due to global warming. More heat waves. More floods, hurricanes.
Extreme Events due to Global Warming
Areas of Emphasis in DIMACS Climate & Health Initiative:
• Evacuations during extreme heat events
• Rolling power blackouts during extreme heat events
• Pesticide applications after floods
• Emergency vehicle rerouting after floods
Minimum Spanning Tree Problem
[Figure: graph with edge weights 2, 26, 10, 14, 15, 22, 20, 8, 28, 16; some edges are drawn in red.]
• A spanning tree is a tree using the edges of the graph and containing all of the nodes.
• It is minimum if the sum of the numbers on the edges used is as small as possible.
• Red edges define a minimum spanning tree.
Minimum Spanning Tree Problem • Minimum spanning trees arise in many applications. • One example: Given a road network, find usable roads that allow you to go from any node to any other node, minimizing the lengths of the roads used. • This problem arises in the DIMACS Climate and Health project: Find a usable road network for emergency vehicles in case extreme events leave flooded roads.
Minimum Spanning Tree Problem • Kruskal’s algorithm (greedy algorithm): • List the edges in order of increasing weight. • For each edge, greedily include it if it does not form a cycle with edges already chosen. • Stop when no more edges can be included. • Kruskal’s algorithm gives an optimal solution.
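The steps listed above can be sketched in Python with a small union-find structure for the cycle test. The function name `kruskal`, the edge format `(weight, u, v)`, and the sample road network are illustrative assumptions; nodes are assumed to be numbered 0 to num_nodes - 1.

```python
def kruskal(num_nodes, edges):
    """Kruskal's algorithm (sketch).

    edges: list of (weight, u, v) tuples with 0 <= u, v < num_nodes
    Returns the list of chosen edges in the order they were added.
    """
    parent = list(range(num_nodes))  # union-find forest

    def find(x):
        # Follow parent pointers to the representative, halving the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for weight, u, v in sorted(edges):   # increasing weight
        root_u, root_v = find(u), find(v)
        if root_u != root_v:             # adding the edge creates no cycle
            parent[root_u] = root_v
            tree.append((weight, u, v))
    return tree


# Hypothetical road network on 4 nodes.
roads = [(3, 0, 2), (1, 0, 1), (4, 2, 3), (2, 1, 2)]
tree = kruskal(4, roads)
print(tree, sum(w for w, _, _ in tree))
```

In the flooded-road setting, one would pass only the edges that are still usable; if the result has fewer than num_nodes - 1 edges, the usable roads no longer connect every node.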
Vaccination Strategies for Control of a Highly Infectious Disease Spreading through a Social Network Work with Paul Dreyer and Stephen Hartke
The Model: Moving From State to State
Social Network = Graph; Nodes = People; Edges = contact.
SI model: each person is either susceptible (S) or infected (I). Once in the infected state, you stay there.
Times are discrete: t = 0, 1, 2, …
Disease Process
Highly Infectious Disease: You change your state from susceptible to infected at time t+1 if at least one of your neighbors is infected at time t. You never leave the infected state.
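The spreading rule above can be simulated in a few lines of Python. This sketch covers the disease process only, with no vaccination; the function name `spread` and the adjacency-list input are assumptions made for illustration.

```python
def spread(neighbors, start, steps):
    """Run the highly infectious SI process for a number of time steps (sketch).

    neighbors: dict node -> list of adjacent nodes (the social network)
    start:     the node infected at time t = 0
    steps:     number of time steps to simulate
    Returns the set of infected nodes after `steps` steps.
    """
    infected = {start}
    for _ in range(steps):
        # A susceptible node becomes infected at t+1 if at least one
        # neighbor is infected at t; infected nodes stay infected.
        newly = {v for u in infected for v in neighbors[u]} - infected
        if not newly:
            break
        infected |= newly
    return infected


# Hypothetical contact network: a path 0 - 1 - 2 - 3.
contacts = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(spread(contacts, 0, 2))
```

Without intervention the infection reaches every node connected to the starting node, which is why the placement of a limited number of vaccine doses per time step matters.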
Vaccination Strategies Let’s say you have a limited amount of vaccine available each time period, say v doses. Whom should you vaccinate?
Vaccination Strategies More precisely: What vaccination strategy minimizes number of people ultimately infected if a disease breaks out with one infection? Sometimes called the firefighter problem: alternate fire spread and firefighter placement.
Some Results on the Firefighter Problem Thanks to Kah Loon Ng DIMACS for some of the following slides, slightly modified by me
Some questions that can be asked (but not necessarily answered!) • Can the fire be contained? • How many time steps are required before fire is contained? • How many firefighters per time step are necessary? • What fraction of all nodes will be saved (burnt)? • Does where the fire breaks out matter? • Fire starting at more than 1 node? • Consider different graphs. Construction of (connected) graphs to minimize damage. • Complexity/Algorithmic issues
Containing Fires in Infinite Grids $L_d$
Fire starts at only one node:
• d = 1: Trivial.
• d = 2: Impossible to contain the fire with 1 firefighter per time step.
Containing Fires in Infinite Grids $L_d$
d = 2: Two firefighters per time step are needed to contain the fire.
[Figure: a containment strategy taking 8 time steps, with 18 burnt nodes.]
Containing Fires in Infinite Grids $L_d$
Wang and Moeller (2002): If $d \ge 3$, $2d-1$ firefighters per time step are sufficient to contain any outbreak starting at a single node.
Hartke (2004): If $d \ge 3$, $2d-2$ firefighters per time step are not enough to contain an outbreak in $L_d$.
Thus, $2d-1$ firefighters per time step is the minimum number required to contain an outbreak in $L_d$, and containment can be attained in 2 time steps.
Firefighting on Trees
Epidemic starts at the root. Number of doses of vaccine per time step: v = 1
Firefighting on Trees
Greedy algorithm: For each node x, define weight(x) = (number of descendants of x) + 1.
Algorithm: At each time step, place the firefighter at the node x, among those neither burnt nor already saved, that maximizes weight(x).
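A minimal Python sketch of this greedy rule on a rooted tree follows, assuming the firefighter is placed before the fire spreads in each time step. The function name `greedy_firefighter`, the `children` dictionary, and the sample tree are illustrative assumptions. Because an ancestor always has larger weight than its descendants, the best unsaved node is always a child of a node that has just burnt, so only those children are examined.

```python
def greedy_firefighter(children, root):
    """Greedy firefighting on a tree, one firefighter per step (sketch).

    children: dict node -> list of child nodes (leaves may be omitted)
    root:     the node where the fire starts at t = 0
    Returns the number of nodes saved.
    """
    weight = {}

    def compute_weight(x):
        # weight(x) = number of descendants of x + 1
        weight[x] = 1 + sum(compute_weight(c) for c in children.get(x, []))
        return weight[x]

    compute_weight(root)

    frontier = [root]   # nodes that caught fire in the previous step
    saved = 0
    while frontier:
        # Nodes threatened in this step: children of the fire frontier.
        threatened = [c for u in frontier for c in children.get(u, [])]
        if not threatened:
            break
        best = max(threatened, key=weight.get)  # protect the heaviest subtree
        saved += weight[best]
        # Every other threatened node burns.
        frontier = [c for c in threatened if c != best]
    return saved


# Hypothetical tree: r has children a and b; a has three leaves, b has one.
tree = {"r": ["a", "b"], "a": ["a1", "a2", "a3"], "b": ["b1"]}
print(greedy_firefighter(tree, "r"))
```

In this example the firefighter first protects a (saving 4 nodes) and then b1 (saving 1 more), so 5 of the 7 nodes are saved.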
Firefighting on Trees
[Figure: example tree with each node labeled by its weight.]
Firefighting on Trees
[Figure: on the example tree, Greedy = 7 and Optimal = 9.]
Firefighting on Trees Theorem (Hartnell and Li, 2000): For any tree with one fire starting at the root and one firefighter to be deployed per time step, the greedy algorithm always saves more than ½ of the nodes that any algorithm saves.
Algorithms for Sequential Public Health or Medical Decision Making • A patient presents with certain symptoms. • Which test do we do first? • On the basis of the outcome of the first test, which test do we do next? • Tests are expensive. • So are false positive and false negative results. • “Cost” is a combination of cost of testing and cost of false results. • In what order should we do tests in order to minimize total “cost”?
Algorithms for Sequential Public Health or Medical Decision Making • We have several potential interventions for a public health crisis. • Assume funds limit us to one intervention at a time. • Which intervention do we invest in first? • On the basis of the outcome of the first intervention, which do we launch next? • Interventions are expensive. • So are false positive and false negative assessments of the outcome of our interventions. • “Cost” is a combination of cost of the intervention and cost of false results. • In what order should we launch the interventions in order to minimize total “cost”?
Sequential Diagnosis Problem
Such sequential diagnosis problems arise in many areas:
• Communication networks (testing connectivity, paging cellular customers, sequencing tasks, …)
• Manufacturing (testing machines, fault diagnosis, routing customer service calls, …)
• Inspecting containers at ports
Sequential Decision Making Problem
A physician is looking to determine if a patient has disease x. The doctor has a variety of tests to choose from (for example: blood test, endoscopy, MRI, stress test). In the end, the patient is to be classified into one of several categories.
• Simple case: 0 = “doesn’t have the disease”, 1 = “does have the disease”
• Testing scheme: specifies which tests are to be made based on previous observations
Sequential Decision Making Problem
We are looking to determine if an epidemic can be controlled. We have a variety of interventions to choose from. In the end, the epidemic is to be classified into one of several categories.
• Simple case: 0 = controllable, 1 = not controllable
• Intervention scheme: specifies which interventions are to be made based on assessments of previous interventions.
Example: H1N1 Virus
• Intervention 1: Close schools if 15% absenteeism
• Intervention 2: Close airports
• Intervention 3: Tamiflu to health care workers
• Intervention 4: Invest in vaccine
Sequential Decision Making Problem
• 0’s and 1’s suggest binary digits (bits).
• Bit String: a sequence of bits: 0001, 1101, …
• Boolean Function: a function that assigns to each bit string a 0 or a 1.
Example:
Bit String x | B(x)
00 | 1
01 | 0
10 | 0
11 | 1
So B(00) = 1 and B(10) = 0.