
Computer Science CPSC 502 Lecture 4 Local Search (4.8)


Presentation Transcript


  1. Computer Science CPSC 502 Lecture 4 Local Search (4.8)

  2. Lecture Overview • Local Search • Stochastic Local Search (SLS) • Comparing SLS • SLS variants

  3. Local Search: Motivation • Solving CSPs is NP-hard • Search space for many CSPs is huge • Exponential in the number of variables • Even arc consistency with domain splitting is often not enough • Alternative: local search • Use algorithms that search the space locally, rather than systematically • Such algorithms often find a solution quickly, but are not guaranteed to find one if it exists (thus, they cannot prove that there is no solution)

  4. Local Search • Idea: • Consider the space of complete assignments of values to variables (all possible worlds) • Neighbours of a current node are similar variable assignments • Move from one node to another according to a function that scores how good each assignment is • Useful method in practice • Best available method for many constraint satisfaction and constraint optimization problems

  5. Local Search Problem: Definition • Definition: A local search problem consists of a: • CSP: a set of variables, domains for these variables, and constraints on their joint values. A node in the search space is a complete assignment to all of the variables. • Neighbour relation: an edge in the search space exists when the neighbour relation holds between a pair of nodes. • Scoring function: h(n), which judges the cost of a node (to be minimized), e.g., the number of constraints violated in node n, or the cost of a state in an optimization context.

  6. Example • Given the set of variables {V1, …, Vn}, each with domain Dom(Vi) • The start node is any assignment {V1/v1, …, Vn/vn} • The neighbours of the node with assignment A = {V1/v1, …, Vn/vn} are the nodes whose assignments differ from A in exactly one value
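As a sketch, the neighbour relation described above can be enumerated directly. The dictionary representation and the names `neighbours` and `domains` are illustrative, not from the slides:

```python
# Hypothetical sketch: enumerate the neighbours of a complete assignment.
def neighbours(assignment, domains):
    """Yield every assignment that differs from `assignment` in exactly one value."""
    for var, current in assignment.items():
        for value in domains[var]:
            if value != current:
                new = dict(assignment)
                new[var] = value
                yield new

domains = {"V1": [1, 2, 3], "V2": [1, 2]}
A = {"V1": 1, "V2": 1}
# V1 has 2 alternative values and V2 has 1, so A has 3 neighbours in total.
```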

  7. Search Space for Local Search [Figure: the node V1 = v1, V2 = v1, .., Vn = v1 and several of its neighbours, e.g. V1 = v2, V2 = v1, .., Vn = v1; each neighbour changes exactly one variable's value] • Only the current node is kept in memory at each step. • Very different from the systematic tree search approaches we have seen so far! • Local search does NOT backtrack!

  8. Iterative Best Improvement • How to determine the neighbour node to be selected? • Iterative Best Improvement: select the neighbour that optimizes some evaluation function • Evaluation function: h(n), the number of constraint violations in state n • Greedy descent: evaluate h(n) for each neighbour, pick the neighbour n with minimal h(n) • Hill climbing: the equivalent algorithm for maximization problems • Here: maximize the number of constraints satisfied
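One greedy-descent step can be sketched in a few lines, assuming some `neighbours` function listing candidate assignments and a scoring function `h`; both names are illustrative:

```python
# Minimal sketch of one iterative-best-improvement (greedy descent) step.
def greedy_step(assignment, neighbours, h):
    """Return the neighbour with minimal h (ties broken arbitrarily)."""
    return min(neighbours(assignment), key=h)

# Toy example: states are integers, neighbours are n-1 and n+1,
# and h measures the distance from 5; the step moves toward 5.
```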

  9. Hill Climbing • NOTE: Everything that will be said for Hill Climbing is also true for Greedy Descent [Figure: an evaluation/scoring function plotted over two variables X1 and X2 (assume both have integer domains); the current state/possible world is a point on this surface]

  10. Example: N-queen as a local search problem CSP: N-queen CSP • One variable per column; domains {1,…,N} => row where the queen in the i-th column sits • Constraints: no two queens in the same row, column or diagonal • Neighbour relation: the value of a single column differs • Scoring function: number of constraint violations (i.e., number of attacks)
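The scoring function for this formulation can be written down directly: `queens[i]` is the row of the queen in column i, and h counts attacking pairs. This is a sketch, not code from the lecture:

```python
# Scoring function for the N-queens CSP: count attacking pairs of queens.
def h(queens):
    n = len(queens)
    conflicts = 0
    for i in range(n):
        for j in range(i + 1, n):
            same_row = queens[i] == queens[j]
            same_diag = abs(queens[i] - queens[j]) == j - i
            if same_row or same_diag:
                conflicts += 1
    return conflicts

# Example: four queens all on row 1 attack pairwise: C(4,2) = 6 conflicts,
# while [2, 4, 1, 3] is a 4-queens solution with h = 0.
```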

  11. [Figure: three N-queens boards as an exercise; the second board has h = 5; compute h for the other two]

  12. [Figure: the same three boards, with h = 2, h = 5, and h = 0]

  13. Example: Greedy descent for N-Queen • Assign each queen (one per column) to a random row (a number between 1 and N) • Repeat • For each column and each number: evaluate how many constraint violations changing that assignment would yield • Choose the column and number that lead to the fewest violated constraints; change it • Until solved or some stopping criterion is met [Figure caption: each cell lists h (i.e., the number of unsatisfied constraints) if you move that column's queen into the cell]
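The loop above can be sketched as follows. The `max_steps` bound is an assumed stopping criterion (the slide only says "some stopping criterion"), and stopping at a local minimum is made explicit:

```python
import random

def num_attacks(queens):
    """Number of violated constraints: attacking pairs of queens."""
    n = len(queens)
    return sum(1 for i in range(n) for j in range(i + 1, n)
               if queens[i] == queens[j] or abs(queens[i] - queens[j]) == j - i)

def greedy_descent_nqueens(n, max_steps=1000):
    queens = [random.randint(1, n) for _ in range(n)]      # random start
    for _ in range(max_steps):
        if num_attacks(queens) == 0:
            return queens                                   # solved
        best, best_h = None, num_attacks(queens)
        for col in range(n):                                # every column & number
            for row in range(1, n + 1):
                if row != queens[col]:
                    cand = queens[:col] + [row] + queens[col + 1:]
                    if num_attacks(cand) < best_h:
                        best, best_h = cand, num_attacks(cand)
        if best is None:                                    # local minimum: stop
            return queens
        queens = best
    return queens
```

Note that the returned assignment may still be a local minimum; greedy descent alone offers no guarantee of a solution.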

  14. General Local Search Algorithm
  1: Procedure Local-Search(V,dom,C)
  2:   Inputs
  3:     V: a set of variables
  4:     dom: a function such that dom(X) is the domain of variable X
  5:     C: set of constraints to be satisfied
  6:   Output: complete assignment that satisfies the constraints
  7:   Local
  8:     A[V]: an array of values indexed by V
  9:   repeat
  10:    for each variable X do
  11:      A[X] ← a random value in dom(X)
  13:    while (stopping criterion not met & A is not a satisfying assignment)
  14:      Select a variable Y and a value V ∈ dom(Y)
  15:      Set A[Y] ← V
  17:    if (A is a satisfying assignment) then
  18:      return A
  20:  until termination
  • Lines 10-11: random initialization • Lines 13-15: the local search step, based on local information; e.g., for each neighbour evaluate how many constraints are unsatisfied • Greedy descent: select Y and V to minimize the number of unsatisfied constraints at each step
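A direct Python sketch of this pseudocode follows. The `select` argument stands in for the variable/value selection strategy on line 14 (greedy, random, ...); its name and the `max_steps`/`max_restarts` termination bounds are assumptions added to make the loop finite:

```python
import random

def local_search(variables, dom, constraints, select, max_steps=100, max_restarts=10):
    satisfies = lambda A: all(c(A) for c in constraints)
    for _ in range(max_restarts):                            # repeat ... until termination
        A = {X: random.choice(dom[X]) for X in variables}    # random initialization
        steps = 0
        while steps < max_steps and not satisfies(A):        # stopping criterion
            Y, v = select(A)                                 # local search step
            A[Y] = v
            steps += 1
        if satisfies(A):
            return A
    return None
```

With a purely random `select` this is close to random walk; a greedy `select` that minimizes unsatisfied constraints gives greedy descent.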

  15. Example: N-Queens [Figure: an N-Queens run; 5 greedy steps take the board from h = 17 to h = 1]

  16. Which move should we pick in this situation?

  17. The problem of local minima • Which move should we pick in this situation? • Current cost: h = 1 • No single move can improve on this • In fact, every single move only makes things worse (h ≥ 2) • Locally optimal solution • Since we are minimizing: local minimum

  18. Local minima • Most research in local search concerns effective mechanisms for escaping from local minima • Want to quickly explore many local minima: the global minimum is a local minimum, too [Figure: an evaluation function over a one-variable state space, with its local minima marked]

  19. Lecture Overview • Local Search • Stochastic Local Search (SLS) • Comparing SLS • SLS variants

  20. Stochastic Local Search • We will use greedy steps to find local minima • Move to neighbour with best evaluation function value • We will use randomness to avoid getting trapped in local minima

  21. Stochastic Local Search for CSPs • Start node: random assignment • Goal: assignment with zero unsatisfied constraints • Heuristic function h: number of unsatisfied constraints • Lower values of the function are better • Stochastic local search is a mix of: • Greedy descent: move to the neighbour with lowest h • Random walk: take some random steps • Random restart: reassign random values to all variables

  22. General Local Search Algorithm
  1: Procedure Local-Search(V,dom,C)
  2:   Inputs
  3:     V: a set of variables
  4:     dom: a function such that dom(X) is the domain of variable X
  5:     C: set of constraints to be satisfied
  6:   Output: complete assignment that satisfies the constraints
  7:   Local
  8:     A[V]: an array of values indexed by V
  9:   repeat
  10:    for each variable X do
  11:      A[X] ← a random value in dom(X)
  13:    while (stopping criterion not met & A is not a satisfying assignment)
  14:      Select a variable Y and a value V ∈ dom(Y)
  15:      Set A[Y] ← V
  17:    if (A is a satisfying assignment) then
  18:      return A
  20:  until termination
  • Extreme case 1: random sampling. Restart at every step: the stopping criterion is "true" (random restart) • Extreme case 2: greedy descent. Only restart in local minima: the stopping criterion is "no more improvement in the evaluation function h"; select variable/value greedily

  23. Tracing SLS algorithms in AIspace • Let’s look at these “extreme cases” in AIspace: • Greedy Descent • Random Sampling (always selects the assignment at random) • Simple scheduling problem 2 in AIspace:

  24. Greedy descent vs. Random sampling • Greedy descent is • good for finding local minima • bad for exploring new parts of the search space • Random sampling is • good for • bad for

  25. Greedy descent vs. Random sampling • Greedy descent is • good for finding local minima • bad for exploring new parts of the search space • Random sampling is • good for exploring new parts of the search space • bad for finding local minima • A mix of the two can work very well

  26. Greedy Descent + Randomness • Greedy steps • Move to neighbour with best evaluation function value • Next to greedy steps, we can allow for: • Random restart: reassign random values to all variables (i.e. start fresh) • Random steps: move to a random neighbour • Only doing random steps is called “random walk”
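The mix of greedy and random steps described above can be sketched as a single step function. The `walk_prob` parameter and the dictionary representation are illustrative assumptions:

```python
import random

def sls_step(A, domains, h, walk_prob=0.2):
    """One SLS step: a random step with probability walk_prob, else a greedy step."""
    if random.random() < walk_prob:
        var = random.choice(list(A))                       # random step
        return dict(A, **{var: random.choice(domains[var])})
    best, best_h = A, h(A)
    for var in A:                                          # greedy step: best neighbour
        for val in domains[var]:
            cand = dict(A, **{var: val})
            if h(cand) < best_h:
                best, best_h = cand, h(cand)
    return best
```

Setting `walk_prob=0` gives pure greedy descent; `walk_prob=1` gives pure random walk. Random restarts would wrap this step in an outer loop that occasionally re-randomizes A.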

  27. Which randomized method would work best in each of these two search spaces? [Figure: two evaluation-function landscapes, A and B, over a one-variable state space]

  28. Which randomized method would work best in each of these two search spaces? [Figure: the same two landscapes, A and B] • Greedy descent with random steps works best on B • Greedy descent with random restart works best on A • But these examples are simplified extreme cases for illustration; in practice, you don't know what your search space looks like • Usually integrating both kinds of randomization works best

  29. Stochastic Local Search for CSPs: details • Examples of ways to add randomness to local search for a CSP • One-stage selection: choose the best variable-value pair, or instead choose a random variable-value pair • Two-stage selection: first choose a variable (e.g., the one involved in the most conflicts), then the best value • Selecting variables: • Sometimes choose the variable which participates in the largest number of conflicts • Sometimes choose a random variable that participates in some conflict • Sometimes choose a random variable • Selecting values: • Sometimes choose the best value for the chosen variable • Sometimes choose a random value for the chosen variable
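The deterministic variant of two-stage selection (most-conflicted variable, then best value) can be sketched as follows; `conflicts(A, var)` is an assumed helper returning how many constraints `var` violates under assignment A, and the sketch assumes at least one variable is in conflict:

```python
# Hypothetical sketch of two-stage selection: variable first, then value.
def two_stage_select(A, domains, conflicts):
    """Pick the most-conflicted variable, then the value minimizing its conflicts."""
    conflicted = [v for v in A if conflicts(A, v) > 0]
    var = max(conflicted, key=lambda v: conflicts(A, v))          # stage 1: variable
    best = min(domains[var],
               key=lambda val: conflicts(dict(A, **{var: val}), var))  # stage 2: value
    return var, best
```

The slide's full scheme alternates this with random choices at both stages; the sketch shows only the greedy alternative.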

  30. Random Walk (one-stage) • One way to add randomness to local search for a CSP: sometimes choose the pair according to the scoring function, sometimes choose it at random • E.g., in 8-queens, given the assignment here • How many neighbours? • Choose …….. • Choose…..

  31. Random Walk (one-stage) • One way to add randomness to local search for a CSP: sometimes choose the pair according to the scoring function, sometimes choose a random one • E.g., in 8-queens, given the assignment here • How many neighbours? 56 (8 variables Vi, each with 7 alternative values) • Choose any of the 56 neighbours

  32. Random Walk (two-stage selection) [Figure: 8-queens board annotated with conflict counts 0 2 2 3 3 2 3] • Select a variable first, then a value • Selecting a variable: alternate among selecting • one that participates in the largest number of conflicts • a random variable that participates in some conflict • a random variable • Selecting a value: alternate among selecting • one that minimizes the number of conflicts • one at random • AIspace 2a: Greedy Descent with Min-Conflict Heuristic

  33. Random Walk (two-stage selection) [Figure: the same annotated 8-queens board] • Select a variable first, then a value • Selecting a variable: alternate among selecting • one that participates in the largest number of conflicts: V5 • a random variable that participates in some conflict: one of {V4, V5, V8} • a random variable • Selecting a value: alternate among selecting • one that minimizes the number of conflicts: V5 = 1 • one at random • AIspace 2a: Greedy Descent with Min-Conflict Heuristic

  34. Lecture Overview • Local Search • Stochastic Local Search (SLS) • Comparing SLS • SLS variants

  35. Evaluating SLS algorithms • SLS algorithms are randomized • The time taken until they solve a problem is a random variable • It is entirely normal to have runtime variations of 2 orders of magnitude in repeated runs! • E.g., 0.1 seconds in one run, 10 seconds in the next one • On the same problem instance (only difference: random seed) • Sometimes an SLS algorithm doesn't terminate at all: stagnation • If an SLS algorithm sometimes stagnates, what is its mean runtime (across many runs)? • Infinity! • In practice, one often counts timeouts as some fixed large value X • Still, summary statistics, such as mean or median run time, don't tell the whole story • E.g., they would penalize an algorithm that often finds a solution quickly but sometimes stagnates

  36. Comparing SLS algorithms • A better way to evaluate empirical performance • Runtime distributions • Perform many runs (e.g. below left: 1000 runs) • Consider the empirical distribution of the runtimes • Sort the empirical runtimes • Compute fraction (probability) of solved runs for each runtime • Rotate graph 90 degrees. E.g. below: longest run took 12 seconds
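The procedure above (sort the runtimes, then compute the fraction of runs solved within each runtime) can be sketched directly; the function name is illustrative:

```python
# Build an empirical runtime distribution from a list of observed run times.
def runtime_distribution(runtimes):
    """Return (runtime, fraction_of_runs_solved_by_then) pairs, sorted by runtime."""
    runs = sorted(runtimes)
    n = len(runs)
    return [(t, (i + 1) / n) for i, t in enumerate(runs)]

# Example with 4 runs: after 0.1s one of four runs is solved (fraction 0.25),
# and the longest run (12.0s) brings the fraction to 1.0.
points = runtime_distribution([0.4, 0.1, 12.0, 2.0])
```

Plotting `points` with runtime on the x axis (typically log-scaled) and fraction solved on the y axis gives the curves discussed on the next slides.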

  37. Comparing runtime distributions • x axis: runtime (or number of steps); y axis: fraction (or number) of runs solved within that runtime, i.e. P(solved within this number of steps / this time) • Typically use a log scale on the x axis [Figure annotations: one algorithm is slow but does not stagnate; one solves 57% of runs within 80 steps, then stagnates; one solves 28% of runs within 10 steps, then stagnates] • Crossover points: if we run longer than 80 steps, the slow non-stagnating algorithm is the best; if we run less than 10 steps, the fast-but-stagnating one is the best

  38. Comparing runtime distributions • Which algorithm has the best median performance? • I.e., which algorithm takes the fewest number of steps to be successful in 50% of the cases? Fraction of solved runs, i.e. P(solved by this # of steps/time) # of steps

  39. Runtime distributions in AIspace • Can look at some of the algorithms and their runtime distributions in AIspace • There is a related question in Assign 1

  40. SLS limitations • Typically no guarantee to find a solution even if one exists • SLS algorithms can sometimes stagnate • Get caught in one region of the search space and never terminate • Very hard to analyze theoretically • Not able to show that no solution exists • SLS simply won’t terminate • You don’t know whether the problem is infeasible or the algorithm has stagnated

  41. SLS pros: anytime algorithms • When do you stop? • When you know the solution found is optimal (e.g. no constraint violations) • Or when you’re out of time: you have to act NOW • Anytime algorithm: • maintain the node with best h found so far (the “incumbent”) • given more time, can improve its incumbent
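The anytime pattern described above (keep the best node seen so far as the incumbent) can be sketched as a generic loop; the names `step` and `anytime_search` are illustrative:

```python
# Anytime local search: always keep the best assignment found so far,
# so a useful answer is available whenever time runs out.
def anytime_search(start, step, h, max_steps):
    incumbent, incumbent_h = start, h(start)
    current = start
    for _ in range(max_steps):
        current = step(current)
        if h(current) < incumbent_h:            # improve the incumbent
            incumbent, incumbent_h = current, h(current)
    return incumbent                            # best-so-far, even if not optimal
```

Given more steps, the incumbent can only improve; stopping early still returns the best node visited.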

  42. SLS pros: dynamically changing problems • The problem may change over time • Particularly important in scheduling • E.g., schedule for airline: • Thousands of flights and thousands of personnel assignments • A storm can render the schedule infeasible • Goal: Repair the schedule with minimum number of changes • Often easy for SLS starting from the current schedule • Other techniques usually: • Require more time • Might find solution requiring many more changes

  43. Successful application of SLS • Scheduling of the Hubble Space Telescope: reduced the time to schedule 3 weeks of observations from one week to around 10 seconds

  44. Lecture Overview • Recap of Lecture 16 • SLS variants • Evaluating SLS algorithms • Advanced SLS algorithms

  45. Simulated Annealing • Key idea: change the degree of randomness over time • Annealing: a metallurgical process where metals are hardened by being slowly cooled • Analogy: start with a high "temperature", i.e., a high tendency to take random steps • Over time, cool down: more likely to follow the scoring function • The temperature reduces over time, according to an annealing schedule

  46. Simulated Annealing: algorithm • Here's how it works: • You are at a node n. Pick a variable at random and a new value at random. You generate n' • If it is an improvement, i.e., h(n) - h(n') > 0, adopt it • If it isn't an improvement, adopt it probabilistically depending on • the difference h(n) - h(n') and • a temperature parameter T: move to n' with probability e^((h(n) - h(n'))/T)

  47. Simulated Annealing: algorithm • If there is no improvement, move to n' with probability e^((h(n) - h(n'))/T) • The higher T, the higher the probability of selecting a non-improving node, for the same h(n) - h(n') • The higher |h(n) - h(n')|, the more negative h(n) - h(n') is, and hence the lower the probability of selecting a non-improving node, for the same T


  49. Properties of simulated annealing search • If T decreases slowly enough, then simulated annealing search will find a global optimum with probability approaching 1 • But "slowly enough" usually means longer than the time required by an exhaustive search of the assignment space • Finding a good annealing schedule is "an art" and very much problem dependent • Widely used in VLSI layout, airline scheduling, etc.
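The acceptance rule from the preceding slides can be sketched as follows. The geometric cooling schedule (T multiplied by 0.99 each step) and the parameter values are illustrative assumptions, not from the lecture:

```python
import math, random

def anneal(start, random_neighbour, h, T=10.0, cooling=0.99, max_steps=1000):
    n = start
    for _ in range(max_steps):
        n2 = random_neighbour(n)
        delta = h(n) - h(n2)                   # > 0 means n2 improves on n
        if delta > 0 or random.random() < math.exp(delta / T):
            n = n2                             # adopt: always if improving,
                                               # probabilistically otherwise
        T *= cooling                           # assumed geometric annealing schedule
    return n
```

At high T almost every move is accepted (random walk); as T falls, non-improving moves become increasingly unlikely and the search behaves like greedy descent.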
