Presentation Transcript


  1. http://creativecommons.org/licenses/by-sa/2.0/

  2. CIS786 Lecture 1 Usman Roshan

  3. Complexity • NP: class of decision problems whose solutions can be verified in polynomial time • P: class of decision problems whose solutions can be determined in polynomial time • TSP (decision version) • Problem: is there a Hamiltonian path of length at most k in an edge-weighted directed graph? • A solution can be verified in polynomial time • Can the question be answered in polynomial time for arbitrary instances? • Shortest path problem (decision version) • Problem: is there a path of length at most k between vertices u and v in an edge-weighted directed graph? • A solution can be verified in polynomial time • It can also be determined in polynomial time!

  4. Complexity • In real life we care about optimization and about finding actual solutions (as opposed to answering decision problems) • Decision problems mainly matter for hardness results---if the decision version is hard to solve then clearly so is the optimization version • Whether P=NP is the most important open question in computer science and remains unanswered

  5. Reduction • Matching: given a bipartite graph, does it have a perfect matching? • Can be solved by reducing to the maximum-flow problem: connect a source s to one side of the graph and the other side to a sink t [figure: bipartite graph augmented with source s and sink t]

  6. NP-hardness • Reduction: problem X can be reduced to problem Y if the following hold • x is a yes-instance of X iff y=R(x) is a yes-instance of Y • R(x) is a polynomial-time reduction function • Reduction is about decision problems, not optimization ones • A problem X is NP-complete if • X is in NP • every problem in NP reduces to X

  7. Running time (poly vs exp)

  8. Running time (effect of constants)

  9. Approximation algorithms • Vertex cover: find a minimum set of vertices C in G=(V,E) such that each edge in G has at least one endpoint in C • 2-approximation algorithm (see the sketch below): • Initialize C to the empty set • While there are edges in G • Select any edge (u,v) • Add u and v to C • Delete u, v, and all edges incident to them from G • But approximation guarantees are often too weak for the solutions to be useful in practice
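
A minimal Python sketch of this 2-approximation, assuming the graph is given as a list of edges (a representation not specified on the slide):

def vertex_cover_2approx(edges):
    # Repeatedly pick an arbitrary remaining edge (u, v), put both
    # endpoints into the cover, and drop every edge touching u or v.
    cover = set()
    remaining = list(edges)
    while remaining:
        u, v = remaining[0]
        cover.update((u, v))
        remaining = [(a, b) for (a, b) in remaining
                     if a not in (u, v) and b not in (u, v)]
    return cover

# On the path 1-2-3-4 the optimum is {2, 3}; the algorithm may return
# all four vertices, which is still within a factor of 2.
print(vertex_cover_2approx([(1, 2), (2, 3), (3, 4)]))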

  10. Heuristics • No guarantees on quality of solution • Usually very fast and widely used • Studied experimentally on benchmark datasets • Popular heuristics • Iterative improvement • Randomized iterative improvement • Iterated local search • Tabu search • Genetic algorithms • Simulated annealing • Ant colony optimization

  11. Local search • Let g be the function to optimize • Assume we have a function M(x) which generates a neighborhood of x (preferably of polynomial size) • Iterative improvement (sketched below): • Determine an initial candidate solution s • While s is not a local optimum • Choose a neighbor s’ of s such that g(s’)<g(s) (can do best or first improvement) • Set s = s’
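
A Python sketch of first-improvement iterative improvement, assuming g is to be minimized and M(s) returns the neighborhood as a list:

def iterative_improvement(s, g, M):
    # Move to an improving neighbor until none exists, i.e. until s is
    # a local optimum of g under the neighborhood M.
    improved = True
    while improved:
        improved = False
        for s2 in M(s):
            if g(s2) < g(s):   # first improvement; use min(M(s), key=g) for best improvement
                s, improved = s2, True
                break
    return s

# Toy usage: minimize x**2 over the integers with +/-1 moves.
print(iterative_improvement(7, lambda x: x * x, lambda x: [x - 1, x + 1]))  # -> 0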

  12. Global view of search

  13. Local view of search

  14. Travelling Salesman Problem • Highly studied in computer science • Problem: find the shortest Hamiltonian path in an edge-weighted directed graph • NP-hard • No polynomial-time approximation scheme unless P=NP

  15. TSP

  16. Greedy TSP search (Nearest Neighbor search) • Start with a randomly selected vertex • Find the nearest unvisited vertex and add it to the path • When no unvisited vertices remain, add the initial vertex to complete the cycle (if desired) • Combine with backtracking
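
A Python sketch of the nearest neighbor construction, assuming cities are given as 2D points (the slides speak of edge-weighted graphs in general):

import math

def nearest_neighbor_tour(points, start=0):
    # Repeatedly move to the closest unvisited vertex, then return to
    # the start vertex to complete the cycle.
    unvisited = set(range(len(points))) - {start}
    tour = [start]
    while unvisited:
        last = tour[-1]
        nxt = min(unvisited, key=lambda j: math.dist(points[last], points[j]))
        tour.append(nxt)
        unvisited.remove(nxt)
    tour.append(start)   # close the cycle (optional, as on the slide)
    return tour

print(nearest_neighbor_tour([(0, 0), (0, 1), (2, 0), (2, 1)]))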

  17. TSP---local move • Sort the neighboring edges by increasing weight and try the lowest-weight one first
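
The local move pictured on the slide is presumably the classic 2-exchange (2-opt) step (slide 30 refers to 2-exchange moves): remove two tour edges and reconnect by reversing the segment between them. A sketch, assuming the tour is a list of vertex indices:

def two_exchange(tour, i, k):
    # Remove edges (tour[i-1], tour[i]) and (tour[k], tour[k+1]), then
    # reconnect the tour by reversing the segment tour[i..k].
    return tour[:i] + tour[i:k + 1][::-1] + tour[k + 1:]

print(two_exchange([0, 1, 2, 3, 4], 1, 3))  # -> [0, 3, 2, 1, 4]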

  18. The Lin-Kernighan algorithm for TSP Basic LK local move • Start with a path • Obtain a delta-path by adding edge (v,w) • Break cycle by removing (w,v’) • Cycle can be completed by adding edge (v’,u)

  19. Full LK heuristic • 1. Input: path p • 2. Obtain a delta-path p’ by replacing one edge in p • 3. If g(p’) < g(p) then set p=p’ and go to step 2 • 4. Else output p • Can be interpreted as a sequence of 1-exchange steps that alternate between delta-paths and Hamiltonian paths

  20. Local optima • Big problem! • Simple and commonly employed ways to escape local optima • Random restart: begin the search from a different starting point • Non-improving steps: allow selection of candidate solutions with worse evaluation function values • Neither of these is guaranteed to always escape local optima

  21. Local optima • Local optima depend on g and the neighborhood function M • Larger neighborhoods induce fewer local optima but take longer to search • Next we look at improvements over basic iterative improvement search that can avoid local optima

  22. Simulated annealing • Create an initial candidate solution s • While the termination condition is not satisfied • Probabilistically choose a neighbor s’ of s using a proposal mechanism • If s’ satisfies the probabilistic acceptance criterion then set s=s’ • Update T according to the annealing schedule • T may be constant for some number of iterations or never change

  23. Simulated annealing • The proposal function is usually uniformly random • The acceptance function is normally the Metropolis function: when minimizing g, accept s’ with probability min{1, exp((g(s) − g(s’))/T)}

  24. Simulated annealing for TSP • Randomly pick a Hamiltonian cycle s • Select a neighbor s’ uniformly at random from the neighborhood of s • Accept the new solution s’ with probability min{1, exp((g(s) − g(s’))/T)} (also known as the Metropolis condition) • Annealing schedule: T=0.95T after every n(n-1) steps (geometric schedule) • Terminate when solution quality has not improved over five successive temperature values
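
A runnable Python sketch combining these pieces; the fixed number of temperature levels stands in for the slide's five-levels-without-improvement test, and the 2-exchange neighborhood is an assumption:

import math, random

def tour_length(tour, dist):
    return sum(dist[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))

def simulated_annealing_tsp(dist, T=100.0, alpha=0.95, temp_levels=100):
    n = len(dist)                       # dist: n x n distance matrix, n >= 3
    s = list(range(n))
    random.shuffle(s)                   # random initial Hamiltonian cycle
    best = s[:]
    for _ in range(temp_levels):
        for _ in range(n * (n - 1)):    # n(n-1) steps per temperature, as on the slide
            i, k = sorted(random.sample(range(1, n), 2))
            s2 = s[:i] + s[i:k + 1][::-1] + s[k + 1:]   # random 2-exchange neighbor
            delta = tour_length(s2, dist) - tour_length(s, dist)
            # Metropolis condition: always accept improvements, accept a
            # worse tour with probability exp(-delta / T).
            if delta < 0 or random.random() < math.exp(-delta / T):
                s = s2
                if tour_length(s, dist) < tour_length(best, dist):
                    best = s[:]
        T *= alpha                      # geometric annealing schedule
    return best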

  25. Convergence for SA

  26. Tabu search • Generate an initial candidate s • While the termination condition is not met • Determine the set N’ of non-tabu neighbors of s • Choose the best solution s’ in N’ (even if it is worse than s) • Update the tabu table based on s’ • Set s=s’
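
A Python sketch with a fixed-length tabu table, assuming solutions themselves are stored in the table (a common variant stores moves instead) and g is minimized:

from collections import deque

def tabu_search(s0, g, M, tenure=7, max_iters=1000):
    s, best = s0, s0
    tabu = deque([s0], maxlen=tenure)   # the tenure most recent solutions are tabu
    for _ in range(max_iters):
        candidates = [s2 for s2 in M(s) if s2 not in tabu]
        if not candidates:
            break
        s = min(candidates, key=g)      # best non-tabu neighbor, improving or not
        tabu.append(s)
        if g(s) < g(best):
            best = s
    return best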

  27. Tabu search • Usually, the selected candidate s’ is declared tabu for some fixed number of subsequent steps. This means it cannot be chosen again until some time has elapsed, which allows for wider exploration of the search space. • Later we will look at a tabu search algorithm for protein folding under the 2D lattice model.

  28. Iterated local search • Generate an initial candidate solution s • Perform local search on s (for example iterative improvement starting from s) • While the termination condition is not met • Set r=s • Perform perturbation on s • Perform local search on the perturbed s • Based on the acceptance criterion, keep the new s or revert to r
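
A Python sketch of this loop, assuming local_search and perturb are supplied as functions and g is minimized:

def iterated_local_search(s0, g, local_search, perturb, iters=100):
    s = local_search(s0)                # start from a local optimum
    for _ in range(iters):
        r = s                           # remember the incumbent
        s2 = local_search(perturb(s))   # perturb, then re-optimize
        # Acceptance criterion: keep the better of the two (slide 31);
        # always keeping the more recent one is the other common choice.
        s = s2 if g(s2) < g(r) else r
    return s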

  29. Iterated local search • ILS can be interpreted as a walk in the space of local optima • The perturbation is key • It needs to be chosen so that it cannot easily be undone by the subsequent local search • It may consist of many perturbation steps • Strong perturbation: more effective escape from local optima but similar drawbacks to random restart • Weak perturbation: short subsequent local search phase but risk of revisiting previous optima • Acceptance criterion: usually either the more recent or the better of the two candidates

  30. Iterated local search for TSP • Perturbation: “double-bridge move” = 4-exchange step • Cannot be directly reversed by 2-exchange moves
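
A Python sketch of the double-bridge move, assuming the tour is a list of at least four vertices:

import random

def double_bridge(tour):
    # Cut the tour into four segments A|B|C|D at three random points and
    # reconnect them as A, C, B, D; no single 2-exchange undoes this.
    i, j, k = sorted(random.sample(range(1, len(tour)), 3))
    return tour[:i] + tour[j:k] + tour[i:j] + tour[k:]

print(double_bridge(list(range(8))))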

  31. ILS for TSP • Acceptance criterion: return the better of the two candidate solutions • Known as the Iterated Lin-Kernighan (ILK) algorithm • Although very simple, it has been shown to achieve excellent performance and is among the state of the art

  32. Ant colony optimization • Ants communicate via chemicals known as pheromones which are deposited on the ground in the form of trails. • Pheromone trails provide the basis for (stochastic) trail-following behavior underlying, e.g., the collective ability to find the shortest paths between a food source and the nest

  33. ACOs • Initialize pheromone trails • While the termination condition is not met • Generate a population sp of candidate solutions using randomized constructive search (such as a greedy heuristic) • Perform local search (e.g. iterative improvement) on sp • Update pheromone trails based on sp (see the sketch below)
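
A minimal Python sketch of ACO for TSP, without the local search phase (which slide 37 notes improves performance considerably); depositing pheromone only along the best tour found so far is a simplification of the population-based update:

import random

def aco_tsp(dist, n_ants=10, cycles=50, rho=0.1, alpha=1.0, beta=2.0):
    n = len(dist)                                 # dist: positive n x n distances
    tau = [[1.0] * n for _ in range(n)]           # uniform initial trails
    best, best_len = None, float("inf")
    for _ in range(cycles):
        for _ in range(n_ants):
            # Each ant constructs one tour edge by edge, choosing the next
            # vertex with probability proportional to tau^alpha * (1/d)^beta.
            tour = [random.randrange(n)]
            unvisited = set(range(n)) - {tour[0]}
            while unvisited:
                i, cand = tour[-1], list(unvisited)
                w = [tau[i][j] ** alpha * (1.0 / dist[i][j]) ** beta for j in cand]
                j = random.choices(cand, weights=w)[0]
                tour.append(j)
                unvisited.remove(j)
            length = sum(dist[tour[t]][tour[(t + 1) % n]] for t in range(n))
            if length < best_len:
                best, best_len = tour, length
        tau = [[(1 - rho) * t for t in row] for row in tau]   # evaporation
        for t in range(n):                                    # reinforcement
            i, j = best[t], best[(t + 1) % n]
            tau[i][j] += 1.0 / best_len
            tau[j][i] += 1.0 / best_len
    return best, best_len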

  34. ACOs • In each cycle, each ant creates one candidate solution • All pheromone trails are initialized to the same value • Pheromone update typically comprises uniform decrease of all trail levels (evaporation) and increase on some trail levels based on solutions obtained from construction + local search

  35. ACO for TSP (1)

  36. ACO for TSP (2)

  37. ACO for TSP (3) • Termination: after a fixed number of cycles (construction + local search) • Ants can be imagined as walking along the edges of the given graph (using memory to ensure their tours correspond to Hamiltonian cycles) and depositing pheromone to reinforce the edges of their tours • The original ACO did not include local search (local search improves performance considerably) • ACO has also been applied to protein folding, which we will see later

  38. Evolutionary (genetic) algorithms • Determine an initial population sp • While the termination condition is not met • Generate a set spr of new candidate solutions by recombination • Generate a set spm of new candidate solutions from spr and sp by mutation • Select the new population from the candidate solutions in sp, spr, and spm • Pure evolutionary algorithms often lack the capability of sufficient search intensification---they need to be combined with local search (see the sketch below)
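
A skeleton of this loop in Python, assuming recombine and mutate are problem-specific operators and g is minimized; truncation selection is one choice among many:

import random

def evolutionary_algorithm(init_pop, g, recombine, mutate, generations=100):
    pop = list(init_pop)
    size = len(pop)                     # population size stays fixed
    for _ in range(generations):
        spr = [recombine(*random.sample(pop, 2)) for _ in range(size)]
        spm = [mutate(s) for s in spr + pop]
        # Select the next population from sp, spr, and spm by fitness.
        pop = sorted(pop + spr + spm, key=g)[:size]
    return min(pop, key=g)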

  39. Recombination

  40. Memetic algorithms (genetic local search) • Determine an initial population sp • Perform local search on sp • While the termination condition is not met • Generate a set spr of new candidate solutions by recombination • Perform local search on spr • Generate a set spm of new candidate solutions from spr and sp by mutation • Perform local search on spm • Select the new population from the candidate solutions in sp, spr, and spm

  41. MA for TSP • A randomized variant of the greedy heuristic (seen earlier) is used to generate the initial population • Among the various recombination operators, the GX (greedy crossover) operator performs best

  42. GX operator for the TSP MA • Copy edges that are common to the two parents (the fraction of edges to be copied is determined by parameter p1) • Add new short edges not in the parents (again, the fraction to be added is determined by parameter p2) • Copy edges from the parents in order of increasing length---only edges which do not violate TSP constraints are added, and the fraction to be added is determined by parameter p3 • If necessary, complete the tour using the randomized greedy heuristic

  43. Protein folding • The lattice model assumes amino acids are of two types: hydrophobic (black) and hydrophilic (white) • They can take on discrete lattice positions only • The energy of a fold is minus the number of pairs of hydrophobic residues that are adjacent on the lattice but not adjacent in the sequence (see the sketch below)
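
A Python sketch of this energy function, assuming a fold is a list of 2D lattice coordinates, one per residue, and the sequence is a string of 'H'/'P':

def hp_energy(fold, seq):
    # Each pair of hydrophobic residues that are lattice neighbors but
    # not sequence neighbors contributes -1 to the energy.
    energy = 0
    for i in range(len(seq)):
        for j in range(i + 2, len(seq)):          # j >= i+2 skips sequence neighbors
            if seq[i] == 'H' == seq[j]:
                (xi, yi), (xj, yj) = fold[i], fold[j]
                if abs(xi - xj) + abs(yi - yj) == 1:   # unit-distance lattice contact
                    energy -= 1
    return energy

# A U-shaped fold of HHHH: the two end residues form one H-H contact.
print(hp_energy([(0, 0), (0, 1), (1, 1), (1, 0)], "HHHH"))  # -> -1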

  44. Protein folding • Finding the optimal fold in the 2D lattice is NP-hard • There are exponentially many possible folds (as the staircase folds demonstrate) • Iterative improvement requires a local move

  45. Pull move

  46. Pull move • Theorem 1: The class of pull moves is reversible • Theorem 2: Any protein can be straightened into a straight line by a sequence of pull moves

  47. Tabu search • Initialize using a randomized greedy construction heuristic • Place amino acid i+1 to the LEFT, RIGHT, or STRAIGHT ahead of amino acid i • Repeat this recursively starting from i=2 • If we reach a dead end, either backtrack or abandon and restart the search

  48. Tabu search parameters • Each residue is assigned a mobility • Pull moves are performed on residues with high mobility • Mobilities are updated to encourage exploration of the search space • Initially all residues are assigned medium mobility for memSize iterations • Each residue is randomly reset to medium mobility for one iteration with probability noise • Residues that have been altered less in the past are encouraged to move using a diversity parameter

  49. Results

  50. Optimal fold on 100 residue long protein
