
Evolving Efficient List Search Algorithms



Presentation Transcript


  1. Evolving Efficient List Search Algorithms
     Kfir Wolfson, Moshe Sipper
     Evolutionary Computation and Artificial Life (ECAL) course - CS BGU - July 8th, 2009

  2. Agenda
     • Introduction
     • Evolutionary Setup
     • Results
     • Less Knowledge – More Automation
     • Related Work
     • Conclusions and Future Work

  3. Introduction
     • Algorithm design is an important task in CS
     • Evolutionary algorithms have been applied to many areas, but there has been limited research on software engineering and algorithmic design
     • We introduce the notion of “algorithmic design through Darwinian evolution”
     • We begin with a benchmark case, list search algorithms:
       • Can evolution be applied to finding a search algorithm?
       • Can evolution be applied to finding an efficient search algorithm?
     • We apply Genetic Programming (GP) to the task and show that the answer to both questions is affirmative

  4. Agenda
     • Introduction
     • Evolutionary Setup
     • Results
     • Less Knowledge – More Automation
     • Related Work
     • Conclusions and Future Work

  5. Evolutionary Setup
     • Representation
       • Phenotype
       • Genotype
     • GP Parameters
     • Fitness Function
     • GP Operators

  6. Representation
     • Phenotype
       • Array search algorithm
       • Searches for a key in a 1-dimensional array
     Java static function:
       public static int search(int[] arr, int KEY) {
         int n = arr.length;
         int M0 = 0;
         int M1 = n-1;
         int INDEX = 0;
         for (int ITER = 0; ITER < iterations; ITER++) {
           -> PLUG IN EVOLVING GENOTYPE HERE <-
         }
         return INDEX;
       }

  7. Representation
       public static int search(int[] arr, int KEY) {
         int n = arr.length;
         int M0 = 0;
         int M1 = n-1;
         int INDEX = 0;
         for (int ITER = 0; ITER < iterations; ITER++) {
           -> PLUG IN EVOLVING GENOTYPE HERE <-
         }
         return INDEX;
       }

  8. Representation
       public static int search(int[] arr, int KEY) {
         int n = arr.length;
         int M0 = 0;
         int M1 = n-1;
         int INDEX = 0;
         for (int ITER = 0; ITER < iterations; ITER++) {
           -> PLUG IN EVOLVING GENOTYPE HERE <-
         }
         return INDEX;
       }
     Callouts on the code:
     • Global variables
     • iterations is set to:
       • n for linear search
       • log2 n for sublinear search
     • Array index returned (might be “illegal”)

  9. Representation
     • Genotype
       • Koza-style genetic programming
       • Evaluation trees
       • Strongly typed
         • More understandable algorithms
       • Function and terminal sets
         • Same for evolution of both linear and sublinear search algorithms
     [Figure: example tree with nodes If, =, NOP, INDEX:=, Array[INDEX], KEY, ITER]

  10. Representation
        public static int search(int[] arr, int KEY) {
          int n = arr.length;
          int M0 = 0;
          int M1 = n-1;
          int INDEX = 0;
          for (int ITER = 0; ITER < iterations; ITER++) {
            -> PLUG IN EVOLVING GENOTYPE HERE <-
          }
          return INDEX;
        }
      [Callout: -> PLUG IN EVOLVING GENOTYPE HERE <-]

  11. Representation
      [Figure: example array with KEY = 18; array contents not preserved in the transcript]
      -> PLUG IN EVOLVING GENOTYPE HERE <-

  12. Representation
      -> PLUG IN EVOLVING GENOTYPE HERE <-

  13. Representation
      -> PLUG IN EVOLVING GENOTYPE HERE <-
      • The [M0+M1]/2 terminal
        • Embodies human intuition about the problem, to facilitate the solution
        • Still requires the crucial algorithmic insight to be derived via evolution
        • Later we re-examine this terminal and remove it altogether

  14. Representation - Example
      • An example of a correct solution to the linear search problem:
        LISP:
          (If (= Array[INDEX] KEY) NOP (INDEX:= ITER))
        Equivalent Java:
          if (arr[INDEX] == KEY)
            ;
          else
            INDEX = ITER;
      [Figure: tree with nodes If, =, NOP, INDEX:=, Array[INDEX], KEY, ITER]
      Let’s plug it into the phenotype frame…
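
To make the tree representation concrete, here is a minimal sketch of how a strongly typed evaluation tree of this kind could be interpreted inside the phenotype frame. It is not the authors' GP system: the `State`, `BoolNode` and `ActionNode` types and the factory methods are hypothetical names introduced only for illustration, and the number of iterations is passed in explicitly.

```java
import java.util.function.ToIntFunction;

/** Illustrative sketch only: all type and method names here are assumptions. */
final class GenotypeTreeSketch {

    /** Variables of the phenotype frame that the tree may read or write. */
    static final class State {
        final int[] arr;
        final int key;
        int m0, m1, index, iter;

        State(int[] arr, int key) {
            this.arr = arr;
            this.key = key;
            this.m0 = 0;
            this.m1 = arr.length - 1;
        }
    }

    /** Strong typing: boolean-valued nodes and action nodes are distinct types. */
    interface BoolNode { boolean eval(State s); }
    interface ActionNode { void run(State s); }

    static final ActionNode NOP = s -> { };

    static ActionNode ifNode(BoolNode cond, ActionNode thenBranch, ActionNode elseBranch) {
        return s -> { if (cond.eval(s)) thenBranch.run(s); else elseBranch.run(s); };
    }

    static BoolNode eq(ToIntFunction<State> a, ToIntFunction<State> b) {
        return s -> a.applyAsInt(s) == b.applyAsInt(s);
    }

    static ActionNode setIndex(ToIntFunction<State> value) {
        return s -> s.index = value.applyAsInt(s);
    }

    /** The phenotype frame: runs the tree once per iteration and returns INDEX. */
    static int search(ActionNode genotype, int[] arr, int key, int iterations) {
        State s = new State(arr, key);
        for (s.iter = 0; s.iter < iterations; s.iter++) {
            genotype.run(s);
        }
        return s.index;
    }

    /** The tree (If (= Array[INDEX] KEY) NOP (INDEX:= ITER)). */
    static ActionNode linearSearchTree() {
        return ifNode(eq(s -> s.arr[s.index], s -> s.key), NOP, setIndex(s -> s.iter));
    }

    public static void main(String[] args) {
        int[] arr = {1004, 1001, 1003, 1000, 1002};
        System.out.println(search(linearSearchTree(), arr, 1000, arr.length));
    }
}
```

Keeping boolean nodes and action nodes as separate Java types mirrors the strong typing mentioned on slide 9: a malformed tree, such as an action used as a condition, simply does not compile.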

  15. Representation - Example
      • An example of a correct solution to the linear search problem:
        public static int search(int[] arr, int KEY) {
          int n = arr.length;
          int M0 = 0;
          int M1 = n-1;
          int INDEX = 0;
          for (int ITER = 0; ITER < iterations; ITER++) {
            if (arr[INDEX] == KEY)
              ;
            else
              INDEX = ITER;
          }
          return INDEX;
        }

  16. Representation
        int search(int[] arr, int KEY) {
          int n = arr.length;
          int M0 = 0;
          int M1 = n-1;
          int INDEX = 0;
          for (int ITER=0; ITER < iterations; ITER++) {
            -> PLUG IN GENOTYPE HERE <-
          }
          return INDEX;
        }
      • search call:
        • Always halts
          • No loop functions
          • Only read access to ITER
          • Number of iterations is limited
        • Inherently deals with keys not in the array
          • With a wrapper function
        • No early termination when the key is found
          • Harder problem: the evolved algorithm has to learn to retain the correct index. Why?
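
The point about retaining the index can be seen by running the frame: the loop always executes all of its iterations, so the body keeps running after the key has been found and must leave INDEX untouched from then on. The sketch below plugs the example linear-search genotype into the frame with `iterations` set to n. The `find` wrapper and its -1 convention are assumptions made for illustration; the slides only say that a wrapper function handles keys that are not in the array.

```java
/** Illustrative sketch: the find wrapper and its -1 convention are assumptions. */
final class LinearSearchFrameSketch {

    /** Phenotype frame with the example linear-search genotype plugged in. */
    static int search(int[] arr, int KEY) {
        int n = arr.length;
        int iterations = n;   // linear case: one iteration per element
        int M0 = 0;           // unused by this genotype, kept to mirror the frame
        int M1 = n - 1;       // unused by this genotype, kept to mirror the frame
        int INDEX = 0;
        for (int ITER = 0; ITER < iterations; ITER++) {
            if (arr[INDEX] == KEY) {
                // Key already found: leave INDEX unchanged for the remaining iterations.
            } else {
                INDEX = ITER;
            }
        }
        return INDEX;
    }

    /** Hypothetical wrapper: validates the returned index, so absent keys yield -1. */
    static int find(int[] arr, int KEY) {
        int i = search(arr, KEY);
        boolean legal = i >= 0 && i < arr.length;
        return (legal && arr[i] == KEY) ? i : -1;
    }

    public static void main(String[] args) {
        int[] arr = {1003, 1000, 1002, 1001};
        System.out.println(find(arr, 1002)); // key present
        System.out.println(find(arr, 7));    // key absent
    }
}
```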

  17. Evolutionary Setup
      • Representation
        • Phenotype
        • Genotype
      • GP Parameters
      • Fitness Function
      • GP Operators

  18. Fitness Function
      • How do we rate a solution?
        • Present the individual with many random input arrays
        • Use the search method to search for all keys in all arrays
        • Reward the individual for closeness of the returned indexes
      • Training set includes arrays of all sizes in [minN, maxN] (minN=2 … maxN=100)
      • An array of size n contains:
        • Linear case: random permutation of [1000, 1000+n-1]
        • Sublinear case: sorted unique numbers from [n, 100n]
      • Note that the key range is disjoint from the index range
        • Discourages “cheating”
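
As a rough illustration of the training data described above, the sketch below builds one array for each case. The slides give only the value ranges; the sampling method (shuffling for the linear case, rejection of duplicates for the sublinear case) and all names are assumptions.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;
import java.util.TreeSet;

/** Illustrative sketch: the sampling strategy and all names are assumptions. */
final class TrainingArraysSketch {

    /** Linear case: a random permutation of the values 1000 .. 1000+n-1. */
    static int[] linearCase(int n, Random rng) {
        List<Integer> values = new ArrayList<>();
        for (int v = 1000; v < 1000 + n; v++) {
            values.add(v);
        }
        Collections.shuffle(values, rng);
        int[] arr = new int[n];
        for (int i = 0; i < n; i++) {
            arr[i] = values.get(i);
        }
        return arr;
    }

    /** Sublinear case: n sorted, unique values drawn from [n, 100n]. */
    static int[] sublinearCase(int n, Random rng) {
        TreeSet<Integer> values = new TreeSet<>();
        while (values.size() < n) {
            values.add(n + rng.nextInt(99 * n + 1)); // uniform over n .. 100n inclusive
        }
        int[] arr = new int[n];
        int i = 0;
        for (int v : values) {   // TreeSet iterates in ascending order
            arr[i++] = v;
        }
        return arr;
    }

    public static void main(String[] args) {
        Random rng = new Random();
        System.out.println(java.util.Arrays.toString(linearCase(5, rng)));
        System.out.println(java.util.Arrays.toString(sublinearCase(5, rng)));
    }
}
```

In both cases every key is at least n, while valid indexes are at most n-1, so an individual cannot score hits by returning the key itself as an index.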

  19. Fitness Function
      • Define error per single key search as the distance between the correct index of KEY and the index returned by search(arr,KEY)
      • Elements are unique
        • No ambiguity in the error definition
      [Figure: array with key = 18; the correct index and the index returned by search(arr,key) are two positions apart, so error=2]

  20. Fitness Function
      • Define hit as the finding of the precise location of KEY
      [Figure: array with key = 18; search(arr,key) returns the correct index, so error=0. Hit!]

  21. Fitness Function
      • The fitness value of an individual is defined as: [formula not preserved in the transcript]
      • This gives a 0.5% bonus reduction for every 1% of correct hits
      • For example, if an individual scored 300 hits in 1000 search calls, its fitness will be the average error per call, reduced by 15%
      • This bonus
        • encourages perfect answers (“almost” is bad…),
        • increases fitness variation in the population
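
The formula itself did not survive in the transcript, but the description above pins down its shape: fitness = (average error per call) × (1 − 0.5 × hits / calls). The sketch below implements that reading; it is a reconstruction from the bullet points, not the authors' exact expression, and it assumes that lower fitness is better, since the value is an error measure.

```java
/** Illustrative sketch: a reconstruction from the slide text, not the authors' formula. */
final class FitnessSketch {

    /** Error of one search call: distance between the correct and the returned index. */
    static int error(int correctIndex, int returnedIndex) {
        return Math.abs(correctIndex - returnedIndex);
    }

    /** Average error per call, reduced by 0.5% for every 1% of hits (assumed: lower is better). */
    static double fitness(long totalError, int hits, int calls) {
        double averageError = (double) totalError / calls;
        double hitRatio = (double) hits / calls;
        return averageError * (1.0 - 0.5 * hitRatio);
    }

    public static void main(String[] args) {
        // 300 hits in 1000 calls: the average error (here 2.0) is reduced by 15%.
        System.out.println(fitness(2000, 300, 1000));
    }
}
```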

  22. Generality Test
      • The best solution of each run was subjected to a stringent generality test, by running it on random arrays of all lengths in the range [2, 5000] ([2, 500] for the linear case).
      • Kinnear (1993) noted that: “For any algorithm... that operates on an infinite domain of data, no amount of testing can ever establish generality. Testing can only increase confidence.”
      • We included analysis by hand for selected solutions.

  23. GP Operators and Parameters
      [Slide content not preserved in the transcript]

  24. Agenda
      • Introduction
      • Evolutionary Setup
      • Results
      • Less Knowledge – More Automation
      • Related Work
      • Conclusions and Future Work

  25. Results - Linear
      • It turned out that evolving a linear-time search algorithm was quite easy with the function and terminal sets we designed.
      • 46 out of 50 runs (92%) produced perfect solutions, passing the generality test on arrays up to length 500.
      • Our representation rendered the problem easy enough for a perfect individual to appear in the randomly generated generation 0 in three of the runs.
        • The search space was small enough for random search.

  26. Results - Linear
      • An example evolved solution:
        LISP:
          (If (= Array[INDEX] KEY) (M1:= [M0+M1]/2) (INDEX:= ITER))
        Equivalent Java:
          if (arr[INDEX] == KEY)
            M1 = (M0+M1)/2;
          else
            INDEX = ITER;
      [Figure: tree with nodes If, =, M1:=, INDEX:=, Array[INDEX], KEY, [M0+M1]/2, ITER]

  27. Results - Linear
      • An example evolved solution:
        public static int search(int[] arr, int KEY) {
          int n = arr.length;
          int M0 = 0;
          int M1 = n-1;
          int INDEX = 0;
          for (int ITER = 0; ITER < iterations; ITER++) {
            if (arr[INDEX] == KEY)
              M1 = (M0+M1)/2;
            else
              INDEX = ITER;
          }
          return INDEX;
        }
      [Callout on M1 = (M0+M1)/2: irrelevant, but does not affect the output index]

  28. Sublinear Search
      • We set iterations to log2 n, and proceeded to evolve sublinear search algorithms.
        public static int search(int[] arr, int KEY) {
          int n = arr.length;
          int M0 = 0;
          int M1 = n-1;
          int INDEX = 0;
          for (int ITER = 0; ITER < iterations; ITER++) {
            -> PLUG IN EVOLVING GENOTYPE HERE <-
          }
          return INDEX;
        }

  29. Results - Sublinear
      • Unsurprisingly, this case proved a harder problem, but it was also solved by evolution.
      • 35 out of 50 runs (70%) produced perfect solutions, passing the generality test on arrays up to length 5,000.
        • Solutions emerged between generations 22 and 3,632
        • Solution sizes varied between 42 and 244 nodes
        • Runtime: between 2 hours and 2 days on the CS grid
      • 7 runs (14%) produced near-perfect solutions, which failed on a single key in the input arrays (99.96% hits on the generality test)

  30. Results – Sublinear
      • An example simplified evolved solution:
        • Simplified by hand from a tree of 50 nodes down to 14
        LISP:
          (PROGN2
            (INDEX:= [M0+M1]/2)
            (If (> KEY Array[INDEX])
              (PROGN2 (M0:= [M0+M1]/2) (INDEX:= M1))
              (M1:= [M0+M1]/2)))
        Equivalent Java:
          INDEX = (M0+M1)/2;
          if (KEY > arr[INDEX]) {
            M0 = (M0+M1)/2;
            INDEX = M1;
          }
          else
            M1 = (M0+M1)/2;

  31. Results - Sublinear
        public static int search(int[] arr, int KEY) {
          int n = arr.length;
          int M0 = 0;
          int M1 = n-1;
          int INDEX = 0;
          for (int ITER = 0; ITER < iterations; ITER++) {
            INDEX = (M0+M1)/2;
            if (KEY > arr[INDEX]) {
              M0 = (M0+M1)/2;
              INDEX = M1;
            }
            else
              M1 = (M0+M1)/2;
          }
          return INDEX;
        }
      This is a form of Binary Search (with a small twist)
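
A runnable version of this solution is sketched below, together with a small-scale analogue of the generality test. Two things are assumptions rather than facts from the slides: `iterations` is taken to be the ceiling of log2 n (which matches the value log n = 7 quoted for n = 100 on slide 43), and the test arrays are simple deterministic sorted arrays rather than the random ones used by the authors.

```java
/** Illustrative sketch: ceilLog2 and the allHits check are assumptions. */
final class EvolvedBinarySearchSketch {

    /** ceil(log2 n) for n >= 1, computed without floating point. */
    static int ceilLog2(int n) {
        return 32 - Integer.numberOfLeadingZeros(n - 1);
    }

    /** The simplified evolved solution; assumes a non-empty sorted array containing KEY. */
    static int search(int[] arr, int KEY) {
        int n = arr.length;
        int iterations = ceilLog2(n);
        int M0 = 0;
        int M1 = n - 1;
        int INDEX = 0;
        for (int ITER = 0; ITER < iterations; ITER++) {
            INDEX = (M0 + M1) / 2;
            if (KEY > arr[INDEX]) {
                M0 = (M0 + M1) / 2;
                INDEX = M1;          // the "twist": guess the upper bound, not the midpoint
            } else {
                M1 = (M0 + M1) / 2;
            }
        }
        return INDEX;
    }

    /** Searches for every key of one sorted array per length in [2, maxN]. */
    static boolean allHits(int maxN) {
        for (int n = 2; n <= maxN; n++) {
            int[] arr = new int[n];
            for (int i = 0; i < n; i++) {
                arr[i] = n + 3 * i;  // sorted, unique, disjoint from the index range
            }
            for (int i = 0; i < n; i++) {
                if (search(arr, arr[i]) != i) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(allHits(500));
    }
}
```

After every iteration INDEX equals the current upper bound M1, and the gap M1 - M0 is at most halved (rounding up) each time, which is why ceil(log2 n) iterations suffice without any early exit.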

  32. Agenda
      • Introduction
      • Evolutionary Setup
      • Results
      • Less Knowledge – More Automation
      • Related Work
      • Conclusions and Future Work

  33. Less Knowledge – More Automation
      • Re-examining the representation:
        • Most terminals and functions are either
          • General-purpose or
          • Problem-specific
        • However, one terminal stands out: [M0+M1]/2
          • Solution-specific
      • We proceed to
        • Remove the [M0+M1]/2 terminal
        • Add an automatically defined function (ADF)

  34. Adding ADF
      [Figure: function and terminal sets, with [M0+M1]/2 to be replaced by ADF0. Labels: PROGN2, If, INDEX:=, M0:=, M1:=, <, =, >, Array[INDEX], KEY, ITER, INDEX, M0, M1, TRUE, FALSE, NOP, [M0+M1]/2, ADF0]

  35. Adding ADF
      [Figure: main-tree functions and terminals. Labels: PROGN2, If, INDEX:=, M0:=, M1:=, <, =, >, Array[INDEX], KEY, ITER, INDEX, M0, M1, TRUE, FALSE, NOP, ADF0]
      [Figure: ADF functions and terminals. Labels: +, -, *, /, 0, 1, 2, M0, M1]

  36. GP Parameters – with ADF
      • Array size was increased to:
        • minN = 200
        • maxN = 300
      • To avoid non-general solutions
        • example to follow
      • Different function sets for main and ADF trees
      • Crossover is tree-wise
      • Mutation performed better than crossover
        • Especially for the ADF tree

  37. GP Parameters – with ADF
      • Same as previous setup, with the following changes: [table not preserved in the transcript]

  38. Results – Sublinear with ADF
      • The sublinear search problem with an ADF naturally proved more difficult than with the [M0+M1]/2 terminal
      • 12 out of 50 runs (24%) produced perfect solutions, passing the generality test on arrays up to length 5,000 (they later passed all tests up to size 20,000)
        • Solutions emerged between generations 54 and 4,557
        • Solution sizes varied between 53 and 244 nodes
        • Runtime: between 4 hours and 2 weeks on the CS grid
      • An additional run produced a non-standard solution, which will be discussed later

  39. Results – Sublinear with ADF
      • Analysis revealed all perfect solutions to be variations of binary search
      • The algorithmic idea can be deduced by inspecting the ADFs, all of which turned out to be equivalent to one of the following (all fractions truncated):
          (M0+M1)/2
          (M0+M1+1)/2
          M0/2+(M1+1)/2
        which are reminiscent of the [M0+M1]/2 terminal we dropped

  40. Results – Sublinear with ADF
      • An example simplified evolved solution:
        • Simplified by hand from a tree of 58 nodes down to 26 (before simplification: slide 60)
        LISP:
          (PROGN2
            (PROGN2
              (if (< Array[INDEX] KEY) (INDEX:= ADF0) NOP)
              (if (< Array[INDEX] KEY) (M0:= INDEX) (M1:= INDEX)))
            (INDEX:= ADF0))
          ADF0: (/ (+ (+ 1 M0) M1) 2)
        Equivalent Java:
          if (arr[INDEX] < KEY)
            INDEX = ((1+M0)+M1)/2;
          if (arr[INDEX] < KEY)
            M0 = INDEX;
          else
            M1 = INDEX;
          INDEX = ((1+M0)+M1)/2;

  41. Results – Sublinear with ADF
        public static int search(int[] arr, int KEY) {
          int n = arr.length;
          int M0 = 0;
          int M1 = n-1;
          int INDEX = 0;
          for (int ITER = 0; ITER < iterations; ITER++) {
            if (arr[INDEX] < KEY)
              INDEX = ((1+M0)+M1)/2;
            if (arr[INDEX] < KEY)
              M0 = INDEX;
            else
              M1 = INDEX;
            INDEX = ((1+M0)+M1)/2;
          }
          return INDEX;
        }
      This is another form of Binary Search (with a different twist)
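
The ADF-based solution can be exercised in the same way. In the sketch below the evolved ADF is written as an ordinary helper method, `adf0`; as before, taking `iterations` as the ceiling of log2 n and using deterministic sorted test arrays are assumptions made for illustration.

```java
/** Illustrative sketch: ceilLog2, the adf0 helper and allHits are assumptions. */
final class EvolvedAdfSearchSketch {

    /** ceil(log2 n) for n >= 1, computed without floating point. */
    static int ceilLog2(int n) {
        return 32 - Integer.numberOfLeadingZeros(n - 1);
    }

    /** The evolved ADF, (/ (+ (+ 1 M0) M1) 2): the midpoint rounded up. */
    static int adf0(int M0, int M1) {
        return ((1 + M0) + M1) / 2;
    }

    /** The simplified evolved solution; assumes a non-empty sorted array containing KEY. */
    static int search(int[] arr, int KEY) {
        int n = arr.length;
        int iterations = ceilLog2(n);
        int M0 = 0;
        int M1 = n - 1;
        int INDEX = 0;
        for (int ITER = 0; ITER < iterations; ITER++) {
            if (arr[INDEX] < KEY) {
                INDEX = adf0(M0, M1);
            }
            if (arr[INDEX] < KEY) {
                M0 = INDEX;
            } else {
                M1 = INDEX;
            }
            INDEX = adf0(M0, M1);
        }
        return INDEX;
    }

    /** Searches for every key of one sorted array per length in [2, maxN]. */
    static boolean allHits(int maxN) {
        for (int n = 2; n <= maxN; n++) {
            int[] arr = new int[n];
            for (int i = 0; i < n; i++) {
                arr[i] = n + 3 * i;  // sorted, unique, disjoint from the index range
            }
            for (int i = 0; i < n; i++) {
                if (search(arr, arr[i]) != i) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(allHits(500));
    }
}
```

The first `if` only matters in the very first iteration, where INDEX still holds its initial value 0; in later iterations INDEX already equals `adf0(M0, M1)`, so the assignment changes nothing.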

  42. Interesting Results
      • Some of the other evolved solutions are worth mentioning
      • With minN=2, maxN=100 and main-tree max-depth = 17, linear search algorithms evolved, which failed on longer arrays
      • How is this possible (in log2 n iterations)?
        • An O(log n) solution has a constant factor, i.e., the algorithm performs k·log n operations.
        • We set a limit on the number of iterations, and in each iteration the full genotype code is executed.
        • A linear search could evolve by taking advantage of the constant factor k

  43. Interesting Results
      • Linear solution ADF: ADF0 = (M0+1)
      • Main tree included 16 occurrences of:
          (If (= Array[INDEX] KEY) NOP (PROGN2 (M0:= ADF0) (INDEX:= M0)))
        i.e., if the key is found, do nothing; otherwise increment INDEX by 1
      • For an array of size n=100:
        • log n = 7; for k=16: k·log n = 16·7 = 112 > 100 (enough to traverse the whole array)
      • We proceeded to
        • increase minN, maxN (to 200, 300),
        • decrease the maximum k, by lowering max-depth to 10
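
The loophole is easy to reproduce. In the sketch below an inner loop stands in for the 16 copies of the subtree in the main tree, so each of the ceil(log2 n) iterations advances INDEX by up to 16 positions. The inner loop, the constant `K` and the `ceilLog2` helper are illustrative assumptions; the evolved tree contained the 16 copies literally.

```java
/** Illustrative sketch: the inner loop stands in for 16 literal copies of the subtree. */
final class DisguisedLinearSearchSketch {

    static final int K = 16; // occurrences of the step in the main tree (from the slide)

    /** ceil(log2 n) for n >= 1, computed without floating point. */
    static int ceilLog2(int n) {
        return 32 - Integer.numberOfLeadingZeros(n - 1);
    }

    /** Assumes KEY is present in arr, as it always is in the training set. */
    static int search(int[] arr, int KEY) {
        int n = arr.length;
        int iterations = ceilLog2(n);
        int M0 = 0;
        int M1 = n - 1;      // unused by this solution, kept to mirror the frame
        int INDEX = 0;
        for (int ITER = 0; ITER < iterations; ITER++) {
            for (int rep = 0; rep < K; rep++) {
                if (arr[INDEX] == KEY) {
                    // NOP: key found, keep INDEX.
                } else {
                    M0 = M0 + 1;   // M0 := ADF0, where ADF0 = M0 + 1
                    INDEX = M0;    // INDEX := M0
                }
            }
        }
        return INDEX;
    }

    public static void main(String[] args) {
        int[] small = new int[100];
        for (int i = 0; i < small.length; i++) small[i] = 1000 + i;
        int[] large = new int[200];
        for (int i = 0; i < large.length; i++) large[i] = 1000 + i;
        System.out.println(search(small, small[99]));   // within the 16 * 7 = 112 step budget
        System.out.println(search(large, large[199]));  // budget is 16 * 8 = 128 steps: falls short
    }
}
```

With n in [2, 100] the step budget always covers the whole array, so this individual scores 100% hits during training; raising minN and maxN to 200 and 300 removes the advantage.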

  44. Interesting Results
      • One more interesting solution evolved
      • Returns correct results (100% hits) up to array length ~6,640
      • Analysis revealed an interesting algorithm that makes a series of jumps of exponentially increasing size
        • of the form 2^i, from 1 to 256, every iteration
      • It was thus able to handle array sizes n such that (roughly)
        • n ≤ 512 × log2 n ⇒ n ≤ 6656
      [Figure: jump sizes 1, 2, 4, 8, …, 256]

  45. Interesting Results
      • ADF0 = 2*M1-M0-1
      • Main tree included 7-8 occurrences similar to:
          (PROGN2 (if (> Array[INDEX] KEY) (M1:= ADF0) NOP) (INDEX:= ADF0))
      • Each step updates M1 ← 2*M1-M0-1, so the difference grows by a factor of 2:
          M1'  ← 2*M1 - M0 - 1
          M1'' ← 2*M1' - M0 - 1
          ------------------
          M1'' - M1' = 2(M1' - M1)
      [Figure: jump sizes 1, 2, 4, 8, …, 256]

  46. Agenda
      • Introduction
      • Evolutionary Setup
      • Results
      • Less Knowledge – More Automation
      • Related Work
      • Conclusions and Future Work

  47. Related Work
      • No previous work on evolving list search algorithms
      • “Closest”: sorting algorithms
        • Loosely related: in both cases, solutions have to be 100% correct
      • We found 10-15 works on evolving sorting algorithms
        • Most works have been able to evolve O(n²) sorting algorithms
        • One work evolved an O(n log n) algorithm
          • albeit with a highly specific setup

  48. Related Work
      • Kinnear (1993) evolved an O(n²) bubble sort using Koza-style GP.
      • He tried a number of function sets, all quite specific, including
        • double-for, swap, who-is-bigger functions
      • Showed that the difficulty of evolving a solution increases as the functions become less problem-specific
        • (order x y) vs. (if-lt x y work) and (swap x y)
      • Noted that adding parsimony increased the likelihood of evolving a general solution

  49. Related Work
      • Withall et al. (2009) developed a new GP representation
        • Fixed-length blocks of genes, representing single program statements.
        • The phenotype is a Perl program.
      • They showed improvement over previous linear-GP representations in the similarity between child and parent, i.e., in the propagation of characteristics (building blocks) through multiple generations.
      • A number of list algorithms were evolved
        • sum-of-elements, max-element, reverse, sort
        • using problem-specific functions for each algorithm
      • Functions included
        • a for loop function
        • a double function: a highly specific double-for nested loop
      • With these specialized structures they evolved an O(n²) bubble sort algorithm.

  50. Related Work
      • An O(n log n) solution was evolved by Agapitos et al. (2006-7)
      • The evolutionary setup was based on their object-oriented genetic programming system.
      • They compared five different fitness functions based on various measures of array disorder.
      • To avoid non-terminating programs they defined an upper bound on recursive calls, based on their hand-coded implementation.
