"Explore the basics of Genetic Algorithm, a machine learning technique inspired by natural evolution. Learn about reproduction, encoding, fitness evaluation, GA procedure, and search methods."
Intro to AI: Genetic Algorithm. Ruth Bergman, Fall 2004
Imitating Nature
Aspects of the evolution of organisms:
• Organisms that are ill-suited to an environment have little chance to reproduce (natural selection)
• Conversely, the best-fitting organisms have a better chance to survive and reproduce
Imitating Nature
Reproduction:
• Offspring are similar to their parents
• Random mutations occur, and they can lead to better (or worse) fitting individuals
"On the Origin of Species by Means of Natural Selection", C. Darwin (1859)
Encoding:
• An organism is fully represented by its DNA string, that is, a string over a finite alphabet (4 symbols)
• Each element of this string is called a gene
Genetic Algorithm (GA)
• Developed by John Holland in the early 1970s
• Optimization and machine learning techniques inspired by the process of natural evolution and evolutionary genetics
• Solutions are encoded as chromosomes
• Search proceeds by maintaining a population of solutions
• Reproduction favors "better" chromosomes
• New chromosomes are generated during reproduction through processes such as mutation and crossover
GA Framework
[Figure: a population of 5-bit chromosomes (A, B, C, D) drawn from the search space passes through fitness evaluation, selection, and reproduction (crossover and mutation).]
GA Procedure
1. Start with a population of N individuals
2. Apply the fitness function to all the individuals
3. Randomly select N/2 pairs of individuals for reproduction (repetition allowed)
4. Each pair generates two children (reproduction with crossover)
5. Apply a random mutation to the children with small probability. The children become the next generation
6. Repeat steps 2-5 until a termination criterion is met
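The procedure above can be sketched end to end as follows. This is a minimal illustration, not the lecture's own code: it assumes the 8-bit "equal number of 1s and 0s" problem, a fitness of `n_bits - |n1 - n0|` (one function that behaves as the slides describe), fitness-proportional selection, single-point crossover, and an even population size. All names (`run_ga`, `fitness`, the parameters) are hypothetical.

```python
import random


def fitness(h):
    """Assumed fitness: highest when 1s and 0s are balanced, 0 when all bits are equal."""
    n1 = sum(h)
    n0 = len(h) - n1
    return len(h) - abs(n1 - n0)


def run_ga(n_individuals=20, n_bits=8, p_mutation=0.01, max_generations=100, seed=0):
    """Run the GA loop; return (best individual, number of generations used)."""
    rng = random.Random(seed)
    # Step 1: random initial population of N bit strings.
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(n_individuals)]
    for generation in range(max_generations):
        # Step 2: evaluate every individual.
        scores = [fitness(h) for h in population]
        best = max(population, key=fitness)
        # Termination (assumption): stop once a perfectly balanced string appears.
        if fitness(best) == n_bits:
            return best, generation
        # Fall back to uniform selection if every score is zero.
        weights = scores if sum(scores) > 0 else None
        children = []
        for _ in range(n_individuals // 2):
            # Step 3: pick a pair, with replacement, proportionally to fitness.
            a, b = rng.choices(population, weights=weights, k=2)
            # Step 4: single-point crossover yields two children.
            c = rng.randint(1, n_bits)
            for child in (a[:c] + b[c:], b[:c] + a[c:]):
                # Step 5: flip each bit independently with small probability.
                children.append([1 - bit if rng.random() < p_mutation else bit
                                 for bit in child])
        population = children  # the children become the next generation
    return max(population, key=fitness), max_generations


best, generations = run_ga()
print(best, fitness(best), generations)
```

Stopping on a known maximum fitness only works here because the optimum of this toy problem is known in advance; for real problems a generation limit or a convergence test is the more usual choice.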
Encoding Scheme
• An individual (an organism) represents a possible solution to the problem you want to solve
• An individual is represented by a binary string, which is intended to be the complete description of the individual
• Example: Suppose you have to find a number between 0 and 255 whose binary representation contains the same number of 1s and 0s. An individual is a string of 8 bits, e.g. h = 01111110 = 126
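The 8-bit encoding can be illustrated with a pair of helper functions that convert between an integer and its bit list. The names `to_bits` and `from_bits` are hypothetical, and the most-significant-bit-first order is an assumption chosen to match the example h = 01111110 = 126.

```python
def to_bits(value: int, n_bits: int = 8) -> list[int]:
    """Encode an integer as a list of bits, most significant bit first."""
    if not 0 <= value < 2 ** n_bits:
        raise ValueError(f"value must fit in {n_bits} bits")
    return [(value >> shift) & 1 for shift in range(n_bits - 1, -1, -1)]


def from_bits(bits: list[int]) -> int:
    """Decode a most-significant-bit-first list of bits back to an integer."""
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value


h = to_bits(126)
print(h)             # [0, 1, 1, 1, 1, 1, 1, 0]
print(from_bits(h))  # 126
```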
Fitness Function
• A fitness function says how good a solution is, i.e. how well an individual fits the environment
• Example: a fitness f(h) computed from n1 and n0, where n1 is the number of 1s in h and n0 is the number of 0s in h. Note that the fitness function takes its minimum value (0) when n1 = 8 or n0 = 8, and its maximum value (8) when n1 = n0 = 4
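The properties stated above (minimum 0 when all eight bits are equal, maximum 8 when n1 = n0 = 4) are satisfied by, for example, 8 - |n1 - n0|. That particular formula is an assumption made for this sketch; other functions with the same extremes would serve equally well.

```python
def fitness(h):
    """Assumed fitness for the balanced-bits example: len(h) - |n1 - n0|."""
    n1 = sum(h)          # number of 1s in h
    n0 = len(h) - n1     # number of 0s in h
    return len(h) - abs(n1 - n0)


print(fitness([1, 1, 1, 1, 1, 1, 1, 1]))  # 0: all ones
print(fitness([0, 1, 1, 1, 1, 1, 1, 0]))  # 4: six ones, two zeros
print(fitness([1, 0, 1, 0, 1, 0, 1, 0]))  # 8: four ones, four zeros
```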
Selection
• Roulette wheel selection
  • Compute each individual's contribution to the global fitness as $P(h_i) = f(h_i) / \sum_{j=1}^{N} f(h_j)$
  • The pairs for reproduction are chosen by randomly drawing individuals (with replacement) according to the distribution P
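Roulette wheel selection can be sketched with the standard library's weighted sampling. The function names are hypothetical, and the uniform fallback for an all-zero population is an assumption added so the sketch never divides by zero.

```python
import random


def selection_probabilities(fitnesses):
    """Each individual's share of the total fitness of the population."""
    total = sum(fitnesses)
    if total == 0:
        # Assumption: with no fitness signal at all, select uniformly.
        return [1 / len(fitnesses)] * len(fitnesses)
    return [f / total for f in fitnesses]


def roulette_select(population, fitnesses, k, rng=random):
    """Draw k individuals with replacement, proportionally to fitness."""
    probabilities = selection_probabilities(fitnesses)
    return rng.choices(population, weights=probabilities, k=k)


population = ["A", "B", "C", "D"]
fitnesses = [2, 6, 0, 8]
print(selection_probabilities(fitnesses))  # [0.125, 0.375, 0.0, 0.5]
print(roulette_select(population, fitnesses, k=2, rng=random.Random(0)))
```

An individual with zero fitness, such as "C" here, is never selected under this scheme, which is one reason some implementations add a small constant to every fitness value.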
Crossover
• Randomly choose a crossover point c, i.e. a number between 1 and n, where n is the length of the string
• Return two children: one composed of the first c bits of the first parent and the last n-c bits of the second parent, the other composed of the first c bits of the second parent and the last n-c bits of the first parent
• Example (c = 6): parents 01111110 and 00100100 produce children 01111100 and 00100110; parents 11111110 and 00000001 produce children 11111101 and 00000010
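Single-point crossover as described above can be sketched in a few lines. The function name and the optional `c` argument (useful for reproducing a specific cut) are hypothetical; the sketch works on any sliceable sequence, so bit strings are used here for readability.

```python
import random


def crossover(parent_a, parent_b, c=None, rng=random):
    """Swap the tails of two equal-length parents after the first c positions."""
    n = len(parent_a)
    if len(parent_b) != n:
        raise ValueError("parents must have the same length")
    if c is None:
        # A number between 1 and n; c == n simply copies the parents.
        c = rng.randint(1, n)
    child_1 = parent_a[:c] + parent_b[c:]
    child_2 = parent_b[:c] + parent_a[c:]
    return child_1, child_2


print(crossover("01111110", "00100100", c=6))  # ('01111100', '00100110')
```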
Mutation
• Mutation on individuals: some of the children's bits are changed (each with a small, independent probability)
[Figure: example bit strings illustrating mutation, with the label "maximum found"]
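Bit-flip mutation with an independent per-bit probability might look like the sketch below. The name `mutate` and the default rate of 1% are assumptions chosen for illustration.

```python
import random


def mutate(bits, p_mutation=0.01, rng=random):
    """Return a copy of bits where each bit is flipped with probability p_mutation."""
    return [1 - bit if rng.random() < p_mutation else bit for bit in bits]


child = [1, 1, 1, 1, 1, 1, 0, 1]
print(mutate(child, p_mutation=0.0))  # unchanged: [1, 1, 1, 1, 1, 1, 0, 1]
print(mutate(child, p_mutation=1.0))  # every bit flipped: [0, 0, 0, 0, 0, 0, 1, 0]
```

Because each bit is tested independently, a child may receive zero, one, or several flips in the same generation.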
Stopping Criteria
• Convergence:
  • A population is said to have converged when all the genes have converged, i.e. when every bit has the same value in at least 95% of the individuals in the population
• Since convergence is not guaranteed, we must consider other stopping criteria:
  • Number of generations
  • Almost constant value of the best-fitting individual
  • Almost constant average fitness of the population
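The 95% convergence test can be sketched as a per-gene majority check. The function names are hypothetical, and the population is assumed to be a non-empty list of equal-length bit lists.

```python
def gene_converged(population, position, threshold=0.95):
    """True if one value (0 or 1) holds this position in at least threshold of individuals."""
    ones = sum(individual[position] for individual in population)
    majority = max(ones, len(population) - ones)
    return majority / len(population) >= threshold


def population_converged(population, threshold=0.95):
    """True if every gene position has converged."""
    n_bits = len(population[0])
    return all(gene_converged(population, i, threshold) for i in range(n_bits))


uniform = [[1, 0, 1, 0]] * 20
mixed = [[1, 0, 1, 0]] * 10 + [[0, 0, 1, 0]] * 10
print(population_converged(uniform))  # True
print(population_converged(mixed))    # False: the first bit is split 50/50
```

In practice a check like this is usually combined with a cap on the number of generations, since convergence is not guaranteed.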
Parameter Settings
• Population size
  • How many chromosomes are in the population
  • Too few chromosomes: only a small part of the search space is explored
  • Too many chromosomes: the GA slows down
  • Recommendation: 20-30, 50-100
• Probability of crossover
  • How often crossover is performed
  • Recommendation: 80%-95%
• Probability of mutation
  • How often parts of a chromosome are mutated
  • Recommendation: 0.5%-1%
GA Search Method
• A genetic algorithm is a search algorithm (optimization search)
  • evaluation function ≡ fitness function
• Similar to beam search with N beams, but
  • The next generation is selected stochastically
  • Sexual reproduction
• Similar to hill-climbing, but
  • Convergence to a global optimum is expected eventually
[Figure: GA search method compared with the hill-climbing method]
Genetic Programming
• One of the central challenges of CS is to get a computer to do what needs to be done, without telling it how to do it
  • Automatic programming (or program synthesis)
• GP is a branch of genetic algorithms
• Main difference between GP and GA
  • Representation of the solution (computer program)
  • GA: a string of numbers
    • Fixed-length character strings
  • GP: a computer program (Lisp or Scheme)
    • Represents hierarchical computer programs of dynamically varying sizes and shapes
Evonomy
• Evonomy brings advanced artificial intelligence and financial markets together. It generates an enormous number of different trading approaches and selects the one with the best profit/risk combination.
• Evonomy is based on a genetic algorithm that applies the powerful mechanisms found in biological evolution to financial markets.
  (1) The system creates a large population of virtual traders, each of which has its own recipe for beating the market.
  (2) The system tests the traders against the relevant historical market data.
  (3) The traders with the best profit/risk profiles survive and have offspring: the new generation of traders is created by mating and mutating the most promising recipes.
  (4) The system runs the battle for survival as long as some traders stand out and dominate the population. These traders have the optimal profit/risk profile; any change in their recipe would make them less fit.