


  1. CS 478 - Machine Learning Genetic Algorithms (II)

  2. Schema (I) • A schema H is a string over the extended alphabet {0, 1, *}, where * stands for “don't care” (i.e., a wild card) • A schema represents, or matches, a set of strings: for example, 1**0 matches 1000, 1010, 1100, and 1110 • There are 3^L schemata over strings of length L
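A minimal Python sketch of schema matching as defined above (the function name `matches` is illustrative, not from the slides):

```python
def matches(schema: str, s: str) -> bool:
    """Return True if binary string s is matched by schema
    (same length; '*' matches either bit)."""
    return len(schema) == len(s) and all(
        c == '*' or c == b for c, b in zip(schema, s)
    )

# The schema 1**0 matches exactly the four strings listed above.
assert all(matches("1**0", s) for s in ["1000", "1010", "1100", "1110"])
assert not matches("1**0", "0000")
```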

  3. Schema (II) • Since each position in a string may take on either its actual value or a *, each binary string in a GA population contains, or is a representative of, 2^L schemata • Hence, a population with n members contains between 2^L and min(n·2^L, 3^L) schemata, depending on population diversity (the upper bound is not simply n·2^L because there exist at most 3^L schemata in total) • Geometrically, strings of length L can be viewed as points in a discrete L-dimensional space (i.e., the vertices of hypercubes). Then, schemata can be viewed as hyperplanes (i.e., hyper-edges and hyper-faces of hypercubes)
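To make the 2^L count concrete, here is a small sketch (illustrative, not from the slides) that enumerates every schema a given string is an instance of:

```python
from itertools import product

def schemata_of(s: str):
    """Yield all 2**len(s) schemata that match the string s:
    each position either keeps its bit or becomes '*'."""
    for mask in product([False, True], repeat=len(s)):
        yield "".join('*' if star else c for c, star in zip(s, mask))

assert len(set(schemata_of("101"))) == 2 ** 3  # 8 schemata, from 101 to ***
```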

  4. Schema Order • The order of a schema H is the number of non-* symbols in H • It is denoted by o(H): for example, o(1**0*) = 2 • A schema of order o over strings of length L represents 2^(L-o) strings
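A one-line Python sketch of the order computation (function name illustrative):

```python
def order(schema: str) -> int:
    """o(H): the number of fixed (non-'*') positions in H."""
    return sum(c != '*' for c in schema)

assert order("1**0*") == 2
# An order-2 schema over length-5 strings matches 2**(5 - 2) == 8 strings.
```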

  5. Schema Defining Length • The defining length of a schema H is the distance between the first and last non-* symbols in H • It is denoted by δ(H): for example, δ(1**0*) = 3, while δ(**1**) = 0
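The companion sketch for defining length (again illustrative names, with the usual convention that a schema with at most one fixed position has δ = 0):

```python
def defining_length(schema: str) -> int:
    """delta(H): distance between the first and last fixed positions."""
    fixed = [i for i, c in enumerate(schema) if c != '*']
    return fixed[-1] - fixed[0] if fixed else 0

assert defining_length("1**0*") == 3
assert defining_length("**1**") == 0
```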

  6. Intuitive Approach • Schemata encode useful/promising characteristics found in the population. • What do selection, crossover and mutation do to schemata? • Since more highly fit strings have higher probability of selection, on average an ever-increasing number of samples is given to the observed best schemata. • Crossover cuts strings at arbitrary sites and swaps. Crossover leaves a schema unscathed if it does not cut the schema, but it may disrupt a schema when it does. For example, 1***0 is more likely to be disrupted than **11* is. In general, schemata of short defining length are unaltered by crossover. • Mutation at normal, low rates does not disrupt a particular schema very frequently.
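As a worked check of the crossover claim (a sketch, not from the slides): under one-point crossover a cut can fall at any of the L−1 sites, and only cuts strictly inside a schema's defining length can disrupt it, so the disruption probability is at most δ(H)/(L−1):

```python
def max_disruption_prob(schema: str) -> float:
    """Upper bound on the chance that one-point crossover cuts
    inside the schema's defining length: delta(H) / (L - 1)."""
    L = len(schema)
    fixed = [i for i, c in enumerate(schema) if c != '*']
    delta = fixed[-1] - fixed[0] if fixed else 0
    return delta / (L - 1)

# 1***0 spans the whole string; **11* is compact:
assert max_disruption_prob("1***0") == 4 / 4   # every cut site can disrupt it
assert max_disruption_prob("**11*") == 1 / 4   # only one cut site can
```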

  7. Intuitive Conclusion Highly-fit, short-defining-length schemata (called building blocks) are propagated from generation to generation by giving exponentially increasing samples to the observed best • …And all this takes place in parallel, with no memory other than the population. This parallelism has been termed implicit, as n strings of length L actually allow min(n·2^L, 3^L) schemata to be processed.
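For a sense of scale (illustrative numbers, not from the slides): a population of n = 50 strings of length L = 10 implicitly processes up to min(50 · 2^10, 3^10) = min(51200, 59049) = 51200 schemata, far more than the 50 strings explicitly stored.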

  8. Formal Account • See the PDF document containing a formal account of the effect of selection, crossover and mutation, culminating in the Schema Theorem.
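For reference, the Schema Theorem that the formal account culminates in is standardly stated as follows (a standard statement reproduced here for convenience; the PDF's exact notation may differ):

```latex
E[m(H, t+1)] \;\ge\; m(H, t)\,\frac{f(H)}{\bar{f}}
\left[\, 1 - p_c\,\frac{\delta(H)}{L-1} - o(H)\,p_m \,\right]
```

where m(H, t) is the number of instances of schema H at generation t, f(H) the average fitness of those instances, f̄ the population's average fitness, and p_c and p_m the crossover and mutation probabilities.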

  9. Prototypical Steady-state GA • P ← p randomly generated hypotheses • For each h in P, compute fitness(h) • While max_h fitness(h) < threshold (*) • Ps ← select r·p individuals from P (e.g., FPS, RS, tournament) • Apply crossover to random pairs in Ps and add all offspring to Po • Select m% of the individuals in Po with uniform probability and apply mutation (i.e., flip one of their bits at random) • Pw ← r·p weakest individuals in P • P ← P – Pw + Po • For each h in P, compute fitness(h)
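A runnable Python sketch of the loop above, under illustrative assumptions (bit-string hypotheses, fitness-proportionate selection, one-point crossover; the parameter names follow the slide, but the toy one-max fitness and all function names are mine, not the slides'):

```python
import random

def steady_state_ga(fitness, L=20, p=50, r=0.4, m=0.1,
                    threshold=20, max_gens=500, rng=random.Random(0)):
    """Steady-state GA per the slide: each generation, r*p parents are
    selected, their offspring replace the weakest individuals in P,
    and m% of the offspring receive a single random bit-flip."""
    P = [[rng.randint(0, 1) for _ in range(L)] for _ in range(p)]
    for _ in range(max_gens):
        if max(fitness(h) for h in P) >= threshold:
            break
        # Fitness-proportionate selection (FPS) of r*p parents.
        weights = [fitness(h) for h in P]
        Ps = rng.choices(P, weights=weights, k=int(r * p))
        # One-point crossover on random pairs; keep all offspring in Po.
        rng.shuffle(Ps)
        Po = []
        for a, b in zip(Ps[0::2], Ps[1::2]):
            cut = rng.randrange(1, L)
            Po += [a[:cut] + b[cut:], b[:cut] + a[cut:]]
        # Mutate m% of the offspring: flip one of their bits at random.
        for h in rng.sample(Po, k=int(m * len(Po))):
            h[rng.randrange(L)] ^= 1
        # Replace the weakest individuals with the offspring.
        P.sort(key=fitness)
        P = P[len(Po):] + Po
    return max(P, key=fitness)

# Toy "one-max" fitness: count of 1-bits; threshold 20 means all ones.
best = steady_state_ga(fitness=sum)
print(sum(best), best)
```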

  10. Influence of Learning • Baldwinian evolution: learned behaviour causes changes only to the fitness landscape • Lamarckian evolution: learned behaviour also causes changes to the parents' genotypes • Example: • … calculating fitness involves two steps, namely k-means clustering and NAP classification. The effect of k-means clustering is to refine the starting positions of the centroids to more “representative” final positions. At the individual's level, this may be viewed as a form of learning, since NAP classification based on the final centroids' positions is most likely to yield better results than NAP classification based on their starting positions. Hence, through k-means clustering, an individual improves its performance. As fitness is computed after learning, GA-RBF makes implicit use of the Baldwin effect. (Here, we view the result of k-means clustering, namely the improved positions of the centroids, as the learned “traits”). A straightforward way of implementing Lamarckian evolution consists of coding the new centroids' positions back onto the chromosomes of the individuals of the current generation, prior to genetic recombination.
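A schematic Python contrast of the two modes, following the slide's GA-RBF example (all names here, refine_centroids, evaluate, encode, are illustrative placeholders, not the slides' code):

```python
def fitness_baldwinian(genome, refine_centroids, evaluate):
    """Baldwinian: learning (e.g., k-means refinement of the centroids)
    only affects the fitness value; the genome itself is left untouched."""
    learned = refine_centroids(genome)   # learn from the decoded genome
    return evaluate(learned)             # fitness is computed after learning

def fitness_lamarckian(genome, refine_centroids, evaluate, encode):
    """Lamarckian: the learned centroid positions are also written back
    onto the chromosome, prior to genetic recombination."""
    learned = refine_centroids(genome)
    genome[:] = encode(learned)          # code learned traits back into genome
    return evaluate(learned)
```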

  11. Conclusion • Genetic algorithms are used primarily for: • Optimization problems (e.g., TSP) • Hybrid systems (e.g., NN evolution) • Artificial life • Learning in classifier systems
