This research investigates the application of Genetic Programming (GP) to uncover intriguing relationships in mathematical data. Focusing on Euler's identity (V - E + F = 2), the study demonstrates how GP can derive this fundamental equation from limited datasets, such as the properties of cubes, tetrahedra, and octahedra. Over 50 generations and a population of 4000 candidate solutions, the research employs crossover techniques in a fitness-driven search space to efficiently converge on solutions, illustrating the potential of machine learning methods in mathematical discovery.
Automated discovery in math • Machine learning techniques (GP, ILP, etc.) have been successfully applied in science • How about mathematics? Can they be used to discover interesting relationships in mathematical “data”? • This is an exploration of using GP for that purpose • Specifically, using GP to automatically discover Euler’s identity (V – E + F = 2) from a fairly limited amount of data
Cubes V = 8 E = 12 F = 6 V – E + F = 8 – 12 + 6 = 2
Tetrahedra V = 4 E = 6 F = 4 V – E + F = 4 – 6 + 4 = 2
Octahedra V = 6 E = 12 F = 8 V – E + F = 6 – 12 + 8 = 2
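The three worked examples can be checked mechanically. The sketch below is only an illustration: the names `SOLIDS` and `euler_characteristic` are hypothetical, and the counts are the V, E, F values quoted on the slides.

```python
# (V, E, F) counts as given for the cube, tetrahedron and octahedron.
SOLIDS = {
    "cube": (8, 12, 6),
    "tetrahedron": (4, 6, 4),
    "octahedron": (6, 12, 8),
}

def euler_characteristic(v, e, f):
    """Return V - E + F for one solid."""
    return v - e + f

for name, (v, e, f) in SOLIDS.items():
    print(name, euler_characteristic(v, e, f))
```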
At a glance • 50 generations • Population: 4000 ASTs • New individuals per generation: 3600 (90% of population) • Maximum AST depth: 13 • Ramped half-and-half initialization • 3 non-terminals: +, -, * • 12 terminals: V, E, F, 1, 2, …, 9 • Crossover, no mutation
Genetic algorithms (GA) • Search a space of solution attempts (“individuals”) • Use natural selection to guide the search • Must have a fitness function that can evaluate any given individual • Individuals procreate by exchanging (recombining) “genetic material”
Example: SAT solving • Problem: Given a CNF formula P over n variables x1,…,xn, find a satisfying assignment • Search space: all n-bit strings • Fitness measure for a given individual b1 … bn: # of satisfied clauses in P • Genetic operations: crossover and mutation
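A minimal sketch of the fitness measure for the SAT example. The clause encoding (signed integers, in the style of the DIMACS format) and the name `sat_fitness` are assumptions made for illustration; the slides do not specify a representation.

```python
def sat_fitness(bits, clauses):
    """Count the clauses satisfied by an assignment.

    bits[i] is the value of variable x_{i+1}. A literal k > 0 stands for x_k
    and k < 0 stands for NOT x_|k| (an assumed, DIMACS-like encoding).
    """
    def literal_true(lit):
        value = bits[abs(lit) - 1]
        return value == 1 if lit > 0 else value == 0

    return sum(any(literal_true(lit) for lit in clause) for clause in clauses)

# (x1 OR NOT x2) AND (x2 OR x3) AND (NOT x1 OR NOT x3)
clauses = [[1, -2], [2, 3], [-1, -3]]
print(sat_fitness([1, 0, 0], clauses))  # 2 of the 3 clauses are satisfied
print(sat_fitness([1, 1, 0], clauses))  # all 3 clauses are satisfied
```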
Crossover: a1 … aj-1 | aj … an + b1 … bj-1 | bj … bn → a1 … aj-1 | bj … bn and b1 … bj-1 | aj … an • Mutation: 0 1 1 0 1 0 0 1 → 0 1 1 0 0 0 0 1
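The two bit-string operations can be sketched as follows. The function names are hypothetical, and the cut index `j` is 0-based here, whereas the slide numbers positions from 1.

```python
import random

def one_point_crossover(a, b, j):
    """Swap the tails of two equal-length bit lists at 0-based cut index j."""
    return a[:j] + b[j:], b[:j] + a[j:]

def flip_bit(bits, i):
    """Return a copy of bits with position i flipped."""
    return bits[:i] + [1 - bits[i]] + bits[i + 1:]

def mutate(bits, rng=random):
    """Flip one uniformly chosen bit (one common form of mutation)."""
    return flip_bit(bits, rng.randrange(len(bits)))

print(one_point_crossover([0, 0, 0, 0], [1, 1, 1, 1], 1))
print(flip_bit([0, 1, 1, 0, 1, 0, 0, 1], 4))  # the slide's mutation example
```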
Generic GA algorithm • Parameterized over: N (maximum number of generations), P (population size), G (new individuals per generation) • 1. Construct a random initial population • 2. Set i := 1 • 3. If i > N then halt • 4. Compute the fitness of each individual; if the fittest solves the problem, halt • 5. Create a new population: pick P – G individuals and copy them, then create G new individuals by repeated applications of genetic operations • 6. Set i := i + 1 and go to step 3
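The loop above might be sketched as below. This is not the author's implementation: `run_ga` and its callback parameters are hypothetical names, N, P and G are read as generation limit, population size and number of new individuals, and copying the fittest P – G individuals is one assumed way of "picking" survivors. The toy problem (maximize the number of 1-bits) is only there to make the sketch runnable.

```python
import random

def run_ga(new_individual, fitness, is_solution, breed, N, P, G, rng):
    """Generic GA loop parameterized over N, P and G (see assumptions above)."""
    population = [new_individual(rng) for _ in range(P)]
    for _ in range(N):
        best = max(population, key=fitness)
        if is_solution(best):
            return best
        # Copy P - G individuals; here simply the fittest ones (an assumption).
        ranked = sorted(population, key=fitness, reverse=True)
        survivors = ranked[:P - G]
        # Create G new individuals by applying genetic operations.
        children = [breed(population, rng) for _ in range(G)]
        population = survivors + children
    return max(population, key=fitness)

# Toy usage: 8-bit strings, fitness = number of 1-bits.
def random_bits(rng):
    return [rng.randint(0, 1) for _ in range(8)]

def breed_bits(population, rng):
    a, b = rng.sample(population, 2)   # uniform parent choice, for brevity
    j = rng.randrange(1, len(a))       # one-point crossover
    return a[:j] + b[j:]

best = run_ga(random_bits, sum, lambda ind: sum(ind) == 8, breed_bits,
              N=50, P=40, G=36, rng=random.Random(0))
print(best, sum(best))
```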
Selection • How is an individual “picked” for reproduction or copying? • Main idea: the probability that an individual is selected should be proportional to the individual’s fitness • Many ways to ensure that. One method is tournament selection: • Pick 0 < k <= P individuals randomly • Select the fittest of the k • When k = 1: No selection pressure • When k = P: Too much selection pressure
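Tournament selection is short enough to sketch directly. The name `tournament_select` is hypothetical, and drawing the k contestants without replacement is an assumption; the slide only says they are picked randomly.

```python
import random

def tournament_select(population, fitness, k, rng=random):
    """Pick k individuals at random and return the fittest of them."""
    if not 0 < k <= len(population):
        raise ValueError("k must satisfy 0 < k <= P")
    contestants = rng.sample(population, k)  # without replacement (assumed)
    return max(contestants, key=fitness)

scores = [3, 1, 4, 1, 5]
rng = random.Random(0)
print(tournament_select(scores, lambda x: x, 1, rng))  # k = 1: a uniform pick
print(tournament_select(scores, lambda x: x, 5, rng))  # k = P: always the best
```

With k = 1 the fitness plays no role, and with k = P the overall fittest individual wins every tournament, which matches the two extremes named on the slide.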
Genetic Programming (GP) • An instance of the generic GA scheme • Individuals are now programs, i.e., syntactic objects • Search space is kept finite by bounding program size • Programs are represented as ASTs (abstract syntax trees)
Programs as ASTs • Source: if x > 0 then y := x * x else y := z + 1 • Parsing yields the AST: if(>(x, 0), :=(y, *(x, x)), :=(y, +(z, 1)))
Program structure in GP • Programs are usually simple Herbrand terms, i.e., functional expressions • AST leaves are called terminals • Internal nodes are non-terminals • Non-terminals are function symbols (e.g. +) • Terminals are constants and variables • Terminals + non-terminals must be sufficient for expressing solutions
Viewing a functional AST as a “program” • AST: +(*(x, 2), y), i.e., x * 2 + y • The program has two “inputs”, x and y. Given specific values for these, it produces a unique result as output
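Evaluating such a tree is a small recursive function. In this sketch the nested-tuple encoding `(op, left, right)` and the name `evaluate` are assumptions, and the example tree is one reading of the slide's figure (x * 2 + y).

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def evaluate(tree, env):
    """Evaluate an AST given as nested tuples (op, left, right)."""
    if isinstance(tree, tuple):
        op, left, right = tree
        return OPS[op](evaluate(left, env), evaluate(right, env))
    if isinstance(tree, str):
        return env[tree]   # variable: look up its input value
    return tree            # integer constant

program = ("+", ("*", "x", 2), "y")
print(evaluate(program, {"x": 3, "y": 4}))  # 3 * 2 + 4 = 10
```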
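AST Crossover • [Figure: two parent ASTs over subtrees T1 … T6, each with a marked crossover point (crossover pt 1, crossover pt 2), and the two children obtained by exchanging the subtrees rooted at those points]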
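Subtree crossover on ASTs can be sketched as follows. The tuple encoding, the path representation (a tuple of child indices) and all function names are assumptions, and the two parents are illustrative trees, not necessarily those drawn in the figure.

```python
def subtree_paths(tree, path=()):
    """Yield the path (tuple of child indices) of every node in the tree."""
    yield path
    if isinstance(tree, tuple):
        for i in (1, 2):
            yield from subtree_paths(tree[i], path + (i,))

def get_subtree(tree, path):
    """Return the subtree reached by following path from the root."""
    for i in path:
        tree = tree[i]
    return tree

def replace_subtree(tree, path, new):
    """Return a copy of tree with the subtree at path replaced by new."""
    if not path:
        return new
    parts = list(tree)
    parts[path[0]] = replace_subtree(tree[path[0]], path[1:], new)
    return tuple(parts)

def crossover(p1, path1, p2, path2):
    """Exchange the subtrees at the two crossover points."""
    s1, s2 = get_subtree(p1, path1), get_subtree(p2, path2)
    return replace_subtree(p1, path1, s2), replace_subtree(p2, path2, s1)

parent1 = ("+", ("+", "T1", "T2"), "T4")
parent2 = ("-", ("*", "T5", "T6"), "T3")
print(crossover(parent1, (1,), parent2, (1,)))
```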
Initial population • Built randomly • Two methods for building a random AST: • Full method: All branches are equally long • Grow method: Different subtrees can have different sizes (but less than the maximum) • More usual: ramped half-and-half initialization: half of the trees are built with one method, the other half with the other method
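A sketch of the three initialization methods, using the function and terminal sets from the "At a glance" slide. The function names, the 0.3 probability of stopping early in `grow`, and the way depths are "ramped" from 1 up to the maximum are assumptions; the slides only state that half of the trees come from each method.

```python
import random

FUNCTIONS = ["+", "-", "*"]
TERMINALS = ["V", "E", "F", 1, 2, 3, 4, 5, 6, 7, 8, 9]

def full(depth, rng):
    """Full method: every branch reaches exactly the given depth."""
    if depth == 0:
        return rng.choice(TERMINALS)
    return (rng.choice(FUNCTIONS), full(depth - 1, rng), full(depth - 1, rng))

def grow(depth, rng, p_terminal=0.3):
    """Grow method: branches may stop early, so subtrees differ in size."""
    if depth == 0 or rng.random() < p_terminal:
        return rng.choice(TERMINALS)
    return (rng.choice(FUNCTIONS),
            grow(depth - 1, rng, p_terminal),
            grow(depth - 1, rng, p_terminal))

def depth_of(tree):
    """Depth of a tree, counting a lone terminal as depth 0."""
    if not isinstance(tree, tuple):
        return 0
    return 1 + max(depth_of(tree[1]), depth_of(tree[2]))

def ramped_half_and_half(size, max_depth, rng):
    """Alternate full and grow while cycling the depth limit 1..max_depth."""
    population = []
    for i in range(size):
        depth = 1 + (i // 2) % max_depth
        build = full if i % 2 == 0 else grow
        population.append(build(depth, rng))
    return population

population = ramped_half_and_half(20, 4, random.Random(0))
print(max(depth_of(tree) for tree in population))
```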
Problem formulation • Can be cast as a standard symbolic regression problem • View F as a function of E and V, and search the space of all rational functions of two variables (up to a max depth) • Error function: difference between actual # of faces and the result produced by the program • Optimization: minimize the error • Quick convergence
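The regression view can be illustrated with a small error function. Summing absolute differences over the data is an assumption (the slide only says "difference"), candidates are written as plain Python functions of V and E for brevity, and the data are the three solids shown earlier.

```python
# (V, E, F) for the cube, tetrahedron and octahedron.
DATA = [(8, 12, 6), (4, 6, 4), (6, 12, 8)]

def regression_error(candidate, data=DATA):
    """Sum of |actual F - predicted F| over all data points."""
    return sum(abs(f - candidate(v, e)) for v, e, f in data)

print(regression_error(lambda v, e: e - v + 2))  # 0: fits every solid
print(regression_error(lambda v, e: e - v))      # 6: off by 2 on each solid
```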
Another approach • Search space of all identities • Generated as follows: I → T1 = T2 • T → L | T1 + T2 | T1 – T2 | T1 * T2 • L → V | E | F | 1 | 2 | … | 9 • Any other integer can be built from 1,…, 9 and the given non-terminals • Identity is not a non-terminal; it can only appear at the root of an AST
Details • Generate P identities randomly (using ramped half-and-half initialization) • Crossover on two identities S1 = S2 and T1 = T2: • Mate two random subterms Si and Tj from each identity, producing two new subterms Si’ and Tj’ • If either new term is deeper than the max depth, then use one of the original parents • Replace Si and Tj in the identities by Si’ and Tj’ • No mutation
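The crossover step on identities might look like the sketch below. Several choices are assumptions: an identity is stored as a pair (left term, right term), terms are nested tuples, depth is measured per term with a lone terminal at depth 0, and all names are hypothetical. The depth limit of 13 is the figure from the "At a glance" slide.

```python
import random

MAX_DEPTH = 13  # maximum AST depth quoted on the "At a glance" slide

def depth(t):
    return 1 + max(depth(t[1]), depth(t[2])) if isinstance(t, tuple) else 0

def paths(t, path=()):
    """Yield the path (tuple of child indices) of every node in term t."""
    yield path
    if isinstance(t, tuple):
        for i in (1, 2):
            yield from paths(t[i], path + (i,))

def get(t, path):
    for i in path:
        t = t[i]
    return t

def put(t, path, new):
    if not path:
        return new
    parts = list(t)
    parts[path[0]] = put(t[path[0]], path[1:], new)
    return tuple(parts)

def mate(s, t, rng):
    """Swap random subtrees of s and t; fall back to the parent if too deep."""
    ps = rng.choice(list(paths(s)))
    pt = rng.choice(list(paths(t)))
    s_new = put(s, ps, get(t, pt))
    t_new = put(t, pt, get(s, ps))
    return (s_new if depth(s_new) <= MAX_DEPTH else s,
            t_new if depth(t_new) <= MAX_DEPTH else t)

def crossover_identities(id1, id2, rng):
    """Mate one random side of each identity and put the results back."""
    i, j = rng.randrange(2), rng.randrange(2)
    s_new, t_new = mate(id1[i], id2[j], rng)
    child1, child2 = list(id1), list(id2)
    child1[i], child2[j] = s_new, t_new
    return tuple(child1), tuple(child2)

id1 = (("+", ("-", "V", "E"), "F"), 2)   # V - E + F = 2
id2 = ("V", ("*", "E", 1))               # V = E * 1
print(crossover_identities(id1, id2, random.Random(1)))
```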
Fitness • An identity is evaluated on a given triple of values for V, E, and F • Computing the fitness of an identity S = T: • For each of the k data triples p: • If S = T holds for p, then give the identity a point • Higher score, greater fitness • Maximum fitness: 9, minimum: 0
Problem • Trivially true identities can get perfect scores, e.g.: • V = V • 1 + 2 = 5 – 3 • E – E + E = E • Solution: negative triples, e.g.: • V = 0, E = 0, F = 1 • Trivial identities will hold for such negative triples, but plausible identities will not
Fitness computation • To evaluate an identity S = T: • For each of the k data triples p: • Allocate a point if S = T holds for p • Allocate a second point if S = T does not hold for the negative triple • Maximum score: 18, minimum: 0 • Also impose a penalty of ⌊n/20⌋ points for an identity of length n (to discourage excessively long expressions)
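A sketch of this scoring scheme, under several stated assumptions: each data triple is paired with a negative triple, the two points are awarded independently of each other, the "length" n is the number of AST nodes including the "=" root, and only the three solids shown earlier are used (the slides' maximum of 18 implies nine data triples, which are not listed). All names are hypothetical.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def evaluate(term, env):
    """Evaluate a term given as nested tuples (op, left, right)."""
    if isinstance(term, tuple):
        op, a, b = term
        return OPS[op](evaluate(a, env), evaluate(b, env))
    return env[term] if isinstance(term, str) else term

def size(term):
    """Number of nodes in a term."""
    if isinstance(term, tuple):
        return 1 + size(term[1]) + size(term[2])
    return 1

def holds(identity, triple):
    """Check whether S = T is true for one (V, E, F) triple."""
    env = dict(zip("VEF", triple))
    lhs, rhs = identity
    return evaluate(lhs, env) == evaluate(rhs, env)

def fitness(identity, pairs):
    score = 0
    for positive, negative in pairs:
        if holds(identity, positive):
            score += 1          # point for holding on real data
        if not holds(identity, negative):
            score += 1          # point for failing on the negative triple
    n = size(identity[0]) + size(identity[1]) + 1  # +1 for "=" (assumed)
    return score - n // 20      # floor(n / 20) length penalty

NEGATIVE = (0, 0, 1)  # the negative triple given as an example on the slide
PAIRS = [((8, 12, 6), NEGATIVE), ((4, 6, 4), NEGATIVE), ((6, 12, 8), NEGATIVE)]

euler = (("+", ("-", "V", "E"), "F"), 2)   # V - E + F = 2
trivial = ("V", "V")                       # V = V
print(fitness(euler, PAIRS), fitness(trivial, PAIRS))  # 6 and 3
```

With three data triples the maximum score is 6; the trivially true identity earns only the three "holds" points because it also holds on the negative triple.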