INTRODUCTION
Evolutionary algorithms (EAs) are used with artificial neural networks (ANNs) in areas such as:
• Connection weight training
• Architecture design
• Learning rule adaptation
• Input feature selection
• Connection weight initialization
• Rule extraction from ANNs
Artificial Neural Network Review
An ANN consists of a set of interconnected processing elements, also referred to as nodes or neurons.
ANNs fall into two categories:
• Feed-forward
• Recurrent
Artificial Neural Network Review (contd.)
[Figure: a neuron with its bias input; a 3-4-2 network]
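As a minimal sketch of what the figure labels describe, the code below computes one neuron's output and chains neurons into a 3-4-2 network. It assumes the 3-4-2 label means 3 inputs, 4 hidden nodes and 2 outputs in a fully connected feed-forward layout, and it assumes a logistic (sigmoid) transfer function; the slide specifies neither, and all function names are illustrative.

```python
import math
import random

def sigmoid(x):
    """Logistic transfer function (an assumption; the slide does not name one)."""
    return 1.0 / (1.0 + math.exp(-x))

def neuron(inputs, weights, bias):
    """One neuron: weighted sum of the inputs plus a bias, passed through the transfer function."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)

def layer(inputs, weight_rows, biases):
    """A layer is a list of neurons that all read the same inputs."""
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]

def make_layer(n_in, n_out, rng):
    """Random weights and biases in [-1, 1] for a fully connected layer."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.uniform(-1, 1) for _ in range(n_out)]
    return weights, biases

rng = random.Random(0)
hidden_w, hidden_b = make_layer(3, 4, rng)   # 3 inputs -> 4 hidden nodes
output_w, output_b = make_layer(4, 2, rng)   # 4 hidden nodes -> 2 outputs

def forward(x):
    """Feed-forward pass: input -> hidden layer -> output layer."""
    return layer(layer(x, hidden_w, hidden_b), output_w, output_b)

outputs = forward([0.5, -0.2, 0.1])
```

The network has 3*4 + 4 + 4*2 + 2 = 26 adjustable parameters (weights and biases), which is the quantity that training adjusts.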
Learning in Artificial Neural Networks
Learning in ANNs is usually accomplished through training. Training refers to adjusting the weights of an ANN iteratively so that the trained (or learned) ANN can perform certain tasks.
Types of learning in ANNs:
• Supervised
• Unsupervised
• Reinforcement
Back-propagation (BP), a popular gradient descent-based optimization algorithm, is an example of supervised learning.
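To illustrate iterative weight adjustment by gradient descent, here is a small sketch for a single linear neuron. It is not full back-propagation (there is no hidden layer to propagate errors through), and the learning rate, epoch count and training data are illustrative assumptions.

```python
def train_linear_neuron(samples, lr=0.1, epochs=500):
    """Supervised training of y = w*x + b by gradient descent on the squared error."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, target in samples:
            error = (w * x + b) - target
            # Gradient of 0.5 * error**2 with respect to w is error * x, and to b is error.
            w -= lr * error * x
            b -= lr * error
    return w, b

# Training pairs (input, desired output) generated by y = 2x + 1.
samples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
w, b = train_linear_neuron(samples)
```

After training, `w` and `b` should be close to 2 and 1, the values that generated the data.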
Evolutionary Algorithms Review
EAs are a class of population-based stochastic search algorithms inspired by concepts of natural evolution.
The EA family includes:
• Evolution Strategies (ES)
• Evolutionary Programming (EP)
• Genetic Algorithms (GA)
Evolutionary Algorithms Review (contd.)
• EAs apply well to complex problems with many local optima.
• EAs are less susceptible to local optima than traditional gradient-based search algorithms.
• EAs do not depend on gradient information and are thus well suited to problems where such information is unavailable or very costly to obtain or estimate.
• Evolutionary algorithms (Fogel 1995) are global optimization techniques that use selection and recombination as their primary operators to tackle optimization problems.
Evolution in EANNs
Evolution has been used with Artificial Neural Networks in three areas:
• Connection weights: evolution introduces an adaptive and global approach to training.
• Architectures: evolution allows ANN topologies to adapt without human intervention.
• Learning rules: evolution provides an adaptive process for the automatic discovery of novel learning rules.
The Evolution of Connection Weights
• Weight training in ANNs is usually formulated as the minimization of an error function, for example the mean square error between the desired and actual outputs averaged over all examples, by iteratively adjusting the connection weights.
• Training algorithms such as BP have drawbacks because they rely on gradient descent: they often get trapped in a local minimum of the error function and can then fail to find a global minimum.
• One way to overcome the shortcomings of gradient descent-based training algorithms is to adopt EANNs.
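The mean square error mentioned above can be sketched as follows. Averaging over output nodes as well as over examples is a normalization choice made here for illustration; the slide only says the error is averaged over all examples.

```python
def mean_square_error(targets, outputs):
    """Average squared difference between desired and actual outputs.

    targets and outputs are lists of per-example output vectors.
    """
    total, count = 0.0, 0
    for target_vec, output_vec in zip(targets, outputs):
        for t, o in zip(target_vec, output_vec):
            total += (t - o) ** 2
            count += 1
    return total / count

desired = [[1.0, 0.0], [0.0, 1.0]]
actual = [[0.5, 0.0], [0.0, 0.0]]
mse = mean_square_error(desired, actual)   # (0.25 + 0 + 0 + 1) / 4 = 0.3125
```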
The Evolution of Connection Weights (contd.)
EAs can be used to find a near-optimal set of connection weights globally without computing gradient information.
The evolutionary approach to weight training consists of two main phases:
1. Decide the representation of connection weights:
   • Binary representation
   • Real-number representation
2. Run the evolutionary process simulated by an EA.
Changing the representation and search operators can lead to different training performance.
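The two representations can be contrasted with a short sketch. The 8 bits per weight and the [-1, 1] weight range are illustrative assumptions, as is the linear mapping from bit groups to weights; other binary codings are possible.

```python
def decode_binary(bits, bits_per_weight=8, low=-1.0, high=1.0):
    """Decode a bit list into weights; each fixed-width group maps linearly onto [low, high]."""
    max_int = 2 ** bits_per_weight - 1
    weights = []
    for i in range(0, len(bits), bits_per_weight):
        chunk = bits[i:i + bits_per_weight]
        value = int("".join(str(bit) for bit in chunk), 2)
        weights.append(low + (high - low) * value / max_int)
    return weights

# Binary representation: 3 weights stored as 24 bits.
binary_genotype = [0] * 8 + [1] * 8 + [1, 0, 0, 0, 0, 0, 0, 0]
binary_weights = decode_binary(binary_genotype)

# Real-number representation: the genotype is the weight vector itself.
real_genotype = [-1.0, 1.0, 1.0 / 255.0]
```

With the binary form, search operators act on bits (for example bit-flip mutation), and precision is limited by the number of bits per weight. With the real-number form, operators act on the weights directly (for example Gaussian perturbation), which is one reason the two can train differently.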
The Evolution of Connection Weights (contd.)
The typical cycle of the evolution of connection weights:
1. Decode each individual (genotype) in the current generation into a set of connection weights and construct a corresponding ANN with those weights.
2. Evaluate each ANN by computing its total mean square error between actual and target outputs (other error functions can also be used). The fitness of an individual is determined by the error: the higher the error, the lower the fitness. The optimal mapping from the error to the fitness is problem dependent. A regularization term may be included in the fitness function to penalize large weights.
3. Select parents for reproduction based on their fitness.
4. Apply search operators, such as crossover and/or mutation, to the parents to generate offspring, which form the next generation.
The evolution stops when the fitness is greater than a predefined value or the population has converged.
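The cycle above can be sketched on a toy task. Everything specific here is an assumption chosen for illustration: the XOR problem, the 2-2-1 network, the real-number representation, tournament selection, one-point crossover, Gaussian mutation, elitism, and all parameter values. Fitness is taken as the negated error, so selecting the lowest error selects the highest fitness.

```python
import math
import random

XOR = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]
N_WEIGHTS = 9  # 2-2-1 network: 4 hidden weights + 2 hidden biases + 2 output weights + 1 output bias

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-max(-60.0, min(60.0, x))))  # clamp to avoid overflow

def predict(w, x):
    """Step 1: decode the genotype (flat list of reals) into a 2-2-1 network and run it."""
    h0 = sigmoid(w[0] * x[0] + w[1] * x[1] + w[2])
    h1 = sigmoid(w[3] * x[0] + w[4] * x[1] + w[5])
    return sigmoid(w[6] * h0 + w[7] * h1 + w[8])

def error(w):
    """Step 2: mean square error over the training examples (lower error = higher fitness)."""
    return sum((predict(w, x) - t) ** 2 for x, t in XOR) / len(XOR)

def tournament(pop, errors, rng, k=3):
    """Step 3: return the lowest-error individual among k randomly drawn ones."""
    picks = rng.sample(range(len(pop)), k)
    return pop[min(picks, key=lambda i: errors[i])]

def evolve_weights(pop_size=30, generations=100, sigma=0.5, target=0.01, seed=0):
    rng = random.Random(seed)
    pop = [[rng.uniform(-1, 1) for _ in range(N_WEIGHTS)] for _ in range(pop_size)]
    history = []  # best error seen in each evaluated generation
    for _ in range(generations):
        errors = [error(w) for w in pop]
        best = min(range(pop_size), key=lambda i: errors[i])
        history.append(errors[best])
        if errors[best] < target:          # stop once the error is low enough
            break
        next_pop = [pop[best][:]]          # elitism: carry the best individual over unchanged
        while len(next_pop) < pop_size:
            a = tournament(pop, errors, rng)
            b = tournament(pop, errors, rng)
            cut = rng.randrange(1, N_WEIGHTS)        # Step 4: one-point crossover
            child = a[:cut] + b[cut:]
            # Gaussian mutation applied to each gene with probability 0.2
            child = [g + rng.gauss(0, sigma) if rng.random() < 0.2 else g for g in child]
            next_pop.append(child)
        pop = next_pop
    return min(pop, key=error), history

best_weights, history = evolve_weights()
```

Because the best individual is always copied into the next generation, the best error recorded in `history` never increases. Whether the run actually reaches the target error depends on the seed and parameters.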
Hybrid Training
• Most EAs are weak at fine-tuned local search, although they perform well at global search.
• Evolutionary training can be combined with a local search procedure to increase its efficiency; BP can be used for the local search.
• Hybrid training has been successfully applied in many application areas.
• Tests have shown that hybrid GA/BP training is more efficient than either GA or BP alone.
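The division of labour in hybrid training can be sketched on a toy problem. As a loud simplification, a one-dimensional multimodal function stands in for an ANN's error surface, and plain gradient descent stands in for BP; the function, the selection scheme and all parameters are assumptions made for illustration.

```python
import math
import random

def error_surface(x):
    """Toy multimodal error (1-D Rastrigin function); stands in for an ANN's training error."""
    return x * x + 10.0 * (1.0 - math.cos(2.0 * math.pi * x))

def gradient(x):
    """Analytic derivative of error_surface."""
    return 2.0 * x + 20.0 * math.pi * math.sin(2.0 * math.pi * x)

def global_search(rng, pop_size=20, generations=30, sigma=0.5):
    """Coarse evolutionary search: mutate every individual, then keep the best half."""
    pop = [rng.uniform(-5.0, 5.0) for _ in range(pop_size)]
    for _ in range(generations):
        offspring = [x + rng.gauss(0.0, sigma) for x in pop]
        pop = sorted(pop + offspring, key=error_surface)[:pop_size]
    return pop[0]

def local_search(x, lr=0.001, steps=500):
    """Gradient descent from x; plays the role that BP plays in hybrid training."""
    for _ in range(steps):
        candidate = x - lr * gradient(x)
        if error_surface(candidate) >= error_surface(x):
            break  # no further improvement from this step size
        x = candidate
    return x

rng = random.Random(1)
coarse = global_search(rng)      # EA locates a promising basin
refined = local_search(coarse)   # local search fine-tunes within that basin
```

The local step can only improve on the evolutionary result, since it stops as soon as a step fails to reduce the error.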
The Evolution of Architectures
• Architecture design is important to the success of an ANN because the architecture has a significant impact on the network’s information processing capabilities.
• The architecture of an ANN refers to its topological structure, such as its connectivity, and the transfer function of each node.
• Previously, finding a near-optimal architecture for a given task meant empirically testing different numbers of hidden layers, nodes, and connections.
• Constructive and destructive algorithms represent an effort toward the automatic design of architectures:
  • A constructive algorithm starts with a minimal network and adds new layers, nodes, and connections when necessary.
  • A destructive algorithm starts with a maximal network and deletes unnecessary layers, nodes, and connections.
• These techniques are prone to becoming trapped in local optima.
The Evolution of Architectures (contd.)
Characteristics of the search surface that make EAs a better candidate for searching it than constructive and destructive algorithms:
1. The surface is infinitely large, since the number of possible nodes and connections is unbounded.
2. The surface is not differentiable, since changes in the number of nodes or connections are discrete and can have a discontinuous effect on an EANN’s performance.
3. The surface is complex and noisy, since the mapping from an architecture to its performance is indirect, strongly epistatic, and dependent on the evaluation method used.
4. The surface is deceptive, since similar architectures may have quite different performance.
5. The surface is multimodal, since different architectures may have similar performance.
The Evolution of Architectures (contd.)
The evolutionary approach to architectures consists of two main phases:
1. Choose the genotype representation scheme for architectures.
2. Use an evolutionary algorithm to evolve the ANN architectures.
Types of representation schemes:
• Direct encoding scheme: all details of an architecture (every connection and node) are specified by the chromosome.
• Indirect encoding scheme: only the most important aspects of the architecture, such as the number of hidden layers and the number of hidden nodes in each layer, are encoded. The remaining details are left to the training process to decide.
[Figure: direct encoding for a feed-forward ANN]
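One common way to realize direct encoding for a feed-forward ANN is a connectivity matrix whose entry (i, j) is 1 when node i feeds node j. The sketch below assumes that scheme; the slide's figure is not recoverable, so the 5-node network and the function names are hypothetical. With nodes ordered from inputs to outputs, a feed-forward network only has connections with i < j, so only the part of the matrix above the diagonal needs to be stored.

```python
def encode(matrix):
    """Flatten the strict upper triangle of a connectivity matrix, row by row, into a bit list."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]

def decode(bits, n):
    """Rebuild the n x n connectivity matrix from the chromosome."""
    matrix = [[0] * n for _ in range(n)]
    k = 0
    for i in range(n):
        for j in range(i + 1, n):
            matrix[i][j] = bits[k]
            k += 1
    return matrix

# Hypothetical 5-node network: nodes 0-1 are inputs, 2-3 are hidden, 4 is the output.
connectivity = [
    [0, 0, 1, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0],
]
chromosome = encode(connectivity)   # [0, 1, 1, 0, 1, 0, 1, 0, 1, 1]
```

Flipping one bit of the chromosome adds or removes exactly one connection, which is what makes this encoding direct. Its length grows as n(n-1)/2, which is one motivation for indirect encoding on larger networks.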
The Evolution of Architectures (contd.)
The typical cycle of the evolution of architectures:
1. Decode each individual in the current generation into an architecture. If the indirect encoding scheme is used, further detail of the architecture is specified by some developmental rules or a training process.
2. Train each ANN with the decoded architecture by a predefined learning rule (some parameters of the learning rule could be evolved during training), starting from different sets of random initial connection weights and, if any, learning rule parameters.
3. Compute the fitness of each individual (encoded architecture) according to the above training result and other performance criteria, such as the complexity of the architecture.
4. Select parents for reproduction based on their fitness.
5. Apply search operators to the parents and generate offspring, which form the next generation.
The cycle stops when a satisfactory ANN is found.
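The control flow of this cycle can be sketched as a generic loop. This is a skeleton under stated assumptions, not a full implementation: the caller supplies `decode`, `train`, `complexity` and `mutate`, the loop runs for a fixed number of generations instead of testing for a satisfactory ANN, and selection simply keeps the better half. In the usage part, `fake_train` is a placeholder that returns a made-up error in place of real training, and the genotype is just a hidden-node count (an indirect encoding).

```python
import random

def evolve_architectures(init_pop, decode, train, complexity, mutate,
                         generations=20, restarts=3, penalty=0.01, seed=0):
    """Generic architecture-evolution loop; all callables are supplied by the caller."""
    rng = random.Random(seed)
    pop = list(init_pop)
    best, best_fit = None, float("-inf")
    for _ in range(generations):
        scored = []
        for genotype in pop:
            arch = decode(genotype)                                   # step 1: decode
            errors = [train(arch, rng) for _ in range(restarts)]      # step 2: train from several random starts
            fit = -sum(errors) / restarts - penalty * complexity(arch)  # step 3: error plus complexity penalty
            scored.append((fit, genotype))
            if fit > best_fit:
                best, best_fit = genotype, fit
        scored.sort(key=lambda pair: pair[0], reverse=True)
        parents = [g for _, g in scored[:max(1, len(pop) // 2)]]      # step 4: keep the better half
        children = [mutate(rng.choice(parents), rng)
                    for _ in range(len(pop) - len(parents))]          # step 5: offspring by mutation
        pop = parents + children
    return best, best_fit

# Stand-ins for illustration only.
def decode(genotype):
    return {"hidden_nodes": genotype}

def fake_train(arch, rng):
    """Placeholder for real training: error falls with more hidden nodes, plus noise."""
    return 1.0 / (1 + arch["hidden_nodes"]) + rng.uniform(0.0, 0.05)

def complexity(arch):
    return arch["hidden_nodes"]

def mutate(genotype, rng):
    return max(1, genotype + rng.choice([-1, 1]))

best, best_fit = evolve_architectures([1, 2, 3, 4, 5, 6], decode, fake_train,
                                      complexity, mutate)
```

Averaging over several restarts reflects step 2: the fitness of an architecture is noisy because it depends on the random initial weights, so a single training run can misjudge it.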
The Evolution of Learning Rules
The typical cycle of the evolution of learning rules:
1. Decode each individual in the current generation into a learning rule.
2. Construct a set of ANNs with randomly generated architectures and initial connection weights, and train them using the decoded learning rule.
3. Calculate the fitness of each individual (encoded learning rule) according to the average training result.
4. Select parents from the current generation according to their fitness.
5. Apply search operators to the parents to generate offspring, which form the new generation.
The iteration stops when the population converges or a predefined maximum number of iterations has been reached.
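A heavily simplified sketch of this cycle follows. It assumes a learning rule of the form dw_i = lr * (a * x_i * t + b * x_i * y) for a single linear neuron, where x is the input, t the target and y the output, so each individual is just the pair (a, b). The familiar delta rule is the special case a = 1, b = -1. The "random architecture" is reduced to a random number of inputs. The rule form, the tasks and all parameters are illustrative assumptions.

```python
import random

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def apply_rule(rule, n_inputs, rng, steps=40, lr=0.1):
    """Step 2: train one linear neuron with the decoded rule; return its final mean square error."""
    a, b = rule                                                # step 1: decode genotype into a rule
    true_w = [rng.uniform(-1, 1) for _ in range(n_inputs)]     # random task to learn
    w = [rng.uniform(-1, 1) for _ in range(n_inputs)]          # random initial weights
    data = [[rng.uniform(-1, 1) for _ in range(n_inputs)] for _ in range(10)]
    for step in range(steps):
        x = data[step % len(data)]
        t, y = dot(true_w, x), dot(w, x)
        w = [wi + lr * (a * xi * t + b * xi * y) for wi, xi in zip(w, x)]
    mse = sum((dot(true_w, x) - dot(w, x)) ** 2 for x in data) / len(data)
    return min(mse, 1e6)                                       # cap the error of divergent rules

def fitness(rule, rng, n_networks=5):
    """Step 3: negated average error over networks with random sizes and initial weights."""
    errors = [apply_rule(rule, rng.randint(1, 4), rng) for _ in range(n_networks)]
    return -sum(errors) / n_networks

def evolve_rules(pop_size=20, generations=30, sigma=0.3, seed=0):
    rng = random.Random(seed)
    pop = [[rng.uniform(-2, 2), rng.uniform(-2, 2)] for _ in range(pop_size)]
    for _ in range(generations):                               # fixed iteration budget
        ranked = sorted(pop, key=lambda r: fitness(r, rng), reverse=True)
        parents = ranked[:pop_size // 2]                       # step 4: keep the better half
        children = [[max(-2.0, min(2.0, g + rng.gauss(0, sigma)))   # step 5: clipped Gaussian mutation
                     for g in rng.choice(parents)]
                    for _ in range(pop_size - len(parents))]
        pop = parents + children
    return max(pop, key=lambda r: fitness(r, rng))

best_rule = evolve_rules()
```

Fitness is noisy here because every evaluation draws fresh random networks and tasks, which is why it is averaged over several networks.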