
Machine learning: building agents capable of learning from their own experience





1. Machine learning: building agents capable of learning from their own experience
An autonomous agent is expected to learn from its own experience, not just to utilize knowledge built in by the agent's designer. There are at least two reasons why this is important:
• In complex environments, the agent may encounter situations that are not reflected in its knowledge base.
• In dynamic environments, the world evolves over time. The agent must be able to revise its internal model of the world to reflect these changes.
Note, however, that "... you cannot learn anything unless you almost know it already" (Martin's law, formulated by William Martin in 1979).
We distinguish two kinds of machine learning depending on whether the goal is to "learn" new knowledge or to "update" existing knowledge. Let us refer to the first kind of ML as data mining; it is based on digging useful descriptions out of data and formalizing them into an appropriate representation. The second kind of ML is based on acquiring new information and fitting it into the current knowledge base; this is referred to as knowledge refinement.

2. Data mining methods
• Learning by recording cases (learning by analogy). Situations are recorded as-is; nothing is done with the information until it is needed. When a new situation is encountered and you have to guess a property of an object, find the most similar recorded situation and assume that the unknown property is the same as in that reference situation. (A minimal sketch of this idea follows this list.)
• Learning by building identification trees (learning via rule induction). By looking for regularities in the data, we can build identification trees, which are then used to classify unknown information.
• Learning by training neural networks. In neural nets, neuronlike elements are arranged in networks, which are then used to recognize instances of patterns. The procedure used to train the net is called back-propagation; it alters the effect of one simulated neuron on another in order to improve overall performance.
• Learning by training perceptrons. Perceptrons are a special kind of neural net, simple enough to be viewed as a single neuronlike element. By means of the so-called convergence procedure, the performance of the perceptron can be improved until it correctly classifies objects.
• Learning by simulating evolution. The so-called genetic algorithms are based on ideas analogous to individuals, chromosome crossover, gene mutation, natural selection, etc., and are intended to simulate certain characteristics of heredity and evolution.
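Along these lines, here is a minimal Python sketch of learning by recording cases. The attribute names and the agreement-counting similarity measure are illustrative assumptions, not part of the slides:

def similarity(a, b):
    """Count the attributes on which two cases agree (ignoring the class)."""
    return sum(1 for key in a if key != "class" and a[key] == b.get(key))

def classify_by_analogy(cases, unknown):
    """Return the class of the recorded case most similar to `unknown`."""
    best = max(cases, key=lambda case: similarity(case, unknown))
    return best["class"]

recorded = [
    {"engine": "prop", "wing": "high", "class": "C130"},
    {"engine": "jet",  "wing": "low",  "class": "B747"},
]
print(classify_by_analogy(recorded, {"engine": "jet", "wing": "low"}))  # B747

Note that no generalization happens at storage time; all the work is deferred to the moment a new case must be classified.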

3. Knowledge refinement methods
• Learning by analyzing differences (learning from observations). This is based on analyzing the differences that appear in a sequence of observations (positive and negative examples). The goal is to learn to correctly recognize members of a given class. The learning process starts by declaring the initial example to be the model; this initial model is then incrementally improved using a series of further examples. Negative examples are important for specializing the model, while positive examples allow the model to be generalized so that it recognizes all members of the class. (A sketch of the generalization step follows this list.)
• Learning by explaining experience (explanation-based learning). Explanations derived from causal chains are put together into a new, "simpler" dependency, which can be applied directly the next time a similar initial situation arises.
• Learning by managing multiple models. This method utilizes positive and negative examples to create a version space in which it is possible to determine what it takes to be a member of a class.
• Learning by correcting mistakes (knowledge revision). When an error is identified, the system tries to find the culprit by analyzing the problem-solving process and building an explanation of why the error occurred. The system then uses the explanation to revise the model and get rid of the error.
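A minimal Python sketch of the generalization step of learning by analyzing differences, assuming conjunctive attribute models and positive examples only; the specialization step driven by negative examples is omitted, and all names are illustrative:

ANY = "?"  # wildcard: this attribute no longer constrains membership

def generalize(model, positive):
    """Relax every attribute on which a positive example disagrees."""
    return {attr: (val if positive.get(attr) == val else ANY)
            for attr, val in model.items()}

positives = [
    {"wings": "swept-back", "engines": "jet", "tail": "T-tail"},
    {"wings": "swept-back", "engines": "jet", "tail": "conventional"},
]
model = dict(positives[0])          # the initial example is the model
for example in positives[1:]:
    model = generalize(model, example)
print(model)  # {'wings': 'swept-back', 'engines': 'jet', 'tail': '?'}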

4. Learning via rule induction: an example
Consider the classification task of recognizing different types of aircraft based on their characteristics, and assume that an appropriate and sufficient set of test cases (i.e. examples of correct classifications) is available. Assume also that there are four classes of aircraft: C130, C141, C5A and B747. Our task is to correctly classify an unknown object as a member of one of these four classes.
Step 1: Identify the classes. Here we have four classes of aircraft: C130, C141, C5A and B747.
Step 2: Identify the attributes of class members.
• Number of engines (2, 3, 4)
• Type of engines (jet, propeller)
• Wing position (high, low)
• Wing shape (swept back, conventional)
• Tail shape (T-shaped, conventional)
• Bulges on the fuselage (aft of the cockpit, aft of the wing, under the wing, none)
The remaining attributes (size and dimensions, color and markings, speed and altitude) can be ignored.

5. Example (cont.) The rule induction process consists of the following steps:
• Building a table describing objects, selected attributes and their values:

attribute      | C130         | C141         | C5A        | B747
engine type    | prop         | jet          | jet        | jet
wing position  | high         | high         | high       | low
wing shape     | conventional | swept-back   | swept-back | swept-back
tail           | conventional | T-tail       | T-tail     | conventional
bulges         | under wings  | aft of wings | none       | aft of cockpit
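For later reference, the table might be written out as a small Python structure; the key names are one possible encoding, not taken from the slides:

aircraft = {
    "C130": {"engine type": "prop", "wing position": "high",
             "wing shape": "conventional", "tail": "conventional",
             "bulges": "under wings"},
    "C141": {"engine type": "jet", "wing position": "high",
             "wing shape": "swept-back", "tail": "T-tail",
             "bulges": "aft of wings"},
    "C5A":  {"engine type": "jet", "wing position": "high",
             "wing shape": "swept-back", "tail": "T-tail",
             "bulges": "none"},
    "B747": {"engine type": "jet", "wing position": "low",
             "wing shape": "swept-back", "tail": "conventional",
             "bulges": "aft of cockpit"},
}
# Every attribute-value pattern is distinct, so a tree over these
# attributes can separate all four classes.
print(aircraft["B747"]["bulges"])  # aft of cockpit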

6. Building the decision tree, where each node is either a question about the value of a given attribute or a conclusion. Edges coming out of a question node represent the possible values of the attribute. Let us choose (arbitrarily) the root node of the tree to be the engine type (a "?" marks a branch matched by none of the four aircraft):

engine type
  prop: C130
  jet: wing shape
    conventional: ?
    swept-back: wing position
      low: B747
      high: tail shape
        conventional: ?
        T-tail: bulges
          none: C5A
          aft of wings: C141
          aft of cockpit: ?
          under wings: ?
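One way to sketch this tree in Python: a question node is an (attribute, branches) pair, a leaf is a class name, and None marks an impossible branch. The encoding is an assumption for illustration, not from the slides:

tree = ("engine type", {
    "prop": "C130",
    "jet": ("wing shape", {
        "conventional": None,                 # impossible branch
        "swept-back": ("wing position", {
            "low": "B747",
            "high": ("tail shape", {
                "conventional": None,         # impossible branch
                "T-tail": ("bulges", {
                    "none": "C5A",
                    "aft of wings": "C141",
                    "aft of cockpit": None,   # impossible branch
                    "under wings": None,      # impossible branch
                }),
            }),
        }),
    }),
})

def classify(node, obj):
    """Follow question nodes by the object's attribute values until a leaf."""
    while isinstance(node, tuple):
        attribute, branches = node
        node = branches[obj[attribute]]
    return node

plane = {"engine type": "jet", "wing shape": "swept-back",
         "wing position": "high", "tail shape": "T-tail", "bulges": "none"}
print(classify(tree, plane))  # C5A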

7. Decision trees are not unique; we may have alternative trees by reordering the nodes. This way we can eliminate nodes that lead to impossible conclusions. The following is an alternative tree:

engine type
  prop: C130
  jet: wing position
    low: B747
    high: bulges
      none: C5A
      aft of wings: C141

8. Generating rules from trees by means of the following algorithm:
A. Identify a conclusion node that has not yet been dealt with.
B. Trace the path from the conclusion node backward to the root node.
C. The conclusion forms the "then" part of the rule, and the rest of the nodes along the path form the "if" part of the rule.
D. Repeat this process for each conclusion node.
The following rules are acquired from the latter decision tree (a sketch of this path tracing appears below the rules):
Rule 1: If (engine-type = prop) Then (plane = C130)
Rule 2: If (engine-type = jet) (wing-position = low) Then (plane = B747)
Rule 3: If (engine-type = jet) (wing-position = high) (bulges = none) Then (plane = C5A)
Rule 4: If (engine-type = jet) (wing-position = high) (bulges = aft-of-wings) Then (plane = C141)
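A minimal Python sketch of steps A-D, run over the alternative tree of slide 7 in the same encoding as the earlier tree sketch. Accumulating tests from the root down yields exactly the rules obtained by tracing each conclusion node back to the root:

tree = ("engine-type", {
    "prop": "C130",
    "jet": ("wing-position", {
        "low": "B747",
        "high": ("bulges", {"none": "C5A", "aft-of-wings": "C141"}),
    }),
})

def extract_rules(node, conditions=()):
    """Yield (conditions, conclusion) for every reachable conclusion node."""
    if not isinstance(node, tuple):           # conclusion node
        if node is not None:
            yield conditions, node
        return
    attribute, branches = node
    for value, subtree in branches.items():
        yield from extract_rules(subtree, conditions + ((attribute, value),))

for conditions, conclusion in extract_rules(tree):
    tests = " ".join(f"({a} = {v})" for a, v in conditions)
    print(f"If {tests} Then (plane = {conclusion})")

Running this prints the four rules above, one per conclusion node of the tree.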

9. Note that this rule base is not the most efficient one. We can obtain a more efficient set of rules (efficiency is measured here by the volume of data the system needs in order to correctly classify an object) given the following tree:

bulges
  none: C5A
  aft of wings: C141
  aft of cockpit: B747
  under wings: C130

The corresponding rule base is the following:
Rule 1: If (bulges = none) Then (plane = C5A)
Rule 2: If (bulges = aft-of-wings) Then (plane = C141)
Rule 3: If (bulges = aft-of-cockpit) Then (plane = B747)
Rule 4: If (bulges = under-wings) Then (plane = C130)

10. The ID3 algorithm for rule generation. Note that learning based on decision trees is very limited; it can only be applied in very simple, completely specified worlds. The ID3 algorithm is an extension of the decision tree method that provides a more systematic way to acquire rules from test cases.
Example: Assume you want to build a KBS advising about market investments based on a set of historical cases. Assume also that investment opportunities are limited to:
• Investment in blue chip stocks.
• Investment in North American gold mining stocks.
• Investment in mortgage-related securities.
The system must determine the most successful investment for a given set of conditions.
Step 1: Identification of a set of attributes:
• interest rates
• amount of cash available in Japan, Europe and the U.S.
• the degree of international tension.

11. Step 2: Given historical data, build a table representing these cases. Fund type, interest rates, cash available and tension are the attributes; fund value is the class.

case   | Fund type        | Interest rates | Cash available | Tension | Fund value
case 1 | Blue chip stocks | high           | high           | medium  | medium
case 2 | Blue chip stocks | low            | high           | medium  | high
case 3 | Blue chip stocks | medium         | low            | high    | low
case 4 | Gold stocks      | high           | high           | medium  | high
case 5 | Gold stocks      | low            | high           | medium  | medium
case 6 | Gold stocks      | medium         | low            | high    | medium
case 7 | Mortgage-related | high           | high           | medium  | low
case 8 | Mortgage-related | low            | high           | medium  | high
case 9 | Mortgage-related | medium         | low            | high    | low

12. Example (cont.)
Step 3: Build a decision tree based on the entropy of each attribute. The entropy is a measure of the uncertainty of a given attribute: the higher the entropy, the higher the uncertainty of its values. (Please refer to the handouts distributed in class; a sketch of the entropy and information-gain computation appears after the complete rule list on the next slide.)
Step 4: Acquire the rules from the resulting tree:
Rule 1: If (interest-rates = high) (fund-type = blue-chip) Then: (fund-value = medium)
Rule 2: If (interest-rates = high) (fund-type = gold-stocks) Then: (fund-value = high)
Rule 3: If (interest-rates = high) (fund-type = mortgage-related) Then: (fund-value = low)

13. Example (cont.)
Rule 4: If (interest-rates = medium) (fund-type = blue-chip) Then: (fund-value = low)
Rule 5: If (interest-rates = medium) (fund-type = gold-stocks) Then: (fund-value = medium)
Rule 6: If (interest-rates = medium) (fund-type = mortgage-related) Then: (fund-value = low)
Rule 7: If (interest-rates = low) (fund-type = blue-chip) Then: (fund-value = high)
Rule 8: If (interest-rates = low) (fund-type = gold-stocks) Then: (fund-value = medium)
Rule 9: If (interest-rates = low) (fund-type = mortgage-related) Then: (fund-value = high)
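The slides defer the entropy details to the class handouts; for completeness, the standard ID3 definitions (an assumption, since the handouts are not reproduced here) are

H(S) = -\sum_{c} p_c \log_2 p_c, \qquad \mathrm{Gain}(S, A) = H(S) - \sum_{v \in \mathrm{Values}(A)} \frac{|S_v|}{|S|}\, H(S_v)

A Python sketch applying them to the nine cases of slide 11:

from collections import Counter
from math import log2

# The nine historical cases, as
# (fund type, interest rates, cash available, tension, fund value) tuples.
CASES = [
    ("blue-chip", "high",   "high", "medium", "medium"),
    ("blue-chip", "low",    "high", "medium", "high"),
    ("blue-chip", "medium", "low",  "high",   "low"),
    ("gold",      "high",   "high", "medium", "high"),
    ("gold",      "low",    "high", "medium", "medium"),
    ("gold",      "medium", "low",  "high",   "medium"),
    ("mortgage",  "high",   "high", "medium", "low"),
    ("mortgage",  "low",    "high", "medium", "high"),
    ("mortgage",  "medium", "low",  "high",   "low"),
]
ATTRS = {"fund-type": 0, "interest-rates": 1, "cash": 2, "tension": 3}
CLASS = 4  # fund value is the class to be predicted

def entropy(rows):
    """H(S): uncertainty of the class values over a set of rows."""
    counts = Counter(row[CLASS] for row in rows)
    return -sum(n / len(rows) * log2(n / len(rows)) for n in counts.values())

def gain(rows, attr):
    """Information gain: entropy removed by splitting on `attr`."""
    g = entropy(rows)
    for value in {row[attr] for row in rows}:
        subset = [row for row in rows if row[attr] == value]
        g -= len(subset) / len(rows) * entropy(subset)
    return g

for name, index in ATTRS.items():
    print(f"{name}: gain = {gain(CASES, index):.3f}")
# interest-rates and fund-type tie at about 0.444, so interest rates
# is a legitimate choice for the root test, as in the rule base above.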

14. Problems with rule induction methods based on test cases
1. The quality of the rule base depends on the quality of the test cases. Note that the number of test cases is not a criterion, because some of them may describe similar situations.
2. There may be conflicts between test cases, which means that additional attributes may need to be considered.
3. For large domains, this approach results in huge trees, and thus in inefficient rule bases.
4. This approach develops "flat" rules, i.e. each rule results in a final conclusion.
