Learning Modal Continuous Models


Presentation Transcript


  1. Learning Modal Continuous Models Joseph Xu Soar Workshop 2012

  2. Setting: Continuous Environment
  • Input to the agent is a set of objects with continuous properties: position, rotation, scaling, ...
  • Output is a fixed-length vector of continuous numbers
  • Agent runs in lock-step with the environment
  • Fully observable
  [Figure: environment/agent loop; the input lists objects A and B with continuous position (px, py, pz) and rotation (rx, ry, rz) values, and the output is a numeric vector.]

  3. Levels of Problem Solving
  Problem Solving Method               Knowledge Required
  Symbolic Planning                    Symbolic Model
  Symbolic Model-Free Methods (RL)     Symbolic Abstraction
  Continuous Sampling Methods (RRT)    Continuous Model
  Motor Babbling, Goal Recognition     None
  Characteristics range from faster task completion and more general solutions at the top of the table to slower task completion and more specific solutions at the bottom.

  4. Continuous Model Learning
  • Learn a function y = f(x, u)
  • x: current continuous state vector
  • u: current output vector
  • y: state vector in the next time step
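  For concreteness, a minimal sketch of how the (x, u, y) training triples described here could be collected by motor babbling; the env.observe() / env.step(u) interface and the output_dim attribute are assumptions for illustration, not part of the talk.

```python
import numpy as np

def collect_transitions(env, n_steps, rng=np.random.default_rng(0)):
    """Collect (x, u, y) triples by issuing random outputs (motor babbling)."""
    data = []
    x = env.observe()                                # current continuous state vector x
    for _ in range(n_steps):
        u = rng.uniform(-1.0, 1.0, env.output_dim)   # random fixed-length output vector u
        env.step(u)                                  # environment advances in lock-step
        y = env.observe()                            # state vector in the next time step
        data.append((x, u, y))
        x = y
    return data
```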

  5. Locally Weighted Regression
  • Given a query (x, u), e.g. a motor command (left voltage: -0.6, right voltage: 1.2), find the k nearest neighbors among stored examples and fit a weighted linear regression to predict the next state
  [Figure: a query point, its k nearest neighbors, and the weighted linear regression fit to them.]
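  A generic locally weighted regression sketch of the kind this slide describes, using Euclidean k-nearest neighbors and a weighted linear fit; the Gaussian weighting and array layout are my assumptions, not the talk's exact code.

```python
import numpy as np

def lwr_predict(X, U, Y, x_q, u_q, k=10):
    """Predict the next state for query (x_q, u_q) from stored examples.

    X, U, Y: arrays of past states, outputs, and next states (one row each).
    """
    Q = np.hstack([X, U])                         # training inputs [x, u]
    q = np.concatenate([x_q, u_q])                # query input
    d = np.linalg.norm(Q - q, axis=1)             # Euclidean distances to the query
    nn = np.argsort(d)[:k]                        # k nearest neighbors
    w = np.sqrt(np.exp(-d[nn] ** 2))              # Gaussian-style weights (sqrt for weighted LS)
    A = np.hstack([Q[nn], np.ones((k, 1))])       # design matrix with bias column
    B, *_ = np.linalg.lstsq(A * w[:, None], Y[nn] * w[:, None], rcond=None)
    return np.append(q, 1.0) @ B                  # weighted linear prediction of y
```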

  6. Problems with LWR
  • Euclidean distance doesn't capture relational similarity
  • Averages over neighbors exhibiting different types of interactions
  [Figure: a query scene and its nearest neighbors, which involve different kinds of object interactions.]

  7. Problems with LWR
  • Euclidean distance doesn't capture relational similarity
  • Averages over neighbors exhibiting different types of interactions
  [Figure: the resulting prediction is an average over neighbors from different interaction types.]

  8. Modal Models
  • Object behavior can be categorized into different modes
  • Behavior within a single mode is usually simple and smooth (inertia, gravity, etc.)
  • Behavior across modes can be discontinuous and complex (collisions, drops)
  • Modes can often be distinguished by discrete spatial relationships between objects
  • Learn two-level models composed of:
    • A classifier that determines the active mode using spatial relationships
    • A set of linear functions (initial hypothesis), one for each mode
  [Figure: the scene is fed to a mode classifier, which selects among per-mode models (mode 1, mode 2, mode 3) to produce the prediction.]
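  A minimal sketch of the two-level model described on this slide, assuming the classifier maps a vector of spatial relations to a mode id and each mode's model is a linear map stored as a matrix; the names and shapes are illustrative, not from the talk.

```python
import numpy as np

class ModalModel:
    """Two-level model: mode classifier + one linear function per mode."""

    def __init__(self, mode_classifier, mode_models):
        self.mode_classifier = mode_classifier   # relations -> mode id
        self.mode_models = mode_models           # mode id -> matrix of shape (dim(x)+dim(u)+1, dim(y))

    def predict(self, relations, x, u):
        mode = self.mode_classifier(relations)        # which mode is active in this scene
        B = self.mode_models[mode]                    # the linear model for that mode
        return np.concatenate([x, u, [1.0]]) @ B      # predicted next state
```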

  9. Unsupervised Learning of Modes From Data
  • Collect training data from the environment: each example pairs the continuous features (e.g. 0.5, 1.1, -0.2, 4, 17) with the observed next value (e.g. 21.9)
  • Run Expectation Maximization over the training data to separate it into modes
  [Figure: a trajectory over time segmented into Mode 1 and Mode 2, shown against the learned mode 1 and learned mode 2 models.]

  10. Expectation Maximization
  • Expectation: assuming your current model parameters are correct, what is the likelihood that model m generated data point i?
  • Maximization: assuming each data point was generated by the most probable model, modify each model's parameters to maximize the likelihood of generating the data
  • Iterate until convergence to a local maximum
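  A sketch of the hard-assignment variant suggested by the maximization step above ("each data point was generated by the most probable model"), fitting one linear model per mode; the least-squares error criterion and the initialization are my assumptions.

```python
import numpy as np

def hard_em_linear_modes(A, Y, n_modes=2, n_iters=50, rng=np.random.default_rng(0)):
    """A: rows [x, u, 1]; Y: next states. Returns mode assignments and models."""
    assign = rng.integers(n_modes, size=len(A))          # random initial mode labels
    models = [None] * n_modes
    for _ in range(n_iters):
        # Maximization: refit each mode's linear model on the points assigned to it
        for m in range(n_modes):
            idx = assign == m
            if idx.sum() >= A.shape[1]:
                models[m], *_ = np.linalg.lstsq(A[idx], Y[idx], rcond=None)
            else:                                        # degenerate mode: random model
                models[m] = rng.normal(size=(A.shape[1], Y.shape[1]))
        # Expectation (hard): reassign each point to the mode that predicts it best
        errs = np.stack([np.linalg.norm(Y - A @ B, axis=1) for B in models])
        new_assign = errs.argmin(axis=0)
        if np.array_equal(new_assign, assign):           # converged to a local optimum
            break
        assign = new_assign
    return assign, models
```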

  11. Learning Classifier
  • For each training scene, compute discrete spatial relations between objects, e.g. left-of(A,B) = 1, right-of(A,B) = 0, on-top(A,B) = 0, touch(A,B) = 0
  • The spatial relations form a binary attribute vector for the example; the mode assigned by Expectation Maximization (learned mode 1 or learned mode 2) is its class label
  • The result is a supervised training set: attribute vectors labeled with mode 1 or mode 2
  [Figure: the continuous training data over time is converted into rows of binary attributes with a class column of mode labels.]

  12. Learning Classifier
  • Learn a classifier (a decision tree) over the binary spatial attributes that predicts the class, i.e. the active mode
  • Example tree from the slide: the root tests touch(A, B), an inner node tests left-of(A, B), and the leaves predict mode 1 or mode 2
  • Use the linear model fit to the items in the same mode
  [Figure: training data table of attribute vectors and mode labels next to the learned decision tree.]
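  A sketch of this classifier-learning step using a standard decision tree; the toy attribute vectors and mode labels below are made up for illustration (the talk does not give its exact training data), with the four attributes standing for left-of, right-of, on-top, and touch.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Rows: binary spatial-relation vectors for training scenes.
# Labels: the mode that EM assigned to each scene.
relations = np.array([
    [1, 0, 0, 0],   # left-of=1, right-of=0, on-top=0, touch=0  -> mode 1
    [1, 0, 0, 1],   # touching                                   -> mode 2
    [0, 1, 0, 1],   # touching                                   -> mode 2
    [0, 1, 0, 0],   # not touching                               -> mode 1
])
modes = np.array([1, 2, 2, 1])

tree = DecisionTreeClassifier().fit(relations, modes)
print(tree.predict([[1, 0, 0, 1]]))   # -> [2], the predicted active mode for a new scene
```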

  13. Prediction Accuracy Experiment
  • 2-block environment
    • Agent has two outputs (dx, dy) which control the x and y offsets of the controlled block at every time step
    • The pushed block can't be moved except by pushing it with the controlled block
    • Blocks are always axis-aligned; there's no momentum
  • Training
    • Instantiate the Soar agent in a variety of spatial configurations
    • Run 10 time steps; each step is a training example
  • Testing
    • Instantiate the Soar agent in some configuration
    • Check accuracy of the prediction for the next time step
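  A sketch of the 2-block dynamics described above (the controlled block moves by (dx, dy); the pushed block moves only when the controlled block pushes into it; axis-aligned, no momentum); the block size and contact-resolution details are assumptions, not taken from the talk.

```python
import numpy as np

def step(controlled, pushed, dx, dy, size=1.0):
    """Advance one time step; blocks are axis-aligned squares of side `size`."""
    controlled = controlled + np.array([dx, dy])     # controlled block follows the output
    diff = pushed - controlled
    pen = size - np.abs(diff)                        # per-axis penetration depth
    if (pen > 0).all():                              # blocks overlap: a push occurred
        axis = int(np.argmin(pen))                   # resolve along the least-penetrated axis
        pushed = pushed.copy()
        pushed[axis] += np.sign(diff[axis]) * pen[axis]
    return controlled, pushed                        # no momentum: nothing else moves
```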

  14. Prediction Accuracy – Pushed Block

  15. Classification Performance

  16. Prediction Performance Without Classification Errors

  17. Levels of Problem Solving
  Problem Solving Method               Knowledge Required
  Symbolic Planning                    Symbolic Model
  Symbolic Model-Free Methods (RL)     Symbolic Abstraction
  Continuous Sampling Methods (RRT)    Continuous Model
  Motor Babbling, Goal Recognition     None
  Characteristics range from faster task completion and more general solutions at the top of the table to slower task completion and more specific solutions at the bottom.

  18. Symbolic Abstraction
  • Lump continuous states sharing symbolic properties into a single symbolic state
  • Should be predictable
    • Planning requires an accurate model (e.g. STRIPS operators)
    • Tends to require more states, more symbolic properties
  • Should be general
    • Fast planning and transferable solutions
    • Tends to require fewer states, fewer symbolic properties
  [Figure: many continuous configurations of blocks C1 and C2 grouped into two symbolic states, S1: intersect(C1, C2) and S2: ~intersect(C1, C2).]
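  For illustration, one possible definition of the intersect predicate used in the figure, written as an axis-aligned box-overlap test; the talk does not give its exact predicate definitions, so this is an assumption.

```python
def intersect(a_min, a_max, b_min, b_max):
    """True if two axis-aligned boxes overlap; corners are (x, y) pairs."""
    return all(a_min[i] <= b_max[i] and b_min[i] <= a_max[i] for i in range(2))

# The abstract state is the conjunction of such predicates, e.g.
# S1: intersect(C1, C2), S2: ~intersect(C1, C2).
```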

  19. Symbolic Abstraction
  • Hypothesis: contiguous regions of continuous space that share a single behavioral mode are good abstract states
    • Planning within modes is simple because of linear behavior
    • Combinatorial search occurs at the symbolic level
  • The spatial predicates used in the continuous model's decision tree are a reasonable approximation

  20. Abstraction Experiment
  • 3 blocks; the goal is to push c2 to t
  • Demonstrate a solution trace to the agent
  • Agent stores the sequence of abstract states in the solution in epmem
  • Agent tries to follow the plan in an analogous task
  • The abstraction should include predicates about c1, c2, t and avoid predicates about d1, d2, d3
  [Figure: demonstration and transfer configurations of blocks c1 and c2, distractors d1, d2, d3, and target region t.]

  21. Generalization Performance 80 Tasks Total (16 average)

  22. Conclusions
  • For continuous environments with interacting objects, modal models are more general and accurate than a single uniform model
  • The relationships that distinguish between modes serve as a useful symbolic abstraction over continuous state
  • This work takes Soar toward being able to autonomously learn and improve behavior in continuous environments

  23. Evaluation
  Nuggets
  • Modal model learning is more accurate and general than uniform models
  • Abstraction learning results are promising, but preliminary
  Coal
  • Scaling issues: linear regression is exponential in the number of objects
  • Linear modes are insufficient for more complex physics such as bouncing -> catastrophic failure
