Explore efficient, automated learning methods for visual navigation: model-based versus model-free methods, the learning and performance phases, automatic feature derivation, principal component analysis, and Markov-style state models. Learn about incremental learning, local attention, and a comparison with neural networks.
Visual Learning with Navigation as an Example • Dr. Juyang Weng • Dr. Shaoyun Chen • Michigan State University
Model-Based Methods • PROS • Efficient for predictable cases • Easier to understand • Computationally inexpensive • CONS • Not generic • Not able to deal with every possible case • Potentially huge number of cases to enumerate exhaustively.
Model-Free Methods • Automatically learn the model • X_t: input image, a point in the rc-dimensional space S (an r × c image must be vectorized) • Y_{t+1}: control signal in the space C • GOAL: approximate the function f such that Y_{t+1} = f(X_t)
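To make the mapping from image to control signal concrete, here is a minimal sketch that vectorizes an r × c image and approximates f with a plain nearest-neighbor lookup. The nearest-neighbor rule, the function names, and the tiny 2 × 2 "images" with steering angles are illustrative assumptions, not the method or data from the talk.

```python
import numpy as np

def vectorize(image):
    """Flatten an r x c image into a vector of length r*c (row-major order)."""
    return np.asarray(image, dtype=float).reshape(-1)

def nearest_neighbor_f(train_X, train_Y, x):
    """Approximate f(x) by the control signal of the closest training image."""
    dists = np.linalg.norm(train_X - x, axis=1)
    return train_Y[int(np.argmin(dists))]

# Hypothetical 2x2 "images" paired with made-up steering angles.
images = [[[0, 0], [1, 1]],
          [[1, 1], [0, 0]]]
train_X = np.stack([vectorize(im) for im in images])  # shape (2, 4)
train_Y = np.array([-10.0, 10.0])

# A new image that resembles the first training image.
query = vectorize([[0.1, 0.0], [0.9, 1.0]])
y_hat = nearest_neighbor_f(train_X, train_Y, query)
```

A flat search like this compares the query against every stored image; the tree described next is a way to avoid that cost.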
Recursive Partition Tree • Each leaf node represents a sample (X, Y) • Each node represents a set of data points, with similarity increasing at deeper levels • One of the central ideas in SHOSLIF's approach • Given X, find f(X) at the corresponding leaf node after traversal.
Learning Phase • Building a Recursive Partition Tree (RPT) • Take the sample space S. • Divide the space into b cells, each a child of the root. • The analysis at each node performs automatic derivation of features (discussed later). • Repeat the division on each cell until every leaf node holds a single data point, or several data points with virtually the same Y.
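The recursion and its stopping rule can be sketched as follows. This is a simplified illustration with b = 2: the split used here (thresholding the highest-variance coordinate at its mean) is a stand-in assumption, whereas the approach described in the talk derives the split feature automatically. The dictionary layout, `build_rpt`, `count_leaves`, and the sample values are all hypothetical.

```python
import numpy as np

def build_rpt(X, Y, y_tol=1e-6):
    """Recursively partition samples until a node holds one sample
    or several samples whose Y values are virtually the same."""
    node = {"center": X.mean(axis=0), "y": float(Y.mean()), "children": []}
    if len(X) == 1 or (Y.max() - Y.min()) <= y_tol:
        return node  # leaf
    # Stand-in split (assumption): threshold the highest-variance coordinate
    # at its mean. A derived feature would replace this step.
    d = int(np.argmax(X.var(axis=0)))
    left = X[:, d] <= X[:, d].mean()
    if left.all() or not left.any():
        return node  # identical inputs cannot be separated further
    node["children"] = [build_rpt(X[left], Y[left], y_tol),
                        build_rpt(X[~left], Y[~left], y_tol)]
    return node

def count_leaves(node):
    """Count the leaf nodes below (and including) this node."""
    if not node["children"]:
        return 1
    return sum(count_leaves(c) for c in node["children"])

# Made-up samples: two with the same control signal, two with different ones.
X = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [1.1, 0.9]])
Y = np.array([-5.0, -5.0, 5.0, 4.0])
tree = build_rpt(X, Y)
```

The two samples that share Y = -5 end up together in one leaf, while the other two are split into separate leaves because their control signals differ.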
How to construct the RPT: Learning Phase • [Figure: samples numbered 1 to 9, shown in the partitioned sample space and as nodes of the tree]
Performance Phase • Input: X' • Output: control signal Y • Starting at the root, compare X' with the center of each child node • Proceed toward the child whose center is closest to the input, until a leaf node is reached • Use the control signal stored at that leaf • Explore the top k paths to find the k nearest centers.
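The traversal can be sketched as a level-by-level search that keeps the k nearest centers at each depth. The hand-built tree, the node layout, and the names `make_node` and `query_top_k` are assumptions made for illustration only.

```python
import math

def dist(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def make_node(center, y=None, children=()):
    return {"center": center, "y": y, "children": list(children)}

def query_top_k(root, x, k=2):
    """Descend level by level, keeping the k nodes whose centers are nearest
    to x; return up to k reached leaves, sorted by distance to x."""
    frontier, leaves = [root], []
    while frontier:
        candidates = []
        for node in frontier:
            if node["children"]:
                candidates.extend(node["children"])
            else:
                leaves.append(node)
        candidates.sort(key=lambda n: dist(n["center"], x))
        frontier = candidates[:k]
    leaves.sort(key=lambda n: dist(n["center"], x))
    return leaves[:k]

# Hypothetical two-level tree with made-up centers and control signals.
root = make_node((0.5, 0.5), children=[
    make_node((0.0, 0.0), children=[make_node((0.0, 0.0), y=-5.0),
                                    make_node((0.2, 0.0), y=-4.0)]),
    make_node((1.0, 1.0), children=[make_node((1.0, 1.0), y=5.0),
                                    make_node((0.6, 0.6), y=1.0)]),
])

x = (0.45, 0.4)
single_path = query_top_k(root, x, k=1)[0]["y"]
two_paths = query_top_k(root, x, k=2)[0]["y"]
```

In this example the single greedy path commits to the wrong branch at the first level and returns -4.0, while keeping two paths reaches the leaf that is actually nearest and returns 1.0.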
Automatic Feature Derivation • Feature selection: selects features from a set of human-defined features. • Feature extraction: extracts the selected features from images. • Feature derivation: derives features from high-dimensional vector inputs. • Principal component analysis is used to recursively partition the space S, working within the subspace S' in which the training samples lie.
PCA • Computes the principal component vectors V1, V2, V3, V4, …, VN • MEF: most expressive features • They explain the variation in the sample set • The hyperplane that has V1 as its normal and passes through the centroid of the samples forms a partition. • Samples on one side of the hyperplane go to one side of the tree; samples on the other side go to the other.
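The hyperplane partition can be sketched with a singular value decomposition of the centered samples. The function name `pca_split` and the sample values are assumptions; the sign of V1 returned by the decomposition is arbitrary, so which side is labeled "positive" may flip, but the grouping of samples does not.

```python
import numpy as np

def pca_split(X):
    """Split samples by the hyperplane through their centroid whose
    normal is the first principal component V1."""
    centroid = X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(X - centroid, full_matrices=False)
    v1 = Vt[0]
    # Signed distance to the hyperplane decides the side of each sample.
    side = (X - centroid) @ v1 > 0
    return centroid, v1, side

# Made-up 2-D samples lying roughly along a diagonal.
X = np.array([[0.0, 0.0], [1.0, 1.1], [2.0, 1.9], [3.0, 3.0]])
centroid, v1, side = pca_split(X)
```

Here the first two samples fall on one side of the hyperplane and the last two on the other, giving the two children of the node.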
PCA vs. LDA [1] • [Figure: side-by-side panels labeled PCA and LDA]
LDA (Linear Discriminant Analysis) • We can do better with class information. • MDF: most discriminating features • Similar to PCA • This method cuts more along the class boundaries. • Differences • MEF: samples spread out widely, and samples of different classes tend to mix together. • MDF: samples are clustered more tightly, and samples from different classes are farther apart.
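The MEF/MDF contrast can be illustrated with the textbook two-class Fisher discriminant, w ∝ Sw⁻¹(m1 − m0), compared against the first principal component of the pooled data. This is a generic sketch under assumed data, not the exact procedure from the talk; `fisher_direction`, `overlaps`, and the sample values are hypothetical.

```python
import numpy as np

def fisher_direction(X0, X1):
    """Two-class Fisher discriminant direction: w ~ Sw^{-1} (m1 - m0)."""
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # Within-class scatter: sum of each class's scatter about its own mean.
    Sw = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)
    w = np.linalg.solve(Sw, m1 - m0)
    return w / np.linalg.norm(w)

def overlaps(a, b):
    """True if the 1-D ranges of the two sets of projections intersect."""
    return a.min() <= b.max() and b.min() <= a.max()

# Made-up classes: wide spread along x, separated only along y.
X0 = np.array([[-3.0, 0.0], [-1.0, 0.2], [1.0, -0.2], [3.0, 0.0]])
X1 = X0 + np.array([0.0, 1.0])

w = fisher_direction(X0, X1)  # most discriminating direction (MDF)

pooled = np.vstack([X0, X1])
_, _, Vt = np.linalg.svd(pooled - pooled.mean(axis=0), full_matrices=False)
v1 = Vt[0]                    # most expressive direction (MEF)

mef_mixed = overlaps(X0 @ v1, X1 @ v1)
mdf_mixed = overlaps(X0 @ w, X1 @ w)
```

Projected onto the MEF direction the two classes mix, because the largest variation runs along x; projected onto the MDF direction they separate cleanly.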
Using States • Uses a model similar to a Markov chain • S_t: state at time t • At time t, the system is in state S_t and observes image X_t. • It issues control vector Y_{t+1} and enters the next state S_{t+1}: (S_{t+1}, Y_{t+1}) = f(S_t, X_t)
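A minimal sketch of the state-based mapping follows. It assumes f is approximated by a nearest-neighbor lookup restricted to stored samples with the same current state; the state names, image vectors, and steering values are invented for illustration.

```python
def step(samples, state, x):
    """Return (next_state, control) from the stored sample whose current
    state matches and whose image vector is nearest to x."""
    matching = [s for s in samples if s[0] == state]
    best = min(matching,
               key=lambda s: sum((a - b) ** 2 for a, b in zip(s[1], x)))
    return best[2], best[3]

# Each sample: (state, image vector, next state, control). All values made up.
samples = [
    ("straight", (0.0, 0.0), "straight", 0.0),
    ("straight", (1.0, 0.0), "pre_turn", 0.0),
    ("pre_turn", (1.0, 0.0), "straight", 15.0),
    ("pre_turn", (0.0, 1.0), "straight", -15.0),
]

# Run a short image sequence through the state machine.
state, controls = "straight", []
for x in [(0.1, 0.0), (0.9, 0.1), (0.9, 0.1)]:
    state, y = step(samples, state, x)
    controls.append(y)
```

The second and third images are identical, yet they produce different controls (0.0, then 15.0) because the system is in a different state when it sees them. A stateless mapping Y = f(X) could not make that distinction.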
Dealing with local attention • A special state A (ambiguous) indicates that local visual attention is needed. • E.g., the trainer defined this state for a segment right before a turn, where the image area that revealed the visual difference between turn types was mainly in a small part of the scene. • A directs the system to look at such landmarks through a prespecified image subwindow, so that the system issues the correct steering action before it is too late.
Incremental Learning • Batch learning: all the training data are available at the time the system learns. • Incremental learning: training samples are available only one at a time. • Each sample is discarded once it has been used • Memory needs to hold each image only once. • Similar images are discarded
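One way to picture incremental learning is a loop that sees one sample at a time and keeps it only if it adds something new. The redundancy test below (image within `x_tol` of a stored image and control within `y_tol`) is an assumption chosen for illustration; the talk only states that similar images are discarded.

```python
import math

def dist(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def learn_incrementally(stream, x_tol=0.1, y_tol=0.5):
    """Consume (image, control) pairs one at a time; store a pair only if no
    stored pair has both a similar image and a similar control signal."""
    memory = []
    for x, y in stream:
        redundant = any(dist(x, mx) <= x_tol and abs(y - my) <= y_tol
                        for mx, my in memory)
        if not redundant:
            memory.append((x, y))
        # The sample itself is not kept anywhere else once processed.
    return memory

def sample_stream():
    """Hypothetical stream of training samples, yielded one at a time."""
    yield (0.0, 0.0), 0.0
    yield (0.05, 0.0), 0.1    # near-duplicate of the first: dropped
    yield (1.0, 1.0), 5.0
    yield (0.02, 0.01), 3.0   # similar image, different control: kept

memory = learn_incrementally(sample_stream())
```

Four samples arrive but only three are stored, so memory grows with the variety of the data rather than with its volume.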
SHOSLIF versus other methods • Compared SHOSLIF with feed-forward neural networks and radial basis function networks for approximating stateless appearance-based navigation systems. • SHOSLIF did significantly better than both methods. • Extensions to face detection, speech recognition, and vision-based robot arm action learning.
Conclusion • SHOSLIF performs better in benign scenes. • The state-based method allows more flexibility. • However, many states must still be specified for different environment types.
References • 1. Juyang Weng and Shaoyun Chen, "Visual Learning with Navigation as an Example," IEEE Intelligent Systems, September/October 2000.