Ratbert: Nearest Sequence Memory Based Prediction Model Applied to Robot Navigation, by Sergey Alexandrov, iCML 2003
Defining the Problem
• Choosing a navigational action (simplified world: left, right, forward, back)
• The consequence of an action (the expected observation) is unknown given only the immediate state
• How can the robot learn an unknown environment well enough to predict such consequences accurately?

Approaches
• Learning the entire model (POMDP)
  – for example, Baum-Welch (problem: slow)
• Goal-finding tasks
  – learning a path to a specific state (reinforcement learning problem)
  – for example, NSM (Nearest Sequence Memory)
• Generalized observation prediction
  – NSMP (Nearest Sequence Memory Predictor)
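NSMP in Short
• Experience: Seq_n = {(o_1, a_1), …, (o_n, a_n)}
• NSMP(Seq_n) = the observation predicted to follow from executing a_n
• Derived by examining the k nearest matches (NNS)
[Figure: example with k = 4. Labels: a_i, o_i, o_{i+1}; candidate observations o_1, o_2, o_3, o_2; the observation to be predicted is marked "?".]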
NSMP in Short (Cont.)
• Based on kNN applied to sequences of previous experience (NSM)
• Find the k nearest (here: longest) sequence matches to the immediately prior experience
• Calculate a weight for each observation reached by the k sequence sections (a tradeoff between long matches and a high frequency of matches)
• Probability of each observation = normalized weight
• The predicted observation is the one with the highest probability
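The steps above can be sketched in Python. This is a minimal illustration, not the author's implementation: the slides do not give the weighting formula, so the sketch assumes that each observation's weight is the sum of the lengths of the matches that reach it, which rewards both long and frequent matches. The names `match_length` and `nsmp_predict` are hypothetical.

```python
from collections import defaultdict


def match_length(seq, i, j):
    """Count consecutive (observation, action) pairs that agree when
    walking backwards from positions i and j (with i < j)."""
    length = 0
    while i - length >= 0 and seq[i - length] == seq[j - length]:
        length += 1
    return length


def nsmp_predict(seq, k=4):
    """Predict the observation that follows the last (observation, action)
    pair in seq. Returns (best_observation, probabilities)."""
    last = len(seq) - 1
    candidates = []
    for i in range(last):
        length = match_length(seq, i, last)
        if length > 0:
            # seq[i + 1][0] is the observation that followed the match.
            candidates.append((length, seq[i + 1][0]))

    # "Nearest" here means longest match; keep the k longest.
    candidates.sort(key=lambda c: c[0], reverse=True)

    # ASSUMPTION: weight = sum of match lengths per reached observation.
    weights = defaultdict(float)
    for length, next_obs in candidates[:k]:
        weights[next_obs] += length

    if not weights:
        return None, {}
    total = sum(weights.values())
    probs = {obs: w / total for obs, w in weights.items()}
    return max(probs, key=probs.get), probs


history = [
    ("wall", "left"), ("open", "forward"), ("open", "forward"),
    ("wall", "left"), ("open", "forward"),
]
best, probs = nsmp_predict(history, k=4)
print(best, probs)
```

In this toy history the current pair ("open", "forward") matches two earlier positions: one with a match of length 2 that led to "open", and one of length 1 that led to "wall", so "open" receives the larger normalized weight.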
Testing
• Ratbert: a Lego-based robot capable of simple navigation inside a small maze. It senses walls to the front, left, and right, and takes a noisy distance reading.
• Software simulation based on Ratbert’s sensor inputs (larger environment, more runs, longer sequences)
• Actions: {left, right, forward, back}
• Observations: {left, right, front, distance}
• In both trials (robot and simulation), a training sequence was collected via random exploration; a testing sequence was then executed, and each predicted observation was compared with the actual observation. In both, k was set to 4.
• Results were compared to bigrams.
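The evaluation procedure described above can be sketched as follows. The slides do not define the bigram baseline, so this sketch assumes it predicts the most frequent successor of the current (observation, action) pair seen in training. The function names and the hand-written toy sequences are hypothetical and are not the data from the experiments.

```python
from collections import Counter, defaultdict


def train_bigram(seq):
    """Count which observation followed each (observation, action) pair."""
    counts = defaultdict(Counter)
    for (obs, act), (next_obs, _) in zip(seq, seq[1:]):
        counts[(obs, act)][next_obs] += 1
    return counts


def bigram_predict(counts, obs, act):
    """ASSUMED baseline: most frequent successor, or None if unseen."""
    followers = counts.get((obs, act))
    return followers.most_common(1)[0][0] if followers else None


def prediction_rate(predict, test_seq):
    """Fraction of test steps where the prediction equals the actual
    next observation."""
    trials = len(test_seq) - 1
    if trials <= 0:
        return 0.0
    hits = 0
    for (obs, act), (actual, _) in zip(test_seq, test_seq[1:]):
        if predict(obs, act) == actual:
            hits += 1
    return hits / trials


train = [
    ("open", "forward"), ("open", "forward"), ("open", "forward"),
    ("wall", "left"), ("open", "forward"),
]
test = [
    ("open", "forward"), ("open", "forward"),
    ("wall", "left"), ("open", "forward"),
]
counts = train_bigram(train)
rate = prediction_rate(lambda o, a: bigram_predict(counts, o, a), test)
print(f"bigram prediction rate: {rate:.2f}")
```

Any other predictor with the same call shape could be passed to `prediction_rate`, so that both models are scored on the same test sequence.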
Results
• Plot: prediction rate vs. training sequence length.
• The first graph is for Ratbert; the second is for the software simulation.
• NSMP consistently produced a better prediction rate than bigrams, although not an optimal one.
Further Work
• Compare against other probabilistic predictive models
• Determine the optimal exploration method
• Examine situations that trip up the algorithm
• Go beyond “gridworld” concepts of left/right/forward/back to more realistic navigation
• Work on mapping real sensor data to the discrete classes required by instance-based algorithms such as NSM/NSMP (for example, using single-linkage hierarchical clustering, merging until the cluster distance exceeds the sensor error)
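The clustering idea in the last bullet can be illustrated for the simplest case. The sketch below assumes one-dimensional sensor readings (for example, distances), where cutting a single-linkage hierarchy at a threshold is equivalent to sorting the readings and splitting wherever the gap between neighbours exceeds that threshold. The function names, the readings, and the sensor error of 1.0 are all hypothetical.

```python
def cluster_readings(readings, sensor_error):
    """Group 1-D readings so that neighbouring values no more than
    sensor_error apart share a cluster (single linkage cut at
    sensor_error)."""
    ordered = sorted(readings)
    if not ordered:
        return []
    clusters = [[ordered[0]]]
    for value in ordered[1:]:
        if value - clusters[-1][-1] <= sensor_error:
            clusters[-1].append(value)   # gap small enough: same class
        else:
            clusters.append([value])     # gap too large: new class
    return clusters


def discretize(value, clusters):
    """Map a new reading to the index of the cluster whose closest
    member is nearest to it."""
    return min(
        range(len(clusters)),
        key=lambda c: min(abs(value - m) for m in clusters[c]),
    )


readings = [10.2, 9.8, 30.1, 10.0, 29.7, 55.0]
clusters = cluster_readings(readings, sensor_error=1.0)
print(clusters)
print(discretize(30.5, clusters))
```

The resulting cluster indices could then serve as the discrete observation symbols that an instance-based predictor expects.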