
Parameter Learning



Presentation Transcript


  1. Parameter Learning

  2. Announcements • Midterm 24th 7-9pm, NVIDIA • Midterm review in class next Tuesday • Extra study material for midterm (after class). • Homework back • Regrade process • Looking into reflex agent on pacman • Some changes to the schedule • Want to hear your song before class?

  3. Happy with how you did

  4. Pac Man Grades — CS221 Grade Book (16% of the class so far; a lot of class left). Score bins: > 23/20, = 20/20, ≥ 17/20, ≥ 15/20, ≥ 12/20, ≥ 9/20, ≥ 2/20

  5. How we see it — the same bins, labeled Yay / Good / Good / Good / Ok? / Talk / Talk

  6. Good job! [same chart]

  7. Alright [same chart]

  8. Rethink [same chart]

  9. Theory on Grades

  10. Common Error: formalizing a problem. Real World Problem → (model the problem) → Formal Problem → (apply an algorithm) → Solution → (evaluate)

  11. Modeling Discrete Search • States: what makes a state • Actions(s): possible actions from state s • Succ(s, a): states that could result from taking action a from state s • Reward(s, a): reward for taking action a from state s • s_start: starting state • IsEnd(s): whether to stop • Utility(s): the value of reaching a given stopping point
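
The bullet points above map directly onto a small Python interface. This is a sketch with an illustrative toy grid problem; the class and method names are not from the course code:

```python
# Minimal sketch of the discrete-search interface: a 3x3 grid where we
# walk from (0, 0) to (2, 2). All names and numbers are illustrative.
class GridSearchProblem:
    def start_state(self):
        return (0, 0)                       # s_start

    def is_end(self, s):
        return s == (2, 2)                  # IsEnd(s): whether to stop

    def actions(self, s):
        return ["right", "up"]              # Actions(s)

    def succ(self, s, a):
        x, y = s                            # Succ(s, a)
        return (x + 1, y) if a == "right" else (x, y + 1)

    def reward(self, s, a):
        return -1                           # uniform step cost as negative reward

    def utility(self, s):
        return 0                            # Utility(s) at a stopping point
```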

  12. Modeling Markov Decision Process • States: what makes a state • Actions(s): possible actions from state s • T(s, a, s′): probability distribution over states that could result from taking action a from state s • Reward(s, a): reward for taking action a from state s • s_start: starting state • IsEnd(s): whether to stop • Utility(s): the value of reaching a given stopping point
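
The only change from the search interface is that the successor function becomes a distribution. A sketch with an illustrative one-action chain (names and probabilities are made up for the example):

```python
# Toy MDP: states 0, 1, 2; action "go" advances with prob 0.8, stays with 0.2.
class ToyMDP:
    def start_state(self):
        return 0

    def is_end(self, s):
        return s == 2

    def actions(self, s):
        return ["go"]

    def transitions(self, s, a):
        # T(s, a, s'): list of (next_state, probability) pairs summing to 1
        return [(s + 1, 0.8), (s, 0.2)]

    def reward(self, s, a):
        return -1
```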

  13. Modeling Bayes Net Definition: Bayes Net = DAG + CPDs. DAG: directed acyclic graph (the BN's structure) • Nodes: random variables (typically discrete, but methods also exist to handle continuous variables) • Arcs: indicate probabilistic dependencies between nodes; they go from cause to effect. CPDs: conditional probability distributions (the BN's parameters) — the conditional probabilities at each node, usually stored as a table (conditional probability table, or CPT). Root nodes are a special case: with no parents, the CPD is just a prior over the variable.
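
One common way to store CPTs is as nested dictionaries keyed by parent values. A sketch for a hypothetical two-node net Rain → WetGrass (all probabilities are illustrative):

```python
# Root node has no parents, so its "CPT" is just a prior.
prior_rain = {True: 0.2, False: 0.8}

# P(WetGrass | Rain), keyed by the parent's value.
cpt_wet_given_rain = {
    True:  {True: 0.9, False: 0.1},   # P(Wet | Rain=True)
    False: {True: 0.1, False: 0.9},   # P(Wet | Rain=False)
}

# Joint probability via the chain rule over the DAG:
# P(Rain=True, Wet=True) = P(Rain=True) * P(Wet=True | Rain=True)
p_joint = prior_rain[True] * cpt_wet_given_rain[True][True]
```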

  14. Modeling Hidden Markov Model [chain diagram: X1 → X2 → X3 → X4 → X5, with each Xt emitting evidence Et] Formally: (1) State variables and their domains (2) Evidence variables and their domains (3) Probability of states at time 0 (4) Transition probability (5) Emission probability
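
The five ingredients can be written out as plain tables. A sketch for a tiny two-state weather HMM (all numbers illustrative, not from the slides):

```python
states = ["rain", "sun"]                 # (1) state variable domain
evidence = ["umbrella", "no_umbrella"]   # (2) evidence variable domain

p_init = {"rain": 0.5, "sun": 0.5}       # (3) P(X_0)

p_trans = {                              # (4) P(X_t | X_{t-1})
    "rain": {"rain": 0.7, "sun": 0.3},
    "sun":  {"rain": 0.3, "sun": 0.7},
}

p_emit = {                               # (5) P(E_t | X_t)
    "rain": {"umbrella": 0.9, "no_umbrella": 0.1},
    "sun":  {"umbrella": 0.2, "no_umbrella": 0.8},
}
```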

  15. Formally, we want to get our model inside the python

  16. Scary?

  17. Theory on Grades

  18. Previously on CS221 In Class Research

  19. Previously on CS221 In Class Research

  20. Previously on CS221

  21. Hidden Markov Model [chain diagram: X1 → X2 → X3 → X4 → X5, with each Xt emitting evidence Et] Formally: (1) State variables and their domains (2) Evidence variables and their domains (3) Probability of states at time 0 (4) Transition probability (5) Emission probability

  22. Filtering [diagram: belief over X1, updated by evidence E1, then projected forward to X2]
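
The exact filtering update behind this diagram combines an elapse-time step and an observe step, then renormalizes. A self-contained sketch using illustrative umbrella-world numbers:

```python
def filter_step(belief, p_trans, p_emit, observation):
    # Elapse time: B'(x') = sum_x P(x' | x) * B(x)
    predicted = {x2: sum(belief[x1] * p_trans[x1][x2] for x1 in belief)
                 for x2 in p_trans}
    # Observe: weight by P(e | x'), then renormalize
    weighted = {x: p_emit[x][observation] * p for x, p in predicted.items()}
    z = sum(weighted.values())
    return {x: w / z for x, w in weighted.items()}

p_trans = {"rain": {"rain": 0.7, "sun": 0.3},
           "sun":  {"rain": 0.3, "sun": 0.7}}
p_emit = {"rain": {"umbrella": 0.9, "no_umbrella": 0.1},
          "sun":  {"umbrella": 0.2, "no_umbrella": 0.8}}

belief = filter_step({"rain": 0.5, "sun": 0.5}, p_trans, p_emit, "umbrella")
```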

  23. Tracking Other Cars

  24. Track a Car! [HMM diagram: hidden positions Pos1 → Pos2, with observed distances Dist1, Dist2]

  25. Track a Robot! [HMM node Pos1 with observed Dist1; plot of probability density against the value of d]

  26. Track a Robot! [same plot; μ = true distance from x to your car]

  27. Track a Robot! [same plot; μ = true distance from x to your car, σ = Const.SONAR_STD]

  28. Track a Robot! [diagram: Pos1 → Pos2]
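
The sonar model on the previous slides says: the likelihood of a measured distance d is a Gaussian centered on the true distance from the hypothesized position to the car. A sketch, with SONAR_STD as an assumed constant standing in for Const.SONAR_STD:

```python
import math

SONAR_STD = 2.0   # assumed value; stands in for Const.SONAR_STD

def emission_weight(d_measured, true_distance, sigma=SONAR_STD):
    # Gaussian density N(d_measured; mu=true_distance, sigma)
    return (1.0 / (sigma * math.sqrt(2 * math.pi))) * \
        math.exp(-((d_measured - true_distance) ** 2) / (2 * sigma ** 2))
```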

  29. Particle Filters A particle is a hypothetical instantiation of a variable. Store a large number of particles. Elapse time by moving each particle given transition probabilities. When we get new evidence we weight each particle and create a new generation. The density of particles for any given value is an approximation of the probability that our variable equals that value
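
The loop described above can be sketched end-to-end: move each particle through the transition model, weight by the emission model, then resample a new generation. The function shape and models here are illustrative stand-ins, not course code:

```python
import random

def particle_filter_step(particles, transition_sample, emission_weight, evidence):
    # 1. Elapse time: move each particle by sampling its transition
    moved = [transition_sample(p) for p in particles]
    # 2. Observe: weight each particle by P(evidence | particle)
    weights = [emission_weight(evidence, p) for p in moved]
    # 3. Resample: draw N particles with replacement, proportional to weight
    return random.choices(moved, weights=weights, k=len(moved))
```

The density of the returned particles approximates the posterior belief: values with high weight get duplicated, values with low weight die out.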

  30. Particle Filtering [grid of belief values over positions, e.g. 0.0, 0.1, 0.2, 0.5] Sometimes |X| is too big to use exact inference • |X| may be too big to even store B(X) • E.g. X is continuous • E.g. X is a real-world map Solution: approximate inference • Track samples of X, not all values • Samples are called particles • Time per step is linear in the number of samples • But: number needed may be large • In memory: list of particles, not states This is how robot localization works in practice

  31. Elapse Time Each particle is moved by sampling its next position from the transition model • Reflect the transition probs • Here, most samples move clockwise, but some move in another direction or stay in place This captures the passage of time • If we have enough samples, close to the exact values before and after (consistent)
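
The elapse-time step in isolation: each particle's next position is sampled from the transition model. A sketch with a made-up clockwise-biased four-cell loop, echoing the slide's "most samples move clockwise":

```python
import random

def sample_transition(pos, p_clockwise=0.8):
    # Illustrative transition model: mostly move clockwise around a
    # 2x2 loop of cells; otherwise stay in place.
    clockwise = {(0, 0): (0, 1), (0, 1): (1, 1), (1, 1): (1, 0), (1, 0): (0, 0)}
    return clockwise[pos] if random.random() < p_clockwise else pos

# Elapse time for a whole particle population:
particles = [(0, 0)] * 100
particles = [sample_transition(p) for p in particles]
```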

  32. Observe Step Slightly trickier: • We downweight our samples based on the evidence • Note that, as before, the probabilities don’t sum to one, since most have been downweighted (in fact they sum to an approximation of P(e))
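
The observe step in isolation: attach a weight to each particle, with no normalization. As the slide notes, the weights need not sum to one; their total approximates (a scaled version of) P(e). Emission table and state names below are illustrative:

```python
def observe(particles, p_evidence_given_x, evidence):
    # Downweight each particle by the likelihood of the evidence
    return [(p, p_evidence_given_x[p][evidence]) for p in particles]

p_e = {"near": {"beep": 0.9, "silent": 0.1},
       "far":  {"beep": 0.2, "silent": 0.8}}

weighted = observe(["near", "near", "far"], p_e, "beep")
total = sum(w for _, w in weighted)   # N times an estimate of P(beep)
```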

  33. Resample Rather than tracking weighted samples, we resample: N times, we choose from our weighted sample distribution (i.e. draw with replacement). This is analogous to renormalizing the distribution. Now the update is complete for this time step; continue with the next one. Old particles: (3,3) w=0.1, (2,1) w=0.9, (2,1) w=0.9, (3,1) w=0.4, (3,2) w=0.3, (2,2) w=0.4, (1,1) w=0.4, (3,1) w=0.4, (2,1) w=0.9, (3,2) w=0.3. New particles: (2,1) w=1, (2,1) w=1, (2,1) w=1, (3,2) w=1, (2,2) w=1, (2,1) w=1, (1,1) w=1, (3,1) w=1, (2,1) w=1, (1,1) w=1
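
Drawing N times with replacement, proportional to weight, is one line in Python. A sketch using a few of the slide's example particles:

```python
import random

# Weighted particles from the observe step (positions and weights from
# the slide's example).
old = [((3, 3), 0.1), ((2, 1), 0.9), ((2, 1), 0.9),
       ((3, 1), 0.4), ((3, 2), 0.3), ((2, 2), 0.4)]

positions = [p for p, _ in old]
weights = [w for _, w in old]

# Resample with replacement; every new particle implicitly has weight 1.
new_particles = random.choices(positions, weights=weights, k=len(old))
```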

  34. Track a Robot! [HMM diagram: Pos1 → Pos2, with observed Walls1, Walls2] Sometimes sensors are wrong. Sometimes motors don't work.

  35. Transition Prob Start

  36. Emission Prob Laser sensor Sense walls

  37. Original Particles

  38. Observation

  39. Reweight…

  40. Resample + Pass Time

  41. Observation

  42. Reweight…

  43. Resample

  44. Pass Time

  45. Observation

  46. Reweight…

  47. Resample

  48. Pass Time

  49. Observation

  50. Reweight…
