
Learning Reactive Behavior in Autonomous Vehicles: SAMUEL

Sanaa Kamari. SAMUEL is a computer system that learns reactive behavior for autonomous vehicles (AVs). Reactive behavior is the set of actions an AV takes in reaction to its sensor readings.


Presentation Transcript


  1. Learning Reactive Behavior in Autonomous Vehicles: SAMUEL • Sanaa Kamari

  2. SAMUEL • A computer system that learns reactive behavior for autonomous vehicles (AVs). • Reactive behavior is the set of actions an AV takes in reaction to its sensor readings. • Uses a genetic algorithm (GA) to improve decision-making rules. • Each individual in SAMUEL's population is an entire rule set, or strategy.

  3. Motivation for SAMUEL • Learning eases the extraction of rules from an expert. • Rules are context-based, so it is impossible to account for every situation by hand. • Given a set of conditions, the system can learn the rules of operation by observing and recording its own actions. • SAMUEL learns in a simulation environment.

  4. SAMUEL • Problem-specific module: • The world model and its interface. • A set of internal and external sensors. • Controllers that drive the AV simulator. • A critic component that scores the success or failure of the AV. [1]

  5. SAMUEL (cont.) • Performance module: • Matches the rules. • Performs conflict resolution. • Assigns strength values to the rules. • Learning module: • Uses a GA to develop reactive behavior as a set of condition-reaction rules. • The GA searches for the behavior that exhibits the best performance. • Behaviors are evaluated in the world model. • Behaviors are selected for duplication and modification. [1]
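
The learning loop described on slides 2 and 5 can be sketched roughly as below. This is a minimal illustration in Python, not SAMUEL's actual implementation. The rule encoding, the toy `evaluate` function (a stand-in for running a strategy in the simulated world model and reading the critic's score), the `TARGET_TURN` constant, and the truncation-selection scheme are all assumptions made for the example.

```python
import random

TARGET_TURN = -24.0  # hypothetical value, used only by the toy fitness below


def random_strategy(rng, n_rules=4):
    """An individual is a whole rule set: a list of (sonar_threshold, turn) rules."""
    return [(rng.uniform(0, 100), rng.uniform(-45, 45)) for _ in range(n_rules)]


def evaluate(strategy):
    """Toy stand-in for the critic: higher is better, 0 is the best possible score."""
    return -sum(abs(turn - TARGET_TURN) for _, turn in strategy)


def mutate(strategy, rng, sigma=5.0):
    """Modification: perturb the turn value of one randomly chosen rule."""
    child = list(strategy)
    i = rng.randrange(len(child))
    threshold, turn = child[i]
    child[i] = (threshold, turn + rng.gauss(0, sigma))
    return child


def crossover(a, b, rng):
    """One-point crossover at a rule boundary, so individual rules stay intact."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def evolve(pop_size=20, generations=30, seed=0):
    rng = random.Random(seed)
    population = [random_strategy(rng) for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=evaluate, reverse=True)
        parents = ranked[: pop_size // 2]  # selected for duplication
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children  # parents are kept unchanged
    return max(population, key=evaluate)
```

Because the parents are carried over unchanged, the best score in the population never decreases from one generation to the next. The point the sketch shares with the slides is that selection acts on complete strategies rather than on single rules.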

  6. Experiment Domain: Autonomous Underwater Vehicle (AUV) navigation and collision avoidance • The AUV simulator is trained by virtually positioning it in the center of a field with 25 mines, with an objective outside the field. • The 2D AUV must navigate through a dense mine field toward a stationary object. • AUV actions: set speed and direction at each decision cycle. • The system does not learn a path, but a set of rules that reactively decide a move at each step.

  7. Experiment Results • Performance improved greatly with both static and moving mines. • SAMUEL shows that reactive behavior can be learned. [1]

  8. Domain: Robot continuous and embedded learning • Goal: create autonomous systems that continue to learn throughout their lives. • Adapt a robot’s behavior in response to changes in its operating environment and capabilities. • Experiment: a robot learns to adapt to failures in its sonar sensors.

  9. Continuous and Embedded Learning Model • Execution module: controls the robot’s interaction with its environment. • Learning module: continuously tests new strategies for the robot against a simulation model of its environment. [2]

  10. Execution Module • Includes a rule-based system that operates on reactive (stimulus-response) rules. • Example: IF range = [35, 45] AND front sonar < 20 AND right sonar > 50 THEN SET turn = -24 (Strength 0.8) • Monitor: identifies symptoms of sonar failure. • Measures the output of each sonar and compares it with recent readings and the direction of motion. • Modifies the simulation used by the learning system to replicate the failure.
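
As a rough illustration of how a stimulus-response rule like the one on slide 10 might be matched against sensor readings, here is a minimal Python sketch. The `Rule` class, the sensor names, and the second (fallback) rule are hypothetical. Picking the strongest matching rule is a simplification assumed for the example; the slides do not specify SAMUEL's conflict-resolution procedure.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    conditions: dict  # sensor name -> predicate over that sensor's reading
    action: tuple     # (actuator, value)
    strength: float

    def matches(self, readings):
        """A rule fires only if every one of its conditions holds."""
        return all(test(readings[name]) for name, test in self.conditions.items())


RULES = [
    # The rule from slide 10.
    Rule(
        conditions={
            "range": lambda v: 35 <= v <= 45,
            "front_sonar": lambda v: v < 20,
            "right_sonar": lambda v: v > 50,
        },
        action=("turn", -24),
        strength=0.8,
    ),
    # Hypothetical weaker rule, added only so there is a conflict to resolve.
    Rule(
        conditions={"front_sonar": lambda v: v < 20},
        action=("turn", 10),
        strength=0.5,
    ),
]


def select_action(rules, readings):
    """Match all rules, then resolve conflicts by taking the strongest match."""
    matched = [r for r in rules if r.matches(readings)]
    if not matched:
        return None
    return max(matched, key=lambda r: r.strength).action


print(select_action(RULES, {"range": 40, "front_sonar": 15, "right_sonar": 60}))
# prints ('turn', -24): both rules match and the stronger one wins
```

With `right_sonar` at 30 only the fallback rule matches, so the result is `('turn', 10)`. With `front_sonar` at 50 nothing matches and the function returns `None`.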

  11. Learning Module • Uses SAMUEL, which applies a genetic algorithm to improve decision-making rules.

  12. Experiment • The task requires the robot to go from one side of a room to the other through an opening. • The robot is placed randomly 4 ft from the back wall. • The location of the opening is random. • The center of the front wall is 12.5 ft from the back wall.

  13. Experiment (cont.) • The robot begins with a set of default rules for moving toward the goal. • Learning starts with a simulation in which all sonars are working. • After an initial period, one or more sonars are blinded. • The monitor detects the failed sonars, and the learning simulation is adjusted to reflect the failure. • The population of competing strategies is re-initialized and learning continues. • The online robot uses the best rules discovered by the learning system since the last change to the learning simulation model.
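
The adapt-on-failure cycle on slide 13 can be sketched as follows. This is a toy Python model under stated assumptions, not the system from [2]: the `LearningModule` class, its one-weight-per-sonar strategy encoding, and the `score` function are all invented for illustration. Only the control flow (adjust the simulation, re-initialize the population, track the best strategy since the last model change) follows the slide.

```python
import random


class LearningModule:
    """Toy learner; a strategy is a hypothetical dict of one weight per sonar."""

    def __init__(self, sonars, pop_size=10, seed=0):
        self.rng = random.Random(seed)
        self.sonars = list(sonars)
        self.working = set(self.sonars)  # simulation model: which sonars work
        self.pop_size = pop_size
        self._reinitialize()

    def _reinitialize(self):
        # Fresh population; forget the best strategy found under the old model.
        self.population = [
            {s: self.rng.uniform(-1, 1) for s in self.sonars}
            for _ in range(self.pop_size)
        ]
        self.best, self.best_score = None, float("-inf")

    def score(self, strategy):
        # Toy fitness: reward weight on working sonars, penalize any
        # reliance on sonars the simulation marks as failed.
        reward = sum(w for s, w in strategy.items() if s in self.working)
        penalty = sum(abs(w) for s, w in strategy.items() if s not in self.working)
        return reward - penalty

    def learn_step(self):
        ranked = sorted(self.population, key=self.score, reverse=True)
        top_score = self.score(ranked[0])
        if top_score > self.best_score:
            # Best rules since the last change to the simulation model.
            self.best, self.best_score = dict(ranked[0]), top_score
        survivors = ranked[: self.pop_size // 2]
        mutants = [
            {s: w + self.rng.gauss(0, 0.1) for s, w in parent.items()}
            for parent in survivors
        ]
        self.population = survivors + mutants

    def report_failure(self, sonar):
        """Called by the monitor: adjust the simulation, then restart learning."""
        self.working.discard(sonar)
        self._reinitialize()
```

In this sketch `best` is `None` between a reported failure and the next `learn_step`. A caller would presumably keep the robot's current rules during that gap, which is an assumption, since the slides do not say what the robot does then.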

  14. Experiment Results • Robot in motion with all sensors intact: • a) during run and b) at goal. • Robot in motion after adapting to loss of three sensors: front, front right and right: • a) during run, and b) at goal. [2]

  15. Experiment Results [2] • a) Robot with full sensors passing directly through doorway. • b) Robot with front sonar covered. • c) Robot after adapting to covered sonar. It uses side sonar to find opening, and then turns into the opening.

  16. References • [1] A. C. Schultz and J. J. Grefenstette, “Using a genetic algorithm to learn reactive behavior for autonomous vehicles,” in Proceedings of the AIAA Guidance, Navigation, and Control Conference, Hilton Head, SC, 1992. • [2] A. C. Schultz and J. J. Grefenstette, “Continuous and embedded learning in autonomous vehicles: adapting to sensor failures,” in Proceedings of SPIE, vol. 4024, pp. 55-62, 2000.
