
Learning Drivers through Imitation using Supervised Methods

By Luigi Cardamone, Daniele Loiacono and Pier Luca Lanzi.



  1. Learning Drivers through Imitation using Supervised Methods • By Luigi Cardamone, Daniele Loiacono and Pier Luca Lanzi

  2. The outline • Introduction • Related work • TORCS • Imitation learning • What sensors? • What actions? • What learning method? • What data? • Experimental results • Discussion, conclusions and future work

  3. Introduction • What is imitation learning? • Supervised learning • Neuroevolution • Two main methods • Direct methods • Indirect methods

  4. Introduction • Direct methods are widely known to perform poorly. • Our method develops drivers that, in the best case, perform only 15% worse than the best bot in TORCS. • The trick is in “human-like” high-level action prediction

  5. The outline • Introduction • Related work • TORCS • Imitation learning • What sensors? • What actions? • What learning method? • What data? • Experimental results • Discussion, conclusions and future work

  6. Related work • Imitation learning in computer games • Rule-based NPC for Quake III via a two-step process • Quake II NPC via reinforcement learning, fuzzy clustering and Bayesian motion modeling • Neural networks with backpropagation for Legion II and Motocross The Force • Drivatar training for Forza Motorsport

  7. The outline • Introduction • Related work • TORCS • Imitation learning • What sensors? • What actions? • What learning method? • What data? • Experimental results • Discussion, conclusions and future work

  8. What input sensors? • The rangefinder sensor • The lookahead sensor

  9. What actions? • 4 low-level effectors in TORCS • Steering wheel • Gas pedal • Brake pedal • Gear change • 2 high-level actions in this work • Speed • Trajectory

  10. What learning methods? • K-nearest neighbor • Training • No training phase is needed • How was it applied? • Directly during the TORCS race • At each game tick, the logged data is searched for the k instances most similar to the current sensor readings • The targets of these k instances are averaged to produce the prediction
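
The lookup-and-average step on this slide can be sketched as follows. This is a minimal illustration, not the authors' code: the Euclidean distance, the function name `knn_predict`, and the toy log are all assumptions, since the slides only say "most similar".

```python
import math

def knn_predict(query, logged, k=20):
    """Average the targets of the k logged instances closest to `query`.

    `logged` is a non-empty list of (sensor_vector, (target_speed, target_position)).
    Euclidean distance is an assumption; the slides only say "most similar".
    """
    nearest = sorted(logged, key=lambda row: math.dist(query, row[0]))[:k]
    n = len(nearest)
    speed = sum(target[0] for _, target in nearest) / n
    position = sum(target[1] for _, target in nearest) / n
    return speed, position

# Tiny hypothetical log with two sensor readings per instance.
log = [
    ((1.0, 1.0), (100.0, 0.0)),
    ((1.0, 2.0), (80.0, 0.2)),
    ((9.0, 9.0), (20.0, -0.5)),
]
prediction = knn_predict((1.0, 1.5), log, k=2)
```

Because the whole log is searched at every tick, the cost of a prediction grows with the size of the logged data, which is one reason a small data set is attractive here.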

  11. What learning methods? • Neural networks • Training • NeuroEvolution of Augmenting Topologies (NEAT) to evolve both the weights and the topology of a neural network • How was it applied? • 2 networks, one for speed prediction and one for target position prediction • Rangefinder networks with 19 angle inputs + 1 bias input • Lookahead networks with 8 segment inputs + 1 bias input • The fitness was defined as the prediction error
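
A fitness based on prediction error might look like the sketch below. The slides do not say which error measure was used, so mean squared error is an assumption, and `toy_network` is a hypothetical stand-in for an evolved network rather than anything produced by NEAT.

```python
def prediction_error(network, examples):
    """Mean squared error of `network` over (inputs, target) examples."""
    total = 0.0
    for inputs, target in examples:
        total += (network(inputs) - target) ** 2
    return total / len(examples)

def toy_network(inputs):
    """Hypothetical candidate: one linear unit; the last weight multiplies the bias input."""
    weights = [0.5, -0.25, 1.0]
    return sum(w * x for w, x in zip(weights, list(inputs) + [1.0]))

# Two made-up logged examples: (sensor inputs, recorded target).
examples = [((2.0, 4.0), 1.0), ((0.0, 0.0), 0.0)]
error = prediction_error(toy_network, examples)
```

Lower error means a better imitation, so an evolutionary loop that maximizes fitness would need to negate or invert this value.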

  12. What data? • Inferno bot on 3 tracks for 3 laps each • A simple fast track • A difficult track with many fast turns • A difficult track with many slow sharp turns • Only the data from the second lap was recorded • 3 data sets with 1982, 3899 and 3619 examples • Additional all-in set with all 9500 examples

  13. The outline • Introduction • Related work • TORCS • Imitation learning • What sensors? • What actions? • What learning method? • What data? • Experimental results • Discussion, conclusions and future work

  14. Experimental results • Overall, we obtained 16 models • 2 learning algorithms • 3 + 1 datasets • 2 types of sensors • K-nearest neighbor was applied with k = 20 • NEAT was applied with a population of 100 individuals for 100 generations • All the experiments were conducted using TORCS 1.3.1
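
The count of 16 models follows from combining every algorithm, dataset and sensor type. The sketch below enumerates the combinations; the dataset labels are placeholders, and pairing each size with a particular track is an assumption based on the order in which the sizes are listed.

```python
from itertools import product

algorithms = ["k-nearest neighbor", "NEAT"]
sensors = ["rangefinder", "lookahead"]

# Placeholder labels; sizes are the per-track example counts from the data slide.
datasets = {"track 1": 1982, "track 2": 3899, "track 3": 3619}
datasets["all-in"] = sum(datasets.values())

# One model per (algorithm, dataset, sensor) combination.
models = list(product(algorithms, datasets, sensors))

for algorithm, dataset, sensor in models:
    print(f"{algorithm:18s} {dataset:8s} {sensor}")
```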

  15. Experimental results - Evaluation • Each model was evaluated by using it to drive a car on each track for 10,000 game ticks. • The tracks • 3 tracks used for training • 2 unseen tracks • A simple fast track • A track with many fast and difficult turns • The driver was also equipped with a standard recovery policy.

  16. Experimental results - Results • Inferno was better than its imitations • Lookahead sensors are better than rangefinders • K-nearest neighbor is better than NEAT • One of the models performed only 15% worse than the Inferno bot.

  17. The summary of the results

  18. Experimental results - Execution time • Direct methods result in low computational cost • Our approach needs 30 times less CPU time to obtain reasonable results

  19. Increasing the lookahead • How much lookahead is useful? • A second series of tests with 8 and 16 lookahead values showed overfitting

  20. The outline • Introduction • Related work • TORCS • Imitation learning • What sensors? • What actions? • What learning method? • What data? • Experimental results • Discussion, conclusions and future work

  21. Discussion • Good drivers • Close to the target bot • Run off the track in difficult turns as a result of prediction error or low reactivity in steering • Bad drivers • Many discontinuities in the predicted trajectories • These cause the car to move quickly from one side of the track to the other

  22. Perceptual aliasing • Two different places can be perceived the same • Usually happens on long straight parts of the road • Can be solved via special treatment of straight parts, full throttle or bigger lookahead
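
Perceptual aliasing can be made concrete with a small check over logged data: identical sensor readings paired with different targets. The sketch below is illustrative only; the readings, target speeds and the helper name `find_aliased` are hypothetical.

```python
from collections import defaultdict

def find_aliased(logged):
    """Return sensor readings that appear in the log with conflicting targets."""
    by_reading = defaultdict(set)
    for reading, target in logged:
        by_reading[reading].add(target)
    return {r: sorted(t) for r, t in by_reading.items() if len(t) > 1}

# Hypothetical log of (lookahead reading, target speed).
log = [
    ((0.0, 0.0, 0.0, 0.0), 300.0),  # middle of a long straight
    ((0.0, 0.0, 0.0, 0.0), 120.0),  # looks the same, but a turn lies beyond the lookahead
    ((0.0, 0.0, 0.1, 0.4), 90.0),   # turn visible within the lookahead
]
aliased = find_aliased(log)
```

A learner that sees only the reading cannot separate the first two cases, and averaging their targets yields a speed that suits neither. A longer lookahead would make the two readings differ.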

  23. Summary • Supervised learning to imitate a driver • High-level aspects of driving (speed and trajectory) rather than low-level effectors • Novel lookahead sensor • Good results with k-nearest neighbor • Inferno bot is still better due to perceptual aliasing and slow steering during abrupt turns

  24. Future work • Exploit structural symmetry on the track • Increase the robustness to noise • Reduce computational cost • Improve steering reaction to abrupt turns
