
Transfer Learning with Inter-Task Mappings

Matthew E. Taylor, joint work with Peter Stone. Department of Computer Sciences, The University of Texas at Austin.


Presentation Transcript


  1. Transfer Learning with Inter-Task Mappings • Matthew E. Taylor, joint work with Peter Stone • Department of Computer Sciences, The University of Texas at Austin

  2. Transfer Motivation • Learning tabula rasa can be unnecessarily slow • Humans can use information from previous tasks • Soccer with different numbers of players • Agents: leverage learned knowledge in novel/modified tasks • Learn faster • Larger and more complex problems become tractable • Different numbers of state variables and actions in tasks

  3. Common TL Metrics • Also: total reward accumulated
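The two metrics named in this transcript, total reward accumulated (this slide) and jumpstart (slide 11), can be computed from per-episode learning curves. The following is a minimal sketch, assuming a learning curve is a list of per-episode rewards; the function names and the choice to average the first `n` episodes for jumpstart are assumptions for illustration, not definitions taken from the slides.

```python
from typing import Sequence


def total_reward(curve: Sequence[float]) -> float:
    """Total reward accumulated over every episode of a learning curve."""
    return float(sum(curve))


def jumpstart(transfer: Sequence[float], baseline: Sequence[float], n: int = 1) -> float:
    """Initial advantage of the transfer learner: difference in mean reward
    over the first n episodes (assumed definition)."""
    return sum(transfer[:n]) / n - sum(baseline[:n]) / n
```

A positive `jumpstart` means the transfer learner starts the target task ahead of a learner without transfer; comparing `total_reward` for the two curves summarizes the advantage over the whole training run.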

  4. Transfer Goals • Autonomous transfer • AI Goal • Explore the world, learning • Transfer autonomously • Utilize past knowledge • Learn difficult tasks faster • Engineering Goal • Learn a set of simple tasks • Eventually learn target task • Total time reduction

  5. Transfer via Inter-Task Mappings • Source task: π(S) → A • Target task: π’(S’) → A’ • π is not defined for S’ and A’ • ρ is a transfer functional • Task-dependent: relies on inter-task mappings

  6. Inter-Task Mappings • χ_A: a_target → a_source. Given a target task action, return a similar source task action • χ_X: s_target → s_source. Similar, but for state variables: for all x in each target task state s = ⟨x_1, x_2, …, x_n⟩ • ρ is automatically formed from χ_A and χ_X to enable transfer of: • π(s) • Q(s, a) • Rules • Model • etc.
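As an illustration of how the two mappings can drive transfer, here is a minimal sketch, assuming a simplified linear action-value function with one weight per (state variable, action) pair. The variable and action names, the mapping tables, and the linear form are all hypothetical stand-ins chosen for the example; they are not the representation or the mappings used in the presentation.

```python
from typing import Dict, Tuple

Weights = Dict[Tuple[str, str], float]

# Hypothetical inter-task mappings, both pointing from target to source.
CHI_X = {"x": "x", "y": "x", "x_vel": "x_vel", "y_vel": "x_vel"}
CHI_A = {"west": "left", "east": "right", "south": "left",
         "north": "right", "neutral": "neutral"}


def transfer_weights(w_source: Weights, chi_x: Dict[str, str],
                     chi_a: Dict[str, str]) -> Weights:
    """Initialise every target weight from the source weight it maps to."""
    return {
        (x_t, a_t): w_source[(chi_x[x_t], chi_a[a_t])]
        for x_t in chi_x
        for a_t in chi_a
    }


def q_value(weights: Weights, state: Dict[str, float], action: str) -> float:
    """Linear Q(s, a): sum over state variables of weight times value."""
    return sum(weights[(x, action)] * v for x, v in state.items())
```

Because both mappings may be many-to-one, several target weights can start from the same source weight; the target learner then refines them with its own experience.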

  7. Transfer Functional: ρ_CMAC • New states and actions in target task → new tiles • [Figure: Source and Target] • Counterintuitive: • Q-values are very low-level • Very task-specific

  8. Sample Results • Can significantly reduce target task time and total time • Able to learn inter-task mappings with little data • [Figure: Keepaway Transfer: 3 vs. 2 to 4 vs. 3; labels: Source Task Time, Target Task Time, Source Task Episodes]

  9. Empirical Domains • Robot Soccer Keepaway • Server Job Scheduling • Mountain Car • Killer Application? • Epilepsy? • Robotics?

  10. Open Questions: 1/3 • Optimize for Total Time? • [Figure labels: Source Task Time, Target Task Time, Source Task Episodes]

  11. Open Questions: 2/3 • Guarantee transfer efficacy? • Avoid Negative Transfer (“Giveaway”)? • Similarity measure? • Jumpstart in Target • MDP similarity [Ferns, others] • Analysis of learned source task knowledge

  12. Open Questions: 3/3 • Learn an inter-task mapping efficiently? • Sample complexity • Computational complexity • Select Source Task? • In library (sunk cost) • To learn first (total time metric)

  13. MASTER Overview: Modeling Approximate State Transitions by Exploiting Regression • Record observed (s_source, a_source, s’_source) tuples in the source task • Record a small number of (s_target, a_target, s’_target) tuples in the target task • Learn a one-step transition model, T(S, A), for the target task: M(s_target, a_target) → s’_target • For every possible action mapping χ_A and every possible state variable mapping χ_X: transform the recorded source task tuples, then calculate the error of the transformed source task tuples on the target task model: ∑ (M(s_transformed, a_transformed) − s’_transformed)² • Return the χ_A, χ_X with the lowest error
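The MASTER steps on this slide can be sketched as an exhaustive search. This is a minimal illustration under stated assumptions, not the authors' implementation: the one-step model is a per-action linear least-squares fit (the slide does not say which regressor is used), actions are integer indices, every target action appears in the recorded target data, and the function names `fit_model`, `predict`, and `find_mappings` are hypothetical. The error is averaged over the tuples actually used rather than summed, so that mappings using fewer source tuples are not favoured; that normalisation is also an assumption.

```python
import itertools

import numpy as np


def fit_model(S, A, S2, n_actions):
    """Fit one linear least-squares model per action: [s, 1] -> s'."""
    models = {}
    for a in range(n_actions):
        mask = A == a
        X = np.hstack([S[mask], np.ones((int(mask.sum()), 1))])
        models[a], *_ = np.linalg.lstsq(X, S2[mask], rcond=None)
    return models


def predict(models, S, a):
    """Predict next states for a batch of states under action a."""
    return np.hstack([S, np.ones((len(S), 1))]) @ models[a]


def find_mappings(src, tgt, n_src_actions, n_tgt_actions):
    """Return (error, chi_x, chi_a) with the lowest model error.

    chi_x[i] is the source variable assigned to target variable i;
    chi_a[j] is the source action assigned to target action j.
    """
    S_s, A_s, S2_s = src
    S_t, A_t, S2_t = tgt
    d_src, d_tgt = S_s.shape[1], S_t.shape[1]
    models = fit_model(S_t, A_t, S2_t, n_tgt_actions)
    best = (np.inf, None, None)
    for chi_x in itertools.product(range(d_src), repeat=d_tgt):
        # Transform source states into the target state-variable layout.
        S_tr, S2_tr = S_s[:, list(chi_x)], S2_s[:, list(chi_x)]
        for chi_a in itertools.product(range(n_src_actions), repeat=n_tgt_actions):
            sq_err, count = 0.0, 0
            for a_t, a_s in enumerate(chi_a):
                mask = A_s == a_s
                if mask.any():
                    diff = predict(models, S_tr[mask], a_t) - S2_tr[mask]
                    sq_err += float((diff ** 2).sum())
                    count += int(mask.sum())
            err = sq_err / count if count else np.inf
            if err < best[0]:
                best = (err, chi_x, chi_a)
    return best
```

The search enumerates every combination of mappings, so its cost grows exponentially with the number of target state variables and actions; the sketch is only practical for small tasks.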

  14. Utilizing Mappings in 3D Mountain Car
