
Transfer Learning


Presentation Transcript


  1. Transfer Learning Lisa Torrey University of Wisconsin – Madison CS 540

  2. Transfer Learning in Humans
  • Education: a hierarchical curriculum; learning tasks share common stimulus-response elements
  • Abstract problem-solving: learning tasks share general underlying principles
  • Multilingualism: knowing one language affects learning in another
  • Transfer can be both positive and negative

  3. Transfer Learning in AI: given a source task S, learn a target task T.

  4. Goals of Transfer Learning: on the performance-vs-training curve, transfer aims for a higher start, a higher slope, and a higher asymptote than learning from scratch.

  5. Inductive Learning: search the space of allowed hypotheses, a subset of all hypotheses.

  6. Transfer in Inductive Learning: transfer shapes the search through the allowed-hypothesis space. Thrun and Mitchell 1995: transfer slopes for gradient descent.

  7. Transfer in Inductive Learning: Bayesian methods. In Bayesian learning, prior distribution + data = posterior distribution; Bayesian transfer uses the source task to supply an informative prior for the target task. Raina et al. 2006: transfer a Gaussian prior.
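The sketch below illustrates that idea for Bayesian linear regression with a conjugate Gaussian prior: the posterior learned on a source task becomes the prior for a data-poor target task. It is only a minimal illustration of transferring a Gaussian prior, not Raina et al.'s construction; the toy data and function names are hypothetical.

```python
import numpy as np

def gaussian_posterior(X, y, prior_mean, prior_cov, noise_var=1.0):
    """Conjugate update: Gaussian prior + Gaussian likelihood -> Gaussian posterior."""
    prior_prec = np.linalg.inv(prior_cov)
    post_cov = np.linalg.inv(prior_prec + X.T @ X / noise_var)
    post_mean = post_cov @ (prior_prec @ prior_mean + X.T @ y / noise_var)
    return post_mean, post_cov

rng = np.random.default_rng(0)
w_true = np.array([1.0, -2.0, 0.5])

# Source task: plenty of data, broad uninformative prior.
X_src = rng.normal(size=(200, 3))
y_src = X_src @ w_true + rng.normal(scale=0.1, size=200)
mean_src, cov_src = gaussian_posterior(X_src, y_src, np.zeros(3), 10.0 * np.eye(3))

# Target task: only a few examples, but the source posterior serves as its prior.
X_tgt = rng.normal(size=(5, 3))
y_tgt = X_tgt @ w_true + rng.normal(scale=0.1, size=5)
mean_tgt, _ = gaussian_posterior(X_tgt, y_tgt, mean_src, cov_src)
print(mean_tgt)  # close to w_true despite very little target data
```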

  8. Transfer in Inductive Learning: hierarchical methods compose simple learned concepts (Line, Curve, Circle, Surface) into more complex ones (Pipe). Stracuzzi 2006: learn Boolean concepts that can depend on each other.

  9. Transfer in Inductive Learning: dealing with missing data or labels in the target task T by drawing on the source task S. Shi et al. 2008: transfer via active learning.

  10. Reinforcement Learning: an agent interacts with an environment. It begins with Q(s1, a) = 0, selects actions with its policy (π(s1) = a1, π(s2) = a2), observes transitions and rewards (δ(s1, a1) = s2, r(s1, a1) = r2; δ(s2, a2) = s3, r(s2, a2) = r3), and updates its value estimates: Q(s1, a1) ← Q(s1, a1) + Δ.
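Slide 10's loop is the standard tabular Q-learning cycle. Below is a minimal sketch of it; the `env` object (with `reset`, `step`, and an `actions` list) is a hypothetical stand-in for the environment that supplies δ and r.

```python
import random
from collections import defaultdict

def q_learning(env, episodes=1000, alpha=0.1, gamma=0.99, epsilon=0.1):
    """Tabular Q-learning: Q(s, a) starts at 0 and is nudged by Δ after every step."""
    Q = defaultdict(float)                               # Q(s1, a) = 0 initially
    for _ in range(episodes):
        s, done = env.reset(), False
        while not done:
            # ε-greedy policy π(s)
            if random.random() < epsilon:
                a = random.choice(env.actions)
            else:
                a = max(env.actions, key=lambda act: Q[(s, act)])
            s_next, r, done = env.step(a)                # observes δ(s, a) and r(s, a)
            target = r + gamma * max(Q[(s_next, act)] for act in env.actions)
            Q[(s, a)] += alpha * (target - Q[(s, a)])    # Q(s, a) ← Q(s, a) + Δ
            s = s_next
    return Q
```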

  11. Transfer in Reinforcement Learning: starting-point methods, hierarchical methods, alteration methods, new RL algorithms, and imitation methods.

  12. Transfer in Reinforcement Learning: starting-point methods initialize the target task's Q-table from the source task rather than from scratch, giving target-task training a head start over no transfer. Taylor et al. 2005: value-function transfer.
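A minimal sketch of a tabular starting-point method under the assumption of a hand-written inter-task mapping (this is not Taylor et al.'s exact algorithm; `map_state` and `map_action` are hypothetical mapping functions):

```python
from collections import defaultdict

def transfer_q_table(source_Q, target_states, target_actions, map_state, map_action):
    """Build the target task's initial Q-table from source Q-values instead of zeros."""
    target_Q = defaultdict(float)
    for s in target_states:
        for a in target_actions:
            src_key = (map_state(s), map_action(a))      # inter-task mapping
            target_Q[(s, a)] = source_Q.get(src_key, 0.0)
    return target_Q

# Ordinary Q-learning in the target task then continues from target_Q
# rather than from an all-zero table.
```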

  13. Transfer in Reinforcement Learning: hierarchical methods treat source tasks as components of the target task (e.g., Run and Kick feed into Pass and Shoot, which compose Soccer). Mehta et al. 2008: transfer a learned hierarchy.

  14. Transfer in Reinforcement Learning: alteration methods modify the source task S itself, changing its original states, actions, or rewards into new ones. Walsh et al. 2006: transfer aggregate states.

  15. Transfer in Reinforcement Learning: new RL algorithms build transfer directly into the agent-environment learning loop of slide 10. Torrey et al. 2006: transfer advice about skills.

  16. Transfer in Reinforcement Learning: imitation methods use the source policy during part of target-task training as a demonstration. Torrey et al. 2007: demonstrate a strategy.

  17. My Research: within these categories, Skill Transfer (a new RL algorithm that uses advice) and Macro Transfer (an imitation method).

  18. RoboCup Domain: 3-on-2 KeepAway, 3-on-2 BreakAway, 2-on-1 BreakAway, 3-on-2 MoveDownfield.

  19. Inductive Logic Programming: search for rules from general to specific, starting with an empty body and adding conditions, e.g.
  IF [ ] THEN pass(Teammate)
  IF distance(Teammate) ≤ 5 THEN pass(Teammate)
  IF distance(Teammate) ≤ 10 THEN pass(Teammate)
  …
  IF distance(Teammate) ≤ 5 AND angle(Teammate, Opponent) ≥ 15 THEN pass(Teammate)
  IF distance(Teammate) ≤ 5 AND angle(Teammate, Opponent) ≥ 30 THEN pass(Teammate)
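The refinement process behind slide 19 can be sketched as a greedy general-to-specific search that keeps adding the condition which best separates positive from negative examples. This is only a toy illustration under assumed data structures, not the actual ILP system used; the literal set and scoring are hypothetical.

```python
def refine_rule(pos, neg, candidate_literals, max_literals=3):
    """Greedy general-to-specific search: start from IF [ ] THEN ... and add literals."""
    def covers(rule, example):
        return all(lit(example) for lit in rule)
    def score(rule):
        tp = sum(covers(rule, e) for e in pos)   # positives covered
        fp = sum(covers(rule, e) for e in neg)   # negatives covered
        return tp - fp
    rule = []                                    # empty body covers every example
    while len(rule) < max_literals:
        best = max(candidate_literals, key=lambda lit: score(rule + [lit]))
        if score(rule + [best]) <= score(rule):
            break                                # no condition improves the rule
        rule.append(best)
    return rule

# Candidate conditions mirroring the slide, over a state described as a dict:
literals = [
    lambda e: e["distance"] <= 5,                # distance(Teammate) ≤ 5
    lambda e: e["distance"] <= 10,               # distance(Teammate) ≤ 10
    lambda e: e["angle"] >= 30,                  # angle(Teammate, Opponent) ≥ 30
]
```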

  20. Advice Taking: batch reinforcement learning via support vector regression (RL-SVR). The agent alternates between interacting with the environment (Batch 1, Batch 2, …) and computing Q-functions that minimize: ModelSize + C × DataMisfit.

  21. Advice Taking: batch reinforcement learning with advice (KBKR). The same batch setup, with advice added to the objective; find Q-functions that minimize: ModelSize + C × DataMisfit + µ × AdviceMisfit.
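The sketch below shows the flavor of that trade-off for a linear Q-model, with advice expressed as a soft lower bound on the Q-value in states where the advice applies. It is only an illustration of the ModelSize + C × DataMisfit + µ × AdviceMisfit idea; the real KBKR formulation encodes rule-based advice as linear constraints with slack inside a support-vector program.

```python
import numpy as np

def kbkr_style_objective(w, X, y, advice_X, advice_floor, C=1.0, mu=1.0):
    """ModelSize + C*DataMisfit + mu*AdviceMisfit for a linear Q-model Q(s) = w·s.

    advice_X:     feature vectors of states where the transferred advice applies
    advice_floor: the minimum Q-value the advice asks for in those states
    (Both are illustrative stand-ins for KBKR's rule-based advice regions.)
    """
    model_size = np.sum(np.abs(w))                                  # ||w||_1
    data_misfit = np.sum(np.abs(X @ w - y))                         # fit the batch data
    advice_misfit = np.sum(np.maximum(0.0, advice_floor - advice_X @ w))
    return model_size + C * data_misfit + mu * advice_misfit
```

Because the advice term carries its own penalty weight µ, the learner can follow the transferred advice when it agrees with the data and override it when it does not, which is what keeps imperfect advice from hurting.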

  22. Skill Transfer Algorithm: in the source task, use ILP to learn skill rules such as IF distance(Teammate) ≤ 5 AND angle(Teammate, Opponent) ≥ 30 THEN pass(Teammate); map them into the target task's vocabulary; then give them, together with human advice, to the advice-taking learner in the target task.
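A toy sketch of the mapping step, assuming the inter-task correspondence is given as a simple renaming of terms (the object names and mapping entries below are hypothetical, not the thesis mapping):

```python
def map_rule(rule: str, term_map: dict) -> str:
    """Rewrite a source-task rule's terms using a hand-specified inter-task mapping."""
    for source_term, target_term in term_map.items():
        rule = rule.replace(source_term, target_term)
    return rule

# e.g., renaming KeepAway objects into BreakAway vocabulary (entries illustrative only)
source_rule = "IF distance(Keeper) <= 5 AND angle(Keeper, Taker) >= 30 THEN pass(Keeper)"
advice = map_rule(source_rule, {"Keeper": "Attacker", "Taker": "Defender"})
# -> "IF distance(Attacker) <= 5 AND angle(Attacker, Defender) >= 30 THEN pass(Attacker)"
```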

  23. Selected Results: skill transfer to 3-on-2 BreakAway from several source tasks.

  24. Macro-Operators: a macro is a finite-state machine over actions such as pass(Teammate), move(Direction) (e.g., move(ahead), move(left)), shoot(goalRight), and shoot(goalLeft), with learned IF [ … ] THEN action rules attached to its nodes and arcs.
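A small hypothetical data structure for such a macro, showing how per-node and per-arc rules would drive execution (names and fields are assumptions, not the thesis implementation):

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional

Rule = Callable[[dict], bool]       # a learned IF [ ... ] condition over state features

@dataclass
class MacroNode:
    action: str                     # e.g. "pass(Teammate)" or "shoot(goalRight)"
    loop_rule: Rule                 # stay in this node while the rule fires
    arcs: Dict[str, Rule] = field(default_factory=dict)   # next node -> arc rule

@dataclass
class Macro:
    entry_rule: Rule                # IF [ ... ] THEN enter(State)
    nodes: Dict[str, MacroNode] = field(default_factory=dict)

    def step(self, current: str, state: dict) -> Optional[str]:
        """Follow an arc whose rule fires, loop in place, or exit the macro (None)."""
        node = self.nodes[current]
        for next_node, arc_rule in node.arcs.items():
            if arc_rule(state):
                return next_node
        return current if node.loop_rule(state) else None
```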

  25. Demonstration: an imitation method in which the source policy is used during part of target-task training.

  26. Macro Transfer Algorithm: learn macros from the source task with ILP, then use them as a demonstration in the target task.

  27. Macro Transfer Algorithm: learning structures. Positive examples: BreakAway games that score; negative examples: BreakAway games that didn't score. ILP learns rules such as:
  IF actionTaken(Game, StateA, pass(Teammate), StateB)
  AND actionTaken(Game, StateB, move(Direction), StateC)
  AND actionTaken(Game, StateC, shoot(goalRight), StateD)
  AND actionTaken(Game, StateD, shoot(goalLeft), StateE)
  THEN isaGoodGame(Game)

  28. Macro Transfer Algorithm: learning rules for arcs. Positive examples: states in good games that took the arc; negative examples: states in good games that could have taken the arc but didn't. For each arc (e.g., pass(Teammate) → shoot(goalRight)), ILP learns rules of the form IF [ … ] THEN enter(State) and IF [ … ] THEN loop(State, Teammate).
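The example construction on slide 28 amounts to a small filtering pass over the good games. The sketch below assumes each game is recorded as a list of (state, action) pairs; it approximates "could have taken the arc" as "was at the arc's source node", which is a simplification of the thesis procedure.

```python
def arc_examples(good_games, from_action, to_action):
    """Collect positive/negative training states for the arc from_action -> to_action."""
    pos, neg = [], []
    for game in good_games:                        # each game: list of (state, action)
        for (state, action), (_, next_action) in zip(game, game[1:]):
            if action != from_action:
                continue                           # the arc was not available here
            if next_action == to_action:
                pos.append(state)                  # took the arc
            else:
                neg.append(state)                  # could have taken it, but didn't
    return pos, neg
```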

  29. Selected Results: macro transfer to 3-on-2 BreakAway from 2-on-1 BreakAway.

  30. Summary
  • Machine learning is typically designed for standalone tasks
  • Transfer is a natural learning ability that we would like to incorporate into machine learners
  • There are some successes, but challenges remain, such as avoiding negative transfer and automating the inter-task mapping
