Explore the concept of an interaction plateau in tutoring systems through a dialogue between a human tutor and student, comparing different types of tutoring models and the effectiveness of each. The dialogue delves into understanding a problem step-by-step and addressing misconceptions. Major differences between low-interaction tutoring and natural tutoring are highlighted, along with conditions affecting effectiveness. Experiments with different student levels and instructional methods are analyzed to determine the most impactful tutoring approach.
 
                
The interaction plateau CPI 494, April 9, 2009 Kurt VanLehn
Schematic of a natural language tutoring system, AutoTutor • Step start • T: Elicit • S: Correct → Step end • S: Incorrect → Remediation: T: Hint or prompt • T: Tell (only if out of hints) → Step end
Schematic of other natural language tutors, e.g., Atlas, Circsim-Tutor, Kermit-SE • Step start • T: Elicit • S: Correct → Step end • S: Incorrect → Remediation: T: What is…? S: I don’t know. T: Well, what is… S: … T: … • T: Tell (only if out of hints) → Step end • The remediation dialogue is often called a KCD: knowledge construction dialogue
Hypothesized ranking of tutoring, most effective first • Expert human tutors • Ordinary human tutors • Natural language tutoring systems • Step-based tutoring systems • Answer-based tutoring systems • No tutoring
Hypothesized effect sizes (baseline: classroom) • Bloom’s (1984) 2-sigma: 4 weeks of human tutoring vs. classroom
Hypothesized effect sizes (baseline: classroom) • Kulik (1984) meta-analysis of CAI vs. classroom: 0.4 sigma
Hypothesized effect sizes (baseline: classroom) • Many intelligent tutoring systems: e.g., Andes (VanLehn et al., 2005), Carnegie Learning’s tutors…
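The "sigma" figures on these slides are standardized effect sizes: the gap between the treatment mean and the classroom mean, measured in standard deviations. A minimal sketch of that arithmetic, assuming the control group's standard deviation is used as the unit (one common convention, not confirmed by the slides) and using invented scores:

```python
from statistics import mean, stdev


def effect_size_sigma(treatment, control):
    """Standardized mean difference, in units of the control group's SD.

    Using the control SD as the unit is an assumption made for this sketch.
    """
    return (mean(treatment) - mean(control)) / stdev(control)


# Hypothetical post-test scores, invented purely for illustration.
classroom = [40, 50, 60, 70, 80]
tutored = [55, 65, 75, 85, 95]

print(round(effect_size_sigma(tutored, classroom), 2))  # prints 0.95
```

On this scale, a 2-sigma result would mean the average tutored student scored two classroom standard deviations above the classroom mean.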
A problem and its steps • Suppose you are running in a straight line at constant speed. You throw a pumpkin straight up. Where will it land? • Initially, you and the pumpkin have the same horizontal velocity. • Your throw exerts a net force vertically on the pumpkin. • Thus causing a vertical acceleration. • Which leaves the horizontal velocity unaffected. • So when the pumpkin falls, it has traveled the same distance horizontally as you have. • Thus, it lands in your hands
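The chain of steps above can be checked numerically. The sketch below assumes no air resistance and that the pumpkin is caught at the height it was thrown from; the speeds and the function name `flight` are hypothetical, chosen only for illustration.

```python
G = 9.81  # gravitational acceleration in m/s^2


def flight(runner_speed, throw_speed, g=G):
    """Return (time in air, pumpkin's horizontal travel, runner's horizontal travel).

    Assumes no air resistance and a catch at the release height.
    """
    # The vertical throw determines only how long the pumpkin stays in the air.
    time_in_air = 2 * throw_speed / g
    # The throw exerts no horizontal force, so the pumpkin keeps the runner's
    # horizontal velocity for the whole flight.
    pumpkin_x = runner_speed * time_in_air
    runner_x = runner_speed * time_in_air
    return time_in_air, pumpkin_x, runner_x


t, pumpkin_x, runner_x = flight(runner_speed=3.0, throw_speed=4.0)
print(pumpkin_x == runner_x)  # prints True
```

Because both horizontal distances come from the same velocity and the same time in the air, they are equal for any choice of speeds, which is the point of the final step.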
A dialogue between a human tutor (T) and human student (S) • T: Suppose you are running in a straight line at constant speed. You throw a pumpkin straight up. Where will it land? S: Behind me. • T: Hmm. Let’s think about that. Before you toss the pumpkin and are just carrying it, do you and the pumpkin have the same speed? S: Yes. • T: Good. When you toss it up, is the net force on it exactly vertical? S: I’m not sure. T: You exert a force on the pumpkin, right? Etc.
Schematic of dialogue about a single step • Step start • T: Elicit • S: Correct → Step end • S: Incorrect → Remediation: T: Hint, or prompt, or explain, or analogy, or … • T: Tell → Step end
Comparisons of expert to novice human tutors • Step start • T: Tell (path labelled “Novices”) → Step end • T: Elicit (path labelled “Experts”) • S: Correct → Step end • S: Incorrect → T: Hint, or prompt, or explain, or analogy, or … • Experts may have a wider variety
Schematic of an ITS handling of a single step • Step start • S: Correct → Step end • S: Incorrect → T: Hint • T: Tell (only if out of hints) → Step end
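The single-step loop in this schematic can be sketched in code. This is an illustration of the control flow only, not how any of the tutoring systems named in these slides is implemented; `run_step` and all of its parameters are hypothetical names.

```python
from typing import Callable, Sequence


def run_step(
    is_correct: Callable[[str], bool],
    hints: Sequence[str],
    bottom_out: str,
    get_attempt: Callable[[], str],
    say: Callable[[str], None],
) -> bool:
    """Tutor one step. Return True if the student got it, False if it was told."""
    remaining = list(hints)
    while True:
        if is_correct(get_attempt()):
            return True          # S: Correct -> step end
        if not remaining:
            say(bottom_out)      # T: Tell, only once the hints have run out
            return False
        say(remaining.pop(0))    # S: Incorrect -> T: Hint, then try again


# Scripted student for illustration: wrong once, then right.
attempts = iter(["behind me", "in my hands"])
transcript = []
solved = run_step(
    is_correct=lambda answer: answer == "in my hands",
    hints=["Does the throw change the pumpkin's horizontal velocity?"],
    bottom_out="It lands in your hands.",
    get_attempt=lambda: next(attempts),
    say=transcript.append,
)
print(solved, len(transcript))  # prints True 1
```

Swapping what happens on an incorrect attempt (a hint, a canned text, a dialogue) changes only the body of the loop, which is the part the later slides compare across conditions.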
Major differences • Low-interaction tutoring (e.g., CAI): remediation on answer only • Step-based interaction (e.g., ITS): remediation on each step; hint sequence, with final “bottom out” hint • Natural tutoring (e.g., human tutoring): remediation on each step, substep, inference…; natural language dialogues; many tutorial tactics
Conditions (VanLehn, Graesser et al., 2007) • Natural tutoring: expert human tutors (typed, spoken); natural language dialogue computer tutors (Why2-AutoTutor (Graesser et al.), Why2-Atlas (VanLehn et al.)) • Step-based interaction: canned text remediation • Low interaction: textbook
Human tutors (a form of natural tutoring) • Step start • T: Elicit • S: Correct → Step end • S: Incorrect → T: Hint, or prompt, or explain, or analogy, or … • T: Tell → Step end
Why2-Atlas (a form of natural tutoring) • Step start • T: Elicit • S: Correct → Step end • S: Incorrect → A knowledge construction dialogue • T: Tell → Step end
Why2-AutoTutor (a form of natural tutoring) • Step start • T: Elicit • S: Correct → Step end • S: Incorrect → Hint or prompt • T: Tell → Step end
Canned-text remediation (a form of step-based interaction) • Step start • T: Elicit • S: Correct → Step end • S: Incorrect → Text • T: Tell → Step end
Experiment 1: Intermediate students & instruction No reliable differences
Experiment 2: AutoTutor > Textbook = Nothing Reliably different
Experiments 1 & 2 (VanLehn, Graesser et al., 2007) No significant differences
Experiment 3: Intermediate students & instruction Deeper assessments
Experiment 3: Intermediate students & instruction No reliable differences
Relearning Experiment 4: Novice students & intermediate instruction
Experiment 4: Novice students & intermediate instruction All differences reliable
Relearning Experiment 5: Novice students & intermediate (but shorter) instruction
Experiment 5: Novice students & intermediate instruction No reliable differences
Experiment 5: Low-pretest students only Aptitude-treatment interaction?
Experiment 5, Low-pretest students only Spoken human tutoring > canned text remediation
Experiments 6 and 7 Novice students & novice instruction Was the intermediate text over the novice students’ heads?
Experiments 6 and 7 Novice students & novice instruction No reliable differences
Interpretation • Legend, first region: can follow reasoning only with tutor’s help (ZPD); predicts Tutoring > Canned text remediation • Legend, second region: can follow reasoning without any help; predicts Tutoring = Canned text remediation • Figure labels: Content complexity; Experiments 1 & 4; Experiments 3 & 5; Experiments 6 & 7; Intermediates (high-pretest, low-pretest); Novices (high-pretest, low-pretest)
Original research questions • Can natural language tutorial dialog add pedagogical value? • Yes, when students must study content that is too complex to be understood by reading alone • How feasible is a deep linguistic tutoring system? • We built it. It’s fast enough to use. • Can deep linguistic and dialog techniques add pedagogical value?
When content is too complex to learn by reading alone: Deep > Shallow? Why2-Atlas is not clearly better than Why2-AutoTutor
When to use deep vs. shallow? • Use both • Use deep • Use locally smart FSA • Use equivalent texts
Results from all 7 experiments (VanLehn, Graesser et al., 2007) • Why2: Atlas = AutoTutor • Why2 > Textbook (no essays; content differences) • Human tutoring = Why2 = Canned text remediation, except when novice students worked with instruction designed for intermediates; then Human tutoring > Canned text remediation
Other evidence for the interaction plateau (Evens & Michael, 2006) No significant differences
Other evidence for the interaction plateau (Reif & Scott, 1999) No significant differences
Other evidence for the interaction plateau (Chi, Roy & Hausmann, in press) No significant differences
Still more studies where natural tutoring = step-based interaction • Human tutors: • Human tutoring = human tutoring with only content-free prompting for step remediation (Chi et al., 2001) • Human tutoring = canned text during post-practice remediation (Katz et al., 2003) • Socratic human tutoring = didactic human tutoring (Rosé et al., 2001a) • Socratic human tutoring = didactic human tutoring (Johnson & Johnson, 1992) • Expert human tutoring = novice human tutoring (Chae, Kim & Glass, 2005) • Natural language tutoring systems: • Andes-Atlas = Andes with canned text (Rosé et al., 2001b) • Kermit = Kermit with dialogue explanations (Weerasinghe & Mitrovic, 2006)
Hypothesis 1: Exactly how tutors remedy a step doesn’t matter much • Step start • T: Elicit • S: Correct → Step end • S: Incorrect → Remediation (what’s in here doesn’t matter much) • T: Tell → Step end
Main claim: There is an interaction plateau Hypothesis 1
Hypothesis 2: Cannot eliminate the step remediation loop • Step start • T: Elicit • S: Correct → Step end • S: Incorrect • T: Tell → Step end • Diagram annotation: “Must avoid this”
Main claim: There is an interaction plateau Hypothesis 2
Conclusions • What does it take to make computer tutors as effective as human tutors? Step-based interaction • Bloom’s 2-sigma results may have been due to weak control conditions (classroom instruction) • Other evaluations have also used weak controls • When is natural language useful? For steps themselves (vs. menus, algebra…); NOT for feedback & hints (remediation) on steps
Future directions for tutoring systems research • Making step-based instruction ubiquitous • Authoring & customizing • Novel task domains • Increasing engagement
Final thought • Many people “just know” that more interaction produces more learning. • “It ain’t so much the things we don’t know that get us into trouble. It’s the things we know that just ain’t so.” • Josh Billings (a.k.a. Henry Wheeler Shaw)