## Cognitive Diagnosis as Evidentiary Argument


**Cognitive Diagnosis as Evidentiary Argument**

Robert J. Mislevy
Department of Measurement, Statistics, & Evaluation
University of Maryland, College Park, MD
October 21, 2004

Presented at the Fourth Spearman Conference, Philadelphia, PA, Oct. 21-23, 2004. Thanks to Russell Almond, Charles Davis, Chun-Wei Huang, Sandip Sinharay, Linda Steinberg, Kikumi Tatsuoka, David Williamson, and Duanli Yan.

**Introduction**

• An assessment is a particular kind of evidentiary argument.
• Parsing a particular assessment in terms of the elements of an argument provides insights into more visible features such as tasks and statistical models.
• Will look at cognitive diagnosis from this perspective.

**Toulmin's (1958) structure for arguments**

Reasoning flows from data (D) to claim (C) by justification of a warrant (W), which in turn is supported by backing (B). The inference may need to be qualified by alternative explanations (A), which may have rebuttal evidence (R) to support them.

**Specialization to assessment**

• The role of psychological theory:
  • Nature of claims & data
  • Warrant connecting claims and data: “If student were x, would probably do y”
• The role of probability-based inference: “Student does y; what is support for x’s?”
• Will look first at assessment under behavioral perspective, then see how cognitive diagnosis extends the ideas.

**Behaviorist Perspective**

The evaluation of the success of instruction and of the student’s learning becomes a matter of placing the student in a sample of situations in which the different learned behaviors may appropriately occur and noting the frequency and accuracy with which they do occur.

D.R. Krathwohl & D.A. Payne, 1971, pp. 17-18.

[Diagram: Toulmin structure of the behaviorist assessment argument]

• C (claim): Sue's probability of correctly answering a 2-digit subtraction problem with borrowing is p.
• W (warrant, "since"): Sampling theory machinery for reasoning from true proportion for correct responses in targeted situations to observed counts.
• A (alternative explanations, "unless"): [e.g., observational errors, data errors, misclassification of responses or performance situations, distractions, etc.]
• D1j (data, "so"): Sue's answer to Item j.
• D2j (data, "and"): structure and contents of Item j.

Successive slides highlight each element of the diagram in turn:

• The claim addresses the expected value of performance of the targeted kind in the targeted situations.
• The student data address the salient features of the responses.
• The task data address the salient features of the stimulus situations (i.e., tasks).
• The warrant encompasses definitions of the class of stimulus situations, response classifications, and sampling theory.

**Statistical Modeling of Assessment Data**

• Claims in terms of values of unobservable variables in student model (SM), which characterize student knowledge.
• Data modeled as depending probabilistically on SM vars.
• Estimate conditional distributions of data given SM vars.
• Bayes theorem to infer SM variables given data.

**Specialization to cognitive diagnosis**

• Information-processing perspective foregrounded in cognitive diagnosis
• Student model contains variables in terms of, e.g.,
  • Production rules at some grain-size
  • Components / organization of knowledge
  • Possibly strategy availability / usage
• Importance of purpose

**Responses consistent with the "subtract smaller from larger" bug**

“Buggy arithmetic”: Brown & Burton (1978); VanLehn (1990)

**Some Illustrative Student Models in Cognitive Diagnosis**

• Whole number subtraction:
  • ~200 production rules (VanLehn, 1990)
  • Can model at level of bugs (Brown & Burton) or at the level of impasses (VanLehn)
• John Anderson’s ITSs in algebra, LISP
  • ~1000 production rules
  • 1-10 in play at a given time
• Reverse-engineered large-scale tests
  • ~10-15 skills
• Mixed number subtraction (Tatsuoka)
  • ~5-15 production rules / skills

**Mixed number subtraction**

• Based on example from Prof. Kikumi Tatsuoka (1982).
  • Cognitive analysis & task design
  • Methods A & B
  • Overlapping sets of skills under methods
• Bayes nets described in Mislevy (1994):
  • Five “skills” required under Method B.
  • Conjunctive combination of skills
  • DINA stochastic model

• Skill 1: Basic fraction subtraction
• Skill 2: Simplify/Reduce
• Skill 3: Separate whole number from fraction
• Skill 4: Borrow from whole number
• Skill 5: Convert whole number to fractions

[Diagram: Toulmin structure of the cognitive diagnosis argument]

• C (top-level claim): Sue's configuration of production rules for operating in the domain (knowledge and skill) is K.
• W0 (warrant, "since"): Theory about how persons with configurations {K1, ..., Km} would be likely to respond to items with different salient features.
• C (class-level claims, "so"): Sue's probability of answering a Class 1 subtraction problem with borrowing is p1; ...; Sue's probability of answering a Class n subtraction problem with borrowing is pn.
• W (warrants, "since"): Sampling theory for items with feature set defining Class 1; ...; sampling theory for items with feature set defining Class n.
• D11j, ..., D1nj (data): Sue's answer to Item j, Class 1; ...; Sue's answer to Item j, Class n.
• D21j, ..., D2nj (data): structure and contents of Item j, Class 1; ...; structure and contents of Item j, Class n.

Successive slides annotate the diagram:

• The class-level claims are like behaviorist inference at the level of behavior in classes of structurally similar tasks.
• Structural patterns among behaviorist claims are data for inferences about unobservable production rules that govern behavior.
• This level distinguishes cognitive diagnosis from subscores.
• A typical (but not necessary) difference is that cognitive diagnosis has a many-to-many relationship between observable variables and student-model variables. As partitions, subscores have 1-1 relationships between scores and inferential targets.

**Structural and stochastic aspects of inferential models**

• Structural model relates student model variables ($\theta$s) to observable variables ($x$s)
  • Conjunctive, disjunctive, mixture
  • Complete vs incomplete (e.g., fusion model)
  • The Q matrix (next slide)
• Stochastic model addresses uncertainty
  • Rule based; logical with noise
  • Probability-based inference (discrete Bayes nets, extended IRT models)
  • Hybrid (e.g., Rule Space)

**The Q-matrix (Fischer, Tatsuoka)**

[Figure: Q-matrix of Items by Features]

• $q_{jk}$ is the extent to which Feature k pertains to Item j.
• Special case: 0/1 entries and a 1-1 relationship between features and student-model variables.

**Conjunctive structural relationship**

• Person i: $\theta_i = (\theta_{i1}, \theta_{i2}, \ldots, \theta_{iK})$
  • Each $\theta_{ik} = 1$ if the person possesses “skill” k, 0 if not.
• Task j: $q_j = (q_{j1}, q_{j2}, \ldots, q_{jK})$
  • Each $q_{jk} = 1$ if item j “requires skill k”, 0 if not.
• $I_{ij} = 1$ if ($q_{jk} = 1 \Rightarrow \theta_{ik} = 1$) for all k; $I_{ij} = 0$ if ($q_{jk} = 1$ but $\theta_{ik} = 0$) for any k.

**Conjunctive structural relationship: No stochastic model**

• $\Pr(x_{ij} = 1 \mid \theta_i, q_j) = I_{ij}$
• No uncertainty about $x$ given $\theta$.
• There is uncertainty about $\theta$ given $x$, even if no stochastic part, due to competing explanations (Falmagne): $x_{ij} \in \{0, 1\}$ just gives you a partitioning into all $\theta$s that cover $q_j$ vs. those that miss with respect to at least one skill.

**Conjunctive structural relationship: DINA stochastic model**

• Now there is uncertainty about $x$ given $\theta$:
  • $\Pr(x_{ij} = 1 \mid I_{ij} = 0) = p_{j0}$ (false positive)
  • $\Pr(x_{ij} = 1 \mid I_{ij} = 1) = p_{j1}$ (true positive)
• Likelihood over n items: [equation not preserved]
• Posterior: [equation not preserved]

**The particular challenge of competing explanations**

• Triangulation
  • Different combinations of data fail to support some alternative explanations of responses, and reinforce others.
  • Why was an item requiring Skills 1 & 2 wrong?
    • Missing Skill 1? Missing Skill 2? A slip?
    • Try items requiring 1 & 3, 2 & 4, 1 & 2 again.
• Degree to which design supports inferences
• Test design as experimental design

**Bayes net for mixed number subtraction (Method B)**

[Figure: Bayes net. Skill nodes: Basic fraction subtraction (Skill 1), Simplify/reduce (Skill 2), Separate whole number from fraction (Skill 3), Borrow from whole number (Skill 4), Convert whole number to fraction (Skill 5), and Mixed number skills. Item nodes are grouped by the skills they require: Skill 1 (Item 6: 6/7 - 4/7; Item 8: 2/3 - 2/3); Skills 1 & 2; Skills 1 & 3; Skills 1, 3, & 4; Skills 1, 2, 3, & 4; Skills 1, 3, 4, & 5; Skills 1, 2, 3, 4, & 5. The remaining items shown are Items 4, 7, 9, 10, 11, 12, 14, 15, 16, 17, 18, 19, and 20.]

Structural aspects: The logical conjunctive relationships among skills, and which sets of skills an item requires. The latter is determined by its $q_j$ vector.

Stochastic aspects, Part 1: Empirical relationships among skills in population (red).
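The conjunctive rule, the DINA stochastic model, and the triangulation question from the slides above can be sketched in a few lines of Python. This is a minimal illustration, not the model fitted in the talk. It assumes responses are conditionally independent given the skill profile, so that the likelihood over n items is $\prod_j p_{jI_{ij}}^{x_{ij}} (1 - p_{jI_{ij}})^{1 - x_{ij}}$, and it assumes a uniform prior over skill profiles. The function names, the four-skill setup, and the values $p_{j0} = 0.1$ and $p_{j1} = 0.9$ are all hypothetical.

```python
from itertools import product


def conjunctive_indicator(theta, q):
    """Return I_ij: 1 if the person has every skill the item requires, else 0."""
    return int(all(t == 1 for t, need in zip(theta, q) if need == 1))


def dina_likelihood(theta, Q, x, p0, p1):
    """Pr(x | theta), assuming responses are independent given theta."""
    like = 1.0
    for q_j, x_j, p_j0, p_j1 in zip(Q, x, p0, p1):
        # p_j1 applies when all required skills are present, p_j0 otherwise.
        p_correct = p_j1 if conjunctive_indicator(theta, q_j) else p_j0
        like *= p_correct if x_j == 1 else 1.0 - p_correct
    return like


def skill_posterior(Q, x, p0, p1):
    """Posterior over all 2^K skill profiles, assuming a uniform prior."""
    profiles = list(product((0, 1), repeat=len(Q[0])))
    joint = {th: dina_likelihood(th, Q, x, p0, p1) for th in profiles}
    total = sum(joint.values())
    return {th: v / total for th, v in joint.items()}


def mastery(posterior, k):
    """Marginal posterior probability that skill k (0-based) is possessed."""
    return sum(p for th, p in posterior.items() if th[k] == 1)


# Hypothetical illustration with K = 4 skills (columns are Skills 1..4).
# One wrong answer on an item requiring Skills 1 & 2:
one_item = skill_posterior(Q=[(1, 1, 0, 0)], x=[0], p0=[0.1], p1=[0.9])

# Triangulation: add items requiring 1 & 3 (right), 2 & 4 (wrong),
# and 1 & 2 again (wrong).
Q = [(1, 1, 0, 0), (1, 0, 1, 0), (0, 1, 0, 1), (1, 1, 0, 0)]
x = [0, 1, 0, 0]
four_items = skill_posterior(Q, x, p0=[0.1] * 4, p1=[0.9] * 4)

for label, post in (("one item", one_item), ("four items", four_items)):
    print(label, [round(mastery(post, k), 3) for k in range(4)])
```

With these made-up numbers, the single wrong answer leaves Skills 1 and 2 equally suspect. Adding the other three responses shifts the posterior toward Skill 2 being the missing one, which is the sense in which different combinations of data reinforce some explanations and undercut others.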
**Bayes net for mixed number subtraction (Method B)**

Stochastic aspects, Part 2: Measurement errors for each item (yellow).

**Bayes net for mixed number subtraction**

Probabilities before observations

**Bayes net for mixed number subtraction**

Probabilities after observations

**Bayes net for mixed number subtraction**

For mixture of strategies across people

**Extensions (1)**

• More general …
  • Student models (continuous vars, uses)
  • Observable variables (richer, times, multiple)
  • Structural relationships (e.g., disjuncts)
  • Stochastic relationships (e.g., NIDA, fusion)
  • Model-tracing temporary structures (VanLehn)

**Extensions (2)**

• Strategy use
  • Single strategy (as discussed above)
  • Mixture across people (Rost, Mislevy)
  • Mixtures within people (Huang: MV Rasch)
  • Huang’s example of last of these follows…

**What are the forces at the instant of impact?**

[Figure: truck and car, speeds labeled 20 mph and 20 mph]

• A. The truck exerts the same amount of force on the car as the car exerts on the truck.
• B. The car exerts more force on the truck than the truck exerts on the car.
• C. The truck exerts more force on the car than the car exerts on the truck.
• D. There’s no force because they both stop.

**What are the forces at the instant of impact?**

[Figure: truck and car, speeds labeled 10 mph and 20 mph]

• A. The truck exerts the same amount of force on the car as the car exerts on the truck.
• B. The car exerts more force on the truck than the truck exerts on the car.
• C. The truck exerts more force on the car than the car exerts on the truck.
• D. There’s no force because they both stop.

**What are the forces at the instant of impact?**

[Figure: truck and fly, speeds labeled 10 mph and 1 mph]

• A. The truck exerts the same amount of force on the fly as the fly exerts on the truck.
• B. The fly exerts more force on the truck than the truck exerts on the fly.
• C. The truck exerts more force on the fly than the fly exerts on the truck.
• D. There’s no force because they both stop.

**The Andersen/Rasch Multidimensional Model for m strategy categories**

[Equation not preserved; some symbol names in the definitions below were also lost.]

• p is an integer between 1 and m;
• [symbol lost] is the strategy person i uses for item j;
• [symbol lost] is the pth element in person i’s vector-valued parameter;
• [symbol lost] is the pth element in item j’s vector-valued parameter.

**Conclusion: The Importance of Coordination…**

• Among psychological model, task design, and analytic model
  • (KWSK “assessment triangle”)
• Tatsuoka’s work is exemplary in this respect:
  • Grounded in psychological analyses
  • Grainsize & character tuned to learning model
  • Test design tuned to instructional options

**Conclusion: The Importance of Coordination…**

• With purpose, constraints, resources
  • Lower expectations for retrofitting existing tests designed for different purposes, under different perspectives & warrants.
• Information & Communication Technology (ICT) project at ETS
  • Simulation-based tasks
  • Large scale
  • Forward design