
Meta-analysis and the Synthetic Approach



Presentation Transcript


  1. Meta-analysis and the Synthetic Approach Luke Plonsky Current Developments in Quantitative Research Methods Day 2

  2. Traditional Literature Reviews • What do they look like? • Think of a recent one you wrote: What was your process like? • What are their strengths? Weaknesses? (As we discuss the meta-analytic process, keep a topic or domain of yours in mind.)

  3. Meta-analysis as “the way forward”? (Rousseau, 2008, p. 9) Systematic, transparent, & quantitative means to • Summarize (all) previous studies (A → B; M × N) • Provide a quantitative indication of a relationship • Prevent over/under-interpreting results (Norris & Ortega, 2006; Rousseau, 2008) • Increase statistical power and generalizability across learners, contexts, L2 features, outcomes, etc. (Plonsky, 2012) • Examine relationships not visible in primary research (A on B when C vs. D) • Identify substantive and methodological trends, weaknesses, and gaps (Plonsky & Gass, 2011)

  4. Meta-analysis is here! Understanding/evaluating choices → advancing theory, research, and practice; benefits for authors: +visibility, +impact, +citations (Cooper & Hedges, 2009; see Norris & Ortega, 2010; Oswald & Plonsky, 2010)

  5. Judgment and Decision-Making (Norris & Ortega, 2007) vs. Art and Science (Oswald & McCloy, 2003): “There doesn’t seem to be a big role in this kind of work for much intelligent statistics, as opposed to much wise thought” (Wachter, 1990, p. 182).

  6. Four major stages (parallel to primary research) 1. Defining the domain / locating primary studies 2. Developing and implementing a coding scheme 3. (Meta-)Analysis 4. Interpreting meta-analytic results

  7. 1. DEFINING THE DOMAIN / LOCATING PRIMARY STUDIES

  8. 1. Defining the domain / locating primary studies: Methodological considerations • “Best evidence synthesis” (Eysenck, 1995) • Truscott (2007) – strict criteria (e.g., only “long-term” treatments) • vs. inclusiveness (preferred) (Norris & Ortega, 2006; Plonsky & Oswald, 2012) • Weaknesses mitigated by volume and assessed empirically (e.g., Russell & Spada, 2006) • Reliability reported? Yes, d = 0.65; No, d = 0.42 (Plonsky, 2011) • Control for bias? Tight, d = 0.51; Loose, d = 0.38 (Adesope et al., 2010) (Are there studies with certain methodological features that you would exclude?)

  9. 1. Defining the domain / locating primary studies: Publication status (& bias) • Exclude unpublished studies (e.g., Keck et al., 2006; Lyster & Saito, 2010; Mackey & Goo, 2007) • failsafe N (Abraham, 2008; Ross, 1998), though it lacks precision (e.g., Becker, 2005) • funnel plot (Li, 2010; Norris & Ortega, 2000; Plonsky, 2011) • Include unpublished studies (e.g., Li, 2010; Masgoret & Gardner, 2003; Won, 2008) • Compare published (g = 0.43) vs. unpublished (g = 0.56) (Taylor et al., 2006)
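
To make the failsafe-N idea concrete, here is a minimal sketch of Rosenthal's failsafe N, which estimates how many unretrieved null-result studies it would take to pull a combined (Stouffer) z below significance. The z values are hypothetical, and (per Becker, 2005, above) the index is a rough heuristic rather than a precise diagnostic.

```python
def failsafe_n(z_values, z_alpha=1.645):
    """Rosenthal's failsafe N: the number of unretrieved null-result
    studies (mean z = 0) needed to pull a Stouffer combined z below
    the one-tailed .05 cutoff (z_alpha = 1.645)."""
    k = len(z_values)
    z_sum = sum(z_values)
    return max(0.0, (z_sum ** 2) / (z_alpha ** 2) - k)

# Hypothetical z values from six primary studies
print(failsafe_n([2.1, 1.8, 2.5, 1.2, 3.0, 2.2]))  # ~54.5
```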

  10. 1. Defining the domain / locating primary studies: Substantive considerations • Broad: strategy instruction, all skills (Plonsky, 2011); multi-word instruction, all types (Han, in preparation) • Narrow (local): strategy instruction, reading only (Taylor et al., 2006); collocation instruction + technology (Nurmukhamedov, in preparation) (Would you describe your domain as relatively broad or more narrow? If narrow, what broader domain does yours belong to?)

  11. The Effectiveness of Bilingual Education • Willig (1985): K = 23; d = .63 • Rossell & Baker (1996): K = 72 (the “naysayers”; 228 studies deemed unacceptable); vote count: % of studies helpful (22%), no difference (45%), harmful (33%) • Greene (1998): K = 11; g = .18 (quasi-experiments) / .26 (experiments); no Canadian studies • Slavin & Cheung (2003): K = 42; “best-evidence synthesis”; no overall d; many subgroups • Roessingh (2004): K = 12; qualitative synthesis; HS learners only; Canadian focus • Rolstad, Mahoney, & Glass (2005): K = 17 (all post-Willig, 1985); dL2 = .23 (usually English); dL1 = .86 • Reljić (2011): K = 7; European studies only; d = ? • Strict or convenient quality criteria? (See also Rossell & Kuder’s [2005] meticulous critique and re-analysis of these studies.)

  12.–13. How effective is feedback? (Well, it depends…) [Figure: meta-analytic estimates for corrective feedback range from d = −.15 to d = 1.16; in one review, effects of CF were not calculated]

  14. 1. Defining the domain / locating primary studies: Search strategies a. Database searches (e.g., LLBA, ERIC, PsycInfo) (see In’nami & Koizumi, 2010; Plonsky & Brown, under review) b. Forward citations (Google/Scholar, Web of Science) (Plonsky, 2011) c. Manual journal searches (Keck et al., 2006; Plonsky & Gass, 2011) d. Textbooks and edited volumes e. Conference proceedings (15 in Lee et al., in press) f. Reference digging (‘ancestry’) g. Dissertations/theses (10 in Li, 2010; 19 in Lee et al., in press) h. Previous reviews (e.g., ARAL) i. Researchers’ websites, online bibliographies, listservs j. Contacting authors k. Others? l. All of the above

  15. 1. Defining the domain / locating primary studies: Search strategies (cont’d) Most L2 meta-analyses rely on a narrow range of search techniques; completeness + redundancy > incompleteness (Plonsky & Brown, under review)

  16. 2. CODING

  17. 2. Developing and implementing a coding scheme (the data collection instrument) Knowledge of… • Substantive issues, relevant models, variables • e.g., taxonomies of instruction, CF → moderators • e.g., What constitutes a multi-word unit? Collocation? (Han, in prep.; Nurmukhamedov, in prep.) → moderators • Research design(s) used • Pre-post? Control-experimental only? • Classroom/lab, FL/SL, correlational/experimental, length of treatment, researcher- or teacher-led, outcome measures… → more moderators • Methodological features (for analysis of study quality)

  18. 2. Developing and implementing a coding scheme Typically 5 different types of data are coded • Identification (year, author) • Sample and context (age, L1, L2, proficiency) • Design (pre-post/control-experimental, treatment features) • Outcome features (free response, constrained response) • Outcomes / effect sizes (r, d) • Coding scheme example: Lee, Jang, & Plonsky (in press) • Recommendations: • code variables numerically/categorically whenever possible • revise and add new variables as they emerge from coding (Which type of index would be most appropriate for your research/domain?) (What types of substantive and methodological features would you code for?)
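
As an illustration of the five data types above, here is a minimal sketch of what one coded study might look like as a structured record. All field names and example values are hypothetical, not a scheme from any of the cited meta-analyses.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class CodedStudy:
    # 1. Identification
    author: str
    year: int
    # 2. Sample and context
    n: int
    l1: str
    l2: str
    proficiency: str            # high-inference item; double-code it
    # 3. Design
    design: str                 # "pre-post" or "control-experimental"
    treatment_weeks: float
    # 4. Outcome features
    outcome_type: str           # "free response" or "constrained response"
    # 5. Outcomes / effect sizes
    d: Optional[float] = None   # None when SDs are missing from the report
    r: Optional[float] = None

study = CodedStudy(author="Hypothetical (2010)", year=2010, n=48,
                   l1="Korean", l2="English", proficiency="intermediate",
                   design="control-experimental", treatment_weeks=6.0,
                   outcome_type="constrained response", d=0.52)
```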

  19. 2. Developing and implementing a coding scheme (cont’d) Decisions about… • Interrater reliability • Especially for high-inference items (e.g., L2 proficiency; task-essentialness) • Percentage agreement; Cohen’s kappa • Missing data (e.g., SDs → VERY common: 31% in Plonsky & Gass, 2011) 1. Ignore/exclude (most common) 2. Impute (i.e., estimate) 3. Request (5/15 and 5/16 authors sent data in Plonsky, 2011, and Lee et al., in press, respectively)
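
A minimal sketch of the two interrater indices named above, percentage agreement and Cohen's kappa (the latter corrects agreement for chance). The two coders and their proficiency ratings are invented for illustration.

```python
from collections import Counter

def percent_agreement(ratings_a, ratings_b):
    return sum(a == b for a, b in zip(ratings_a, ratings_b)) / len(ratings_a)

def cohens_kappa(ratings_a, ratings_b):
    """kappa = (observed agreement - chance agreement) / (1 - chance agreement)"""
    n = len(ratings_a)
    p_o = percent_agreement(ratings_a, ratings_b)
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    # Chance agreement: both coders assign each category at their own base rates
    p_e = sum((counts_a[c] / n) * (counts_b[c] / n)
              for c in set(ratings_a) | set(ratings_b))
    return (p_o - p_e) / (1 - p_e)

# Two hypothetical coders rating L2 proficiency for ten studies
coder_a = ["low", "mid", "mid", "high", "low", "mid", "high", "mid", "low", "mid"]
coder_b = ["low", "mid", "high", "high", "low", "mid", "mid", "mid", "low", "low"]
print(percent_agreement(coder_a, coder_b))  # 0.70
print(cohens_kappa(coder_a, coder_b))       # ~0.53
```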

  20. 3. (META-)ANALYSIS

  21. 3. (Meta-)Analysis • Potentially very simple: Overall d = M(study1, study2, …) • Level of analysis (e.g., study? sample? within- vs. between-groups?) • Pre-post ESs generally larger than control-experimental ones • Weighting/adjusting ESs for quality, statistical artifacts • N (Norris & Ortega, 2000; Plonsky, 2011), inverse variance (Won, 2008) • “Schmidt & Hunter” corrections (Jeon & Yamashita, under review; Masgoret & Gardner, 2003) • Quality/control (e.g., random assignment, pretesting) • Example/template for ES weighting (N; inverse variance), sketched below
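
For the weighting template mentioned in the last bullet, here is a minimal fixed-effect sketch using the standard large-sample variance of d (e.g., Borenstein et al., 2009). The four (d, n_treatment, n_control) tuples are hypothetical.

```python
def var_d(d, n1, n2):
    """Approximate sampling variance of Cohen's d for two independent groups."""
    return (n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2))

def inverse_variance_mean(effects):
    """Fixed-effect weighted mean: each study is weighted by 1 / variance,
    so larger, more precise studies count for more."""
    weights = [1.0 / var_d(d, n1, n2) for d, n1, n2 in effects]
    return sum(w * d for w, (d, _, _) in zip(weights, effects)) / sum(weights)

# Hypothetical (d, n_treatment, n_control) from four primary studies
effects = [(0.45, 20, 20), (0.80, 35, 30), (0.10, 50, 50), (0.62, 15, 18)]
print(round(inverse_variance_mean(effects), 2))
```

The cruder alternative used in some of the L2 meta-analyses cited above is an N-weighted mean, which simply substitutes total sample size for 1/variance.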

  22. 3. (Meta-)Analysis Totally essential! (and awesome) • Overall / mean (d, r) • “adds as well as summarizes knowledge” (Hall et al., 1994, p. 24) • Moderator analyses (explain variance across studies): - Ross, 1998: listening; reading - Norris & Ortega, 2000: +explicitness; +constrained measures - Mackey & Goo, 2007: vocab > grammar - Li, 2010: labs > classrooms - Plonsky, 2011: longer treatments; fewer strategies; R & S - Lee et al., in press: instruction + feedback; longer treatments (Example of moderator analyses using SPSS; see also the sketch below)
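
The sketch below shows the basic shape of a moderator (subgroup) analysis: group coded studies by a moderator level and compare mean effects. The setting moderator and d values are hypothetical, and a full analysis would use weighted means and test between-group heterogeneity.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical coded studies: (moderator level, observed d)
coded = [("lab", 0.96), ("lab", 0.71), ("classroom", 0.44),
         ("classroom", 0.58), ("lab", 0.83), ("classroom", 0.39)]

by_setting = defaultdict(list)
for setting, d in coded:
    by_setting[setting].append(d)

# Unweighted subgroup means, echoing contrasts like Li (2010): labs > classrooms
for setting, ds in sorted(by_setting.items()):
    print(f"{setting}: mean d = {mean(ds):.2f} (k = {len(ds)})")
```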

  23. 3. (Meta-)Analysis: Treatment types as moderators [Figure: Plonsky, 2011]

  24. 3. (Meta-)Analysis: Outcome measures as moderators [Figure: Norris & Ortega, 2000]

  25. 3. (Meta-)Analysis: Multiple moderators [Figure: Spada & Tomita, 2010]

  26. 3. (Meta-)Analysis: Treatment length as a moderator [Figure: effect sizes by treatment length (short/medium/long/brief) in Lyster & Saito, 2010; Norris & Ortega, 2000; Jeon & Kaya, 2006]

  27. 3. (Meta-)Analysis More advanced (meta-)analytic techniques • Fixed vs. random effects modeling (see the sketch below) • Bayesian meta-analysis (see Ross, 2013) • Meta-regression • Meta-SEM (See Borenstein et al., 2009; Cooper, Hedges, & Valentine, 2009)
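
To make the fixed- vs. random-effects distinction concrete, here is a minimal sketch of the DerSimonian-Laird method-of-moments estimator described in sources such as Borenstein et al. (2009): it estimates between-study variance (tau²) from the heterogeneity statistic Q and folds it into the weights. The ds and variances are hypothetical.

```python
def dersimonian_laird(ds, vs):
    """Random-effects mean via the DerSimonian-Laird tau^2 estimate.
    ds: study effect sizes; vs: their within-study sampling variances."""
    w = [1.0 / v for v in vs]                                    # fixed-effect weights
    d_fixed = sum(wi * di for wi, di in zip(w, ds)) / sum(w)
    q = sum(wi * (di - d_fixed) ** 2 for wi, di in zip(w, ds))   # Cochran's Q
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(ds) - 1)) / c)                     # between-study variance
    w_re = [1.0 / (v + tau2) for v in vs]                        # random-effects weights
    d_random = sum(wi * di for wi, di in zip(w_re, ds)) / sum(w_re)
    return tau2, d_random

# Hypothetical effect sizes and sampling variances from four studies
print(dersimonian_laird([0.45, 0.80, 0.10, 0.62], [0.11, 0.07, 0.04, 0.13]))
```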

  28. 4. INTERPRETING RESULTS

  29. What do they mean anyway? How big is ‘big’? And how small is ‘small’? What does d = 0.50 (or 0.10, or 1.00…) mean? What implications do these effects have for future research, theory, and practice? [Figure: a scale from SMALL to BIG]

  30. 4. Interpreting findings (Plonsky & Oswald, under review) • General and field-specific benchmarks (Cohen, 1988; Plonsky & Oswald, under review) • Previous/similar meta-analyses in AL (e.g., Abraham, 2008; Lee et al., this colloquium; Mackey & Goo, 2007) • Meta-analyses in other fields (Plonsky, 2011) • SD units (Taylor et al., 2006) • Setting: lab vs. classroom (e.g., Li, 2010; Mackey & Goo, 2007) • Length/intensity, practicality (Lee & Huang, 2008; Lee et al., in press; Lyster & Saito, 2010; Norris & Ortega, 2000) • Study quality (Plonsky, 2011, 2013, in press; Plonsky & Gass, 2011)

  31. Cohen’s (1988) “t-shirt” effect sizes: d = 0.20 (small), d = 0.50 (medium), d = 0.80 (large) • ESs are best understood in relation to a particular discipline and, ideally, within a particular sub-domain of that discipline (e.g., Cohen, 1988; Valentine & Cooper, 2003)

  32. Does a given d mean the same across fields? Linguistics = economics = social work = …?

  33. d values across 77 L2 meta-analyses (1,733 studies, N = 452,000+; Plonsky & Oswald, under review): 1.00 ≈ large(ish); 0.70 ≈ medium(ish); 0.40 ≈ small(ish); overall M = 0.63

  34. d values across 236 primary L2 studies: 75th percentile = 1.07 (large-ish); 50th percentile = 0.71 (medium-ish); 25th percentile = 0.45 (small-ish)

  35. d values across 236 primary L2 studies, set against the meta-analytic scale: 1.07 (75th percentile, large-ish) vs. 1.00 ≈ large; 0.71 (50th percentile, medium-ish) vs. 0.70 ≈ medium; 0.45 (25th percentile, small-ish) vs. 0.40 ≈ small; M = 0.63
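
The field-specific “small-ish / medium-ish / large-ish” scale on slides 33–35 is just the 25th/50th/75th percentiles of observed effects, which is easy to reproduce for one's own domain. The d values below are hypothetical stand-ins for a harvested set.

```python
import statistics

# Hypothetical d values harvested from primary studies in one domain
ds = [0.12, 0.35, 0.44, 0.51, 0.63, 0.70, 0.72, 0.88, 1.05, 1.40]

q1, q2, q3 = statistics.quantiles(ds, n=4)  # 25th, 50th, 75th percentiles
print(f"small-ish  (25th percentile): {q1:.2f}")
print(f"medium-ish (50th percentile): {q2:.2f}")
print(f"large-ish  (75th percentile): {q3:.2f}")
```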

  36. Additional Considerations: Theoretical Maturity Example: d = 0.42, SD = 0.24, k = 46 [Figure: ES (d) plotted against year, moving from less to more fine-grained analyses]

  37. Additional Considerations: Methodological Maturity Example: d = 0.42, SD = 0.24, k = 46 [Figure: ES (d) plotted against year, moving from less to more refined methods and instruments]

  38. Additional Considerations: Theoretical & Methodological Maturity Example: d = 0.42, SD = 0.24, K = 92 [Figure: ES (d) plotted against year, crossing fine-grainedness of analyses with refinement of methods and instruments. Where is your study?]

  39. ESs Over Time (Plonsky & Gass, 2011) [Figure: average effect sizes (d) across three decades, plotted by decade]

  40. (Literal/Mathematical) SD Units • Example: d = 0.73; the average EG participant outscored the average CG participant by about three-quarters of a SD
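
The “SD units” reading follows directly from the definition of d (standard notation, not specific to this slide):

```latex
d = \frac{M_{E} - M_{C}}{SD_{pooled}}, \qquad
SD_{pooled} = \sqrt{\frac{(n_E - 1)\,SD_E^2 + (n_C - 1)\,SD_C^2}{n_E + n_C - 2}}
```

So d = 0.73 means M_E − M_C = 0.73 × SD_pooled: the experimental-group mean sits about three-quarters of a pooled standard deviation above the control-group mean.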

  41. Additional Considerations: Research Setting • Lab vs. classroom (Li, 2010; Mackey & Goo, 2007; Plonsky, 2011; Taylor et al., 2006) • Setting may change over time: L2 interaction (Plonsky & Gass, 2011) - 1980s ≈ 80% lab-based - 1990s–2000s ≈ 50/50 lab/classroom • FL vs. SL (Mackey & Goo, 2007)

  42. Additional Considerations: Length/Intensity of Treatment (Practicality?) [Figure: effect sizes by treatment length (short/medium/long/brief) in Lyster & Saito, 2010; Jeon & Kaya, 2006; Norris & Ortega, 2000]

  43. Additional Considerations: Manipulation of IVs (Practicality?) • Lee & Huang (2008) • The effect of input enhancement on L2 grammar learning: d = 0.22 • Numerically small, but practically large/significant?

  44. Additional Considerations: Publication Bias, Sample Sizes, & Sampling Error • Publication bias: the tendency to publish only studies with statistically significant (or theoretically appealing) findings (Rothstein, Sutton, & Borenstein, 2005; see Plonsky, 2013; Lee, Jang, & Plonsky, in press, for evidence of publication bias in L2 research) • Two related statistical artifacts: 1. Smaller Ns → more sampling error → more variance/distance from the population mean 2. Low instrument reliability → smaller observed effects
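
The two artifacts can be stated compactly with standard formulas (the sampling variance of d, e.g., Borenstein et al., 2009, and the Spearman/Schmidt-Hunter attenuation formula):

```latex
v_d \approx \frac{n_1 + n_2}{n_1 n_2} + \frac{d^2}{2(n_1 + n_2)}
\qquad\qquad
r_{\mathrm{observed}} = r_{\mathrm{true}} \sqrt{r_{xx}\, r_{yy}}
```

With n₁ = n₂ = 10 and d = 0.5, v_d ≈ 0.206 (SE ≈ 0.45); with n₁ = n₂ = 100, v_d ≈ 0.021 (SE ≈ 0.14), so small samples scatter much farther from the population mean. In the attenuation formula, r_xx and r_yy are the reliabilities of the two measures: the lower they are, the smaller the observed effect relative to the true one.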

  45. Challenges to meta-analysis What challenges might one encounter in conducting a meta-analysis in your target domain and/or generally? 1) Domain maturity • age, breadth, and depth of research • danger of premature closure 2) Poor reporting practices (SDs, ESs) • Missing data (K = 19 in Nekrasova & Becker, 2009; 22 in Plonsky, 2011) 3) Instrument reliability low or unreported • Reported in 6% of studies (Nekrasova & Becker, 2009) 4) Idiosyncratic/inconsistent research activity 5) Very few replications (see Polio & Gass, 1997; Porte, 2002, 2012)

  46. Challenges to meta-analysis (cont.) 6) Disagreement over definitions and operationalizations • e.g., noticing • Perhaps more “adversarial collaboration” is needed (see Tetlock & Mitchell, 2009) 7) Overreliance on individual studies (see Norris & Ortega, 2007) 8) Bias of primary (and secondary) researchers toward particular types of findings (e.g., in favor of/against theory X; p < .05) 9) Tradition of overreliance on NHST (see Schmidt & Hunter, 2002) • Crude • Uninformative • Unreliable

  47. A synthetic approach to primary research? • What might this look like generally and in terms of… • Research agendas? • Reporting practices and interpretations of findings? • Researcher training? • Journal calls and acceptance policies?

  48. Conclusion: Judgment and decision-making play a major role in all meta-analyses. Understanding the choices → more appropriate execution and interpretation of meta-analytic findings → more precise advances in theory, more efficient L2 research, and more accurately informed practice

  49. Further Reading • Synthesizing research on language learning and teaching (Norris & Ortega, 2006) • Research synthesis and meta-analysis: A step-by-step approach (Cooper, 2010) • Practical meta-analysis (Lipsey & Wilson, 2001)

  50. Connections to Other Topics to be Discussed this Week • NHST, effect sizes (MONDAY) • Study Quality (WEDNESDAY) • Replication (THURSDAY) • Reporting practices (FRIDAY)
