
Educational Research



Presentation Transcript


  1. Educational Research Chapter 5 Selecting Measuring Instruments Gay, Mills, and Airasian

  2. Topics Discussed in this Chapter • Data collection • Measuring instruments • Terminology • Interpreting data • Types of instruments • Technical issues • Validity • Reliability • Selection of a test

  3. Data Collection • Scientific inquiry requires the collection, analysis, and interpretation of data • Data – the pieces of information that are collected to examine the research topic • Issues related to the collection of this information are the focus of this chapter

  4. Data Collection • Terminology related to data • Constructs – abstractions that cannot be observed directly but are helpful when trying to explain behavior • Intelligence • Teacher effectiveness • Self concept Obj. 1.1 & 1.2

  5. Data Collection • Data terminology (continued) • Operational definition – the ways by which constructs are observed and measured • Wechsler IQ test • Virgilio Teacher Effectiveness Inventory • Tennessee Self-Concept Scale • Variable – a construct that has been operationalized and has two or more values Obj. 1.1 & 1.2

  6. Data Collection • Measurement scales • Nominal – categories • Gender, ethnicity, etc. • Ordinal – ordered categories • Rank in class, order of finish, etc. • Interval – equal intervals • Test scores, attitude scores, etc. • Ratio – absolute zero • Time, height, weight, etc. Obj. 2.1

  7. Data Collection • Types of variables • Categorical or quantitative • Categorical variables reflect nominal scales and measure the presence of different qualities (e.g., gender, ethnicity, etc.) • Quantitative variables reflect ordinal, interval, or ratio scales and measure different quantities of a variable (e.g., test scores, self-esteem scores, etc.) Obj. 2.2

  8. Data Collection • Types of variables • Independent or dependent • Independent variables are purported causes • Dependent variables are purported effects • Two instructional strategies, co-operative groups and traditional lectures, were used during a three week social studies unit. Students’ exam scores were analyzed for differences between the groups. • The independent variable is the instructional approach (of which there are two levels) • The dependent variable is the students’ achievement Obj. 2.3

  9. Measurement Instruments • Important terms • Instrument – a tool used to collect data • Test – a formal, systematic procedure for gathering information • Assessment – the general process of collecting, synthesizing, and interpreting information • Measurement – the process of quantifying or scoring a subject’s performance Obj. 3.1 & 3.2

  10. Measurement Instruments • Important terms (continued) • Cognitive tests – examining subjects’ thoughts and thought processes • Affective tests – examining subjects’ feelings, interests, attitudes, beliefs, etc. • Standardized tests – tests that are administered, scored, and interpreted in a consistent manner Obj. 3.1

  11. Measurement Instruments • Important terms (continued) • Selected response item format – respondents select answers from a set of alternatives • Multiple choice • True-false • Matching • Supply response item format – respondents construct answers • Short answer • Completion • Essay Obj. 3.3 & 11.3

  12. Measurement Instruments • Important terms (continued) • Individual tests – tests administered on an individual basis • Group tests – tests administered to a group of subjects at the same time • Performance assessments – assessments that focus on processes or products that have been created Obj. 3.6

  13. Measurement Instruments • Interpreting data • Raw scores – the actual score made on a test • Standard scores – statistical transformations of raw scores • Percentiles (1 – 99) • Stanines (1 – 9) • Normal Curve Equivalents (1 – 99) Obj. 3.4
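
The score conversions on this slide can be sketched numerically. A minimal Python example, assuming a small hypothetical set of raw scores and the usual definitions (z = (raw − mean) / SD; stanine = 2z + 5, rounded and clamped to 1–9; NCE = 50 + 21.06z). The function name is ours, for illustration only:

```python
import statistics

def standard_scores(raw_scores):
    """Convert raw scores to z-scores, stanines (1-9), and
    Normal Curve Equivalents (NCE: mean 50, SD 21.06)."""
    mean = statistics.mean(raw_scores)
    sd = statistics.pstdev(raw_scores)  # SD of this norm group
    out = {}
    for x in raw_scores:
        z = (x - mean) / sd
        stanine = min(9, max(1, round(z * 2 + 5)))  # clamp to 1..9
        nce = 50 + 21.06 * z
        out[x] = {"z": round(z, 2), "stanine": stanine, "nce": round(nce, 1)}
    return out

# Hypothetical raw scores for one group of test takers
scores = [60, 70, 75, 80, 90]
table = standard_scores(scores)
```

A raw score at the group mean lands, as expected, at z = 0, stanine 5, and NCE 50 under every transformation.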

  14. Measurement Instruments • Interpreting data (continued) • Norm-referenced – scores are interpreted relative to the scores of others taking the test • Criterion-referenced – scores are interpreted relative to a predetermined level of performance • Self-referenced – scores are interpreted relative to changes over time Obj. 3.5

  15. Measurement Instruments • Types of instruments • Cognitive – measuring intellectual processes such as thinking, memorizing, problem solving, analyzing, or reasoning • Achievement – measuring what students already know • Aptitude – measuring general mental ability, usually for predicting future performance Obj. 4.1 & 4.2

  16. Measurement Instruments • Types of instruments (continued) • Affective – assessing individuals’ feelings, values, attitudes, beliefs, etc. • Typical affective characteristics of interest • Values – deeply held beliefs about ideas, persons, or objects • Attitudes – dispositions that are favorable or unfavorable toward things • Interests – inclinations to seek out or participate in particular activities, objects, ideas, etc. • Personality – characteristics that represent a person’s typical behaviors Obj. 4.1 & 4.5

  17. Measurement Instruments • Types of instruments (continued) • Affective (continued) • Scales used for responding to items on affective tests • Likert • Positive or negative statements to which subjects respond on scales such as strongly disagree, disagree, neutral, agree, or strongly agree • Semantic differential • Bipolar adjectives (i.e., two opposite adjectives) with a scale between each adjective • Dislike: ___ ___ ___ ___ ___ :Like • Rating scales – numerical ratings indicating how strongly a subject exhibits the trait of interest Obj. 5.1

  18. Measurement Instruments • Types of instruments (continued) • Affective (continued) • Scales used for responding to items on affective tests (continued) • Thurstone – statements related to the trait of interest to which subjects agree or disagree • Guttman – statements representing a uni-dimensional trait Obj. 5.1

  19. Measurement Instruments • Issues for cognitive, aptitude, or affective tests • Problems inherent in the use of self-report measures • Bias – distortions of a respondent’s performance or responses based on ethnicity, race, gender, language, etc. • Responses to affective test items • Socially acceptable responses • Accuracy of responses • Response sets • Alternatives include the use of projective tests Obj. 4.3, 4.4

  20. Technical Issues • Two concerns • Validity • Reliability

  21. Technical Issues • Validity – extent to which interpretations made from a test score are appropriate • Characteristics • The most important technical characteristic • Situation specific • Does not refer to the instrument but to the interpretations of scores on the instrument • Best thought of in terms of degree Obj. 6.1 & 7.1

  22. Technical Issues • Validity (continued) • Four types • Content – to what extent does the test cover the content area it is intended to measure • Item validity • Sampling validity • Determined by expert judgment Obj. 7.1 & 7.2

  23. Technical Issues • Validity (continued) • Criterion-related • Predictive – to what extent does the test predict a future performance • Concurrent – to what extent does the test relate to a performance measured at the same time • Estimated by a correlation between the test and the criterion measure • Construct – the extent to which a test measures the construct it represents • Complicated by the underlying difficulty of defining constructs • Estimated in many ways Obj. 7.1, 7.3, & 7.4

  24. Technical Issues • Validity (continued) • Consequential – to what extent does use of the test produce harmful consequences • Estimated by empirical evidence and expert judgment • Factors affecting validity • Unclear test directions • Confusing and ambiguous test items • Vocabulary that is too difficult for test takers Obj. 7.1, 7.5, & 7.7

  25. Technical Issues • Factors affecting validity (continued) • Overly difficult and complex sentence structure • Inconsistent and subjective scoring • Untaught items • Failure to follow standardized administration procedures • Cheating by the participants or someone teaching to the test items Obj. 7.7

  26. Technical Issues • Reliability – the degree to which a test consistently measures whatever it is measuring • Characteristics • Expressed as a coefficient ranging from 0 to 1 • A necessary but not sufficient characteristic of a test Obj. 6.1, 8.1, & 8.7

  27. Technical Issues • Reliability (continued) • Six reliability coefficients • Stability – consistency over time with the same instrument • Test-retest • Estimated by a correlation between the two administrations of the same test • Equivalence – consistency with two parallel tests administered at the same time • Parallel forms • Estimated by a correlation between the parallel tests Obj. 8.1, 8.2, 8.3, & 8.7

  28. Technical Issues • Reliability (continued) • Six reliability coefficients (continued) • Equivalence and stability – consistency over time with parallel forms of the test • Combines attributes of stability and equivalence • Estimated by a correlation between the parallel forms • Internal consistency – consistency of items within a single administration of the test, estimated by artificially splitting the test into halves or by item-level formulas • Several coefficients – split halves, KR-20, KR-21, Cronbach alpha • All coefficients provide estimates ranging from 0 to 1 Obj. 8.1, 8.4, 8.5, & 8.7
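
Cronbach's alpha, one of the internal-consistency coefficients named above, can be computed directly from item-level scores with the standard formula α = (k / (k − 1)) · (1 − Σ item variances / total-score variance). A minimal sketch with a hypothetical four-item test answered by five respondents (the function name is ours):

```python
from statistics import pvariance

def cronbach_alpha(item_scores):
    """Cronbach's alpha from one list of scores per item,
    where each list holds the same respondents in order."""
    k = len(item_scores)
    item_vars = sum(pvariance(item) for item in item_scores)
    totals = [sum(person) for person in zip(*item_scores)]  # total score per respondent
    total_var = pvariance(totals)
    return (k / (k - 1)) * (1 - item_vars / total_var)

# Hypothetical 4-item test, 5 respondents (rows = items)
items = [
    [3, 4, 5, 2, 4],
    [3, 5, 5, 1, 4],
    [2, 4, 4, 2, 3],
    [3, 5, 4, 2, 4],
]
alpha = cronbach_alpha(items)
```

Like the other coefficients on this slide, the result falls between 0 and 1, with higher values indicating more internally consistent items.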

  29. Technical Issues • Reliability (continued) • Six reliability coefficients (continued) • Scorer/rater – consistency of observations between raters • Inter-judge – two observers • Intra-judge – one judge over two occasions • Estimated by percent agreement between observations Obj. 8.1, 8.6, & 8.7
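
The percent-agreement estimate mentioned for scorer/rater reliability is straightforward to compute. A sketch with hypothetical codes that two observers assigned to the same ten classroom events:

```python
def percent_agreement(rater_a, rater_b):
    """Inter-judge reliability as simple percent agreement."""
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100 * matches / len(rater_a)

# Hypothetical on-task/off-task codes from two observers
a = ["on", "off", "on", "on", "off", "on", "on", "off", "on", "on"]
b = ["on", "off", "on", "off", "off", "on", "on", "on", "on", "on"]
pct = percent_agreement(a, b)  # the raters match on 8 of 10 events
```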

  30. Technical Issues • Reliability (continued) • Six reliability coefficients (continued) • Standard error of measurement (SEM) – an estimate of how much difference there is between a person’s obtained score and his or her true score • Computed from the test’s standard deviation and its reliability coefficient (e.g., KR-20, Cronbach alpha, etc.) • Used to specify an interval, rather than a point estimate, around a person’s obtained score Obj. 8.1, 8.7, & 9.1
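
The SEM described here is conventionally computed as SEM = SD × √(1 − r), where SD is the test's standard deviation and r its reliability coefficient. A small sketch with hypothetical test figures:

```python
import math

def sem(sd, reliability):
    """Standard error of measurement: SD * sqrt(1 - reliability)."""
    return sd * math.sqrt(1 - reliability)

# Hypothetical test: score SD of 15 points, KR-20 reliability of .91
error = sem(15, 0.91)  # about 4.5 score points

# Interval interpretation: a person's true score falls within
# one SEM of the obtained score roughly 68% of the time
obtained = 104
band = (obtained - error, obtained + error)
```

Note how the interval shrinks as reliability rises: a perfectly reliable test (r = 1) would have an SEM of zero.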

  31. Selection of a Test • Sources of test information • Mental Measurements Yearbooks (MMY) • The reviews in MMY are most easily accessed through your university library and the services to which they subscribe (e.g., EBSCO) • Provides factual information on all known tests • Provides objective test reviews • Comprehensive bibliography for specific tests • Indices: titles, acronyms, subject, publishers, developers • Published by the Buros Institute Obj. 10.1 & 12.1

  32. Selection of a Test • Sources (continued) • Tests in Print • Tests in Print is published by the Buros Institute • The reviews in it are most easily accessed through your university library and the services to which they subscribe (e.g., EBSCO) • Bibliography of all known commercially produced tests currently available • Very useful for determining availability Obj. 10.1 & 12.1

  33. Selection of a Test • Sources (continued) • ETS Test Collection • Published and unpublished tests • Includes test title, author, publication date, target population, publisher, and description of purpose • Annotated bibliographies on achievement, aptitude, attitude and interests, personality, sensory motor, special populations, vocational/occupational, and miscellaneous Obj. 10.1 & 12.1

  34. Selection of a Test • Sources (continued) • Professional journals • Test publishers and distributors • Issues to consider when selecting tests • Psychometric properties • Validity • Reliability • Length of test • Scoring and score interpretation Obj. 10.1, 11.1, & 12.1

  35. Selection of a Test • Issues to consider when selecting tests • Non-psychometric issues • Cost • Administrative time • Objections to content by parents or others • Duplication of testing Obj. 11.1

  36. Selection of a Test • Designing your own tests • Get help from others with experience in developing tests • Item writing guidelines • Avoid ambiguous and confusing wording and sentence structure • Use appropriate vocabulary • Write items that have only one correct answer • Give information about the nature of the desired answer • Do not provide clues to the correct answer • See Writing Multiple Choice Items Obj. 11.2

  37. Selection of a Test • Test administration guidelines • Plan ahead • Be certain that there is consistency across testing sessions • Be familiar with any and all procedures necessary to administer a test Obj. 11.4
