
Theme 10 Evaluation

Theme 10 Evaluation. In this theme we discuss in detail the topic “evaluation”. This is a comprehensive and complex theme. Therefore, during this session, we discuss only the first part of the overall theme.





Presentation Transcript


  1. Theme 10: Evaluation

  2. In this theme we discuss in detail the topic “evaluation”. This is a comprehensive and complex theme. Therefore, during this session, we discuss only the first part of the overall theme.

  3. In relation to “evaluation”, we will discuss three main themes and related subthemes:
  • Definition: measuring, valuing, scoring
  • Quality criteria: validity, reliability, authenticity, recency
  • Trends in evaluation (dimensions): aggregation level, functions of evaluation, who is responsible, when to evaluate, evaluation techniques

  4. We start with a focus on the first main theme, about the definition and the concept of evaluation.

  5. Evaluation: the concept. Defining the concept evaluation is a difficult issue, since the concept itself only emphasizes one aspect of what evaluation fully embraces, namely “giving a value” to what is being observed. As we will see, it also does not help to replace the concept by other popular concepts, such as “assessment”. Again, only one particular aspect of the whole process is being emphasized.

  6. Evaluation: the concept. Read the following description of evaluation: “Evaluation is the entire process of collecting, analysing and interpreting information about potentially every aspect of an instructional activity, with the aim of giving conclusions about the efficacy, efficiency and/or any other impact” (Thorpe, 1988). You can observe that evaluation is a comprehensive process that can be related to potentially every element in our educational frame of reference.

  7. Evaluation: the concept. In the literature, an important distinction is made between evaluation and assessment.
  • Assessment or “measuring” refers to the process of collecting and analysing information (Burke, 1999 and Feden & Vogel, 2004).
  • Evaluation refers to, as stated earlier, adding a value to what has been collected and analysed, in view of coming to a conclusion about the efficacy, efficiency or any other impact.

  8. Evaluation: the concept. But in the literature, an even more detailed distinction is made between:
  • Measuring/testing: collecting information
  • Evaluating/valuing: what is this information worth?
  • Scoring/grading: depending on that worth, what score do we give?
  It is essential to distinguish these three approaches. One can measure without valuing or scoring. And one cannot score without collecting and valuing information.

  9. We now move to the second main theme, which centers on quality criteria.

  10. Evaluation: quality requirements. Prior to a discussion of recent developments in the field of evaluation, we first deal with some critical quality requirements that are central in discussions about evaluation:
  • Validity
  • Reliability
  • Authenticity
  • Recency

  11. Validity. Validity refers to the extent to which the content of what is being measured, valued and scored is related to the initial evaluation objective. Typical questions that are raised in this context are:
  • What if we only measure geometry, when we want to come to conclusions about mathematics performance in primary school?
  • What if we only get questions from chapter 5 during an exam?
  • What if we only ask memorization questions in a test when we also worked in the laboratory and solved chemistry problems?

  12. Reliability. Reliability refers to the extent to which our measurement is stable. Typical questions raised are:
  • If I repeat the same test tomorrow, will I get the same results (stability)?
  • Is there a large difference in the ability to solve the different questions about the same topic (internal consistency)?
  • If someone else measured, valued and scored the test, would he/she end up with the same results?
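
Not part of the original slides: the first two questions (stability and internal consistency) are commonly quantified as a test-retest correlation and as Cronbach's alpha. A minimal Python sketch, with invented scores, of how these figures are computed:

```python
import numpy as np

def test_retest_stability(scores_day1, scores_day2):
    """Pearson correlation between two administrations of the same test."""
    return np.corrcoef(scores_day1, scores_day2)[0, 1]

def cronbach_alpha(item_scores):
    """Cronbach's alpha for a (learners x items) matrix of item scores."""
    item_scores = np.asarray(item_scores, dtype=float)
    k = item_scores.shape[1]                          # number of items
    item_vars = item_scores.var(axis=0, ddof=1)       # variance per item
    total_var = item_scores.sum(axis=1).var(ddof=1)   # variance of total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Invented scores for five learners
day1 = [12, 15, 9, 18, 14]
day2 = [11, 16, 10, 17, 13]
items = [[3, 4, 5], [5, 5, 5], [2, 3, 4], [5, 4, 5], [4, 4, 3]]

print(f"test-retest r = {test_retest_stability(day1, day2):.2f}")
print(f"Cronbach's alpha = {cronbach_alpha(items):.2f}")
```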

  13. Authenticity. Authenticity refers to the extent to which the information we gather mirrors reality in a relevant, adequate and authentic way. Examples of related questions:
  • Is it sufficient to ask student nurses to give injections on a doll to evaluate their injection skills?
  • Is it adequate to give a flying licence to someone who was only tested in a flight simulator?
  • Is it sufficient to say that someone is able to “teach” after evaluating his/her capacities with small group teaching?

  14. Recency. Recency questions the “date” at which information has been collected, valued or scored in view of evaluation:
  • Can we accept credits obtained 5 years ago from someone who asks to be exempted from courses in a new study programme?
  • Can we hire a young house mother who got her degree 10 years ago?
  • Are Basic Life Support skills mastered six months ago still relevant today for an active first aid officer?

  15. From here on, we move to the third main theme in this session, about recent developments in evaluation. Five subthemes are discussed.

  16. Recent developments in evaluation. Recent developments in evaluation can be clustered along five dimensions:
  • At what aggregation level is the evaluation being set up?
  • What are the functions/roles of the evaluation?
  • Who carries out the evaluation?
  • When is the evaluation being set up?
  • What evaluation techniques are being adopted?
  We discuss some examples in relation to each dimension.

  17. Dimension 1: aggregation levels. Firstly, we observe that evolutions in evaluation are related to the aggregation levels in our educational frame of reference:
  • Micro level
  • Meso level
  • Macro level
  We look, in relation to each aggregation level, at particular new developments.

  18. Dimension 1: aggregation levels. At each aggregation level, the same elements re-appear. Evaluation can be related to every element in the educational frame of reference:
  • Responsible for the instruction
  • Learner
  • Learning activities
  • Organisation
  • Context
  • Instructional activities (objectives, learning content, media, didactical strategies, evaluation)

  19. Micro level. Example 1: evaluation of the extent to which the learning objectives have been attained. Example 2: evaluation of didactical strategies.

  20. Micro level: evaluation of learning objectives. During evaluation we measure the behaviour, we value the behaviour and we give a score. The question is: “What is the basis for giving a certain value?”
  • Based on a criterion? Criterion-referenced assessment.
  • Based on a norm, e.g., the group mean? Norm-referenced assessment.
  • Based on the earlier performance of the learner? Ipsative or self-referenced assessment.

  21. Micro level: evaluation of learning objectives. Example: athletics, 15-year-olds have to run 100 metres.
  • Criterion-referenced assessment: every performance is compared to an a priori stated criterion, e.g., less than 15 seconds.
  • Norm-referenced assessment: every performance is compared to the classroom mean (imagine you are in a class with fast runners).
  • Ipsative or self-referenced assessment: every performance is compared to the earlier performance of the individual learner; the emphasis is on progress.
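
As an illustration (not from the slides), the three referencing approaches can be contrasted in a few lines of Python; the names, times and criterion below are invented:

```python
# Invented 100 m times (seconds) for one class; "anna" is the learner we value.
times = {"anna": 15.8, "ben": 13.9, "chris": 14.2, "dina": 14.6}
annas_previous_time = 16.5

criterion = 15.0  # a priori criterion: run 100 m in under 15 seconds

# Criterion-referenced: compare against the fixed criterion.
meets_criterion = times["anna"] < criterion              # False: 15.8 >= 15.0

# Norm-referenced: compare against the group (classroom) mean.
class_mean = sum(times.values()) / len(times)            # 14.625 s
faster_than_norm = times["anna"] < class_mean            # False: slower than the class

# Ipsative / self-referenced: compare against the learner's own earlier performance.
improved = times["anna"] < annas_previous_time           # True: 0.7 s faster than before

print(meets_criterion, faster_than_norm, improved)
```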

  22. Micro level: evaluation of instructional strategies. Hattie (2009) discusses instructional activities in his meta-analysis. These analyses look at whether different instructional strategies have a differential impact on learners. Do they matter? In the following example you see that the didactical strategy “homework” has an average “effect size” of d = .29. This is far below the benchmark of d = .40.
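
Not part of the original slides: the effect sizes summarized by Hattie are standardized mean differences, so a hedged sketch of how such a d is computed (Cohen's d with a pooled standard deviation) may clarify what the d = .40 benchmark refers to; the scores below are invented:

```python
import math

def cohens_d(group_a, group_b):
    """Standardized mean difference between two groups (Cohen's d)."""
    m_a = sum(group_a) / len(group_a)
    m_b = sum(group_b) / len(group_b)
    var_a = sum((x - m_a) ** 2 for x in group_a) / (len(group_a) - 1)
    var_b = sum((x - m_b) ** 2 for x in group_b) / (len(group_b) - 1)
    # Pooled standard deviation of the two groups
    pooled_sd = math.sqrt(((len(group_a) - 1) * var_a + (len(group_b) - 1) * var_b)
                          / (len(group_a) + len(group_b) - 2))
    return (m_a - m_b) / pooled_sd

# Invented test scores: a class that did homework vs. one that did not.
with_homework = [62, 70, 68, 75, 66]
without_homework = [61, 69, 66, 74, 65]

d = cohens_d(with_homework, without_homework)
print(f"d = {d:.2f} -> {'above' if d >= 0.40 else 'below'} the d = .40 benchmark")
```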

  23. Meso level: evaluation at school level

  24. Meso level: evaluation at school level.
  • Recent developments at the school level look at whether “schools” have added value; this means an additional value that results in better learning performance.
  • But can we simply compare schools with one another? Does this not lead to simple rankings, as depicted in this journal?

  25. Meso level: evaluation at school level.
  • One cannot simply compare schools.
  • Calder (1994) puts forward, in this context, the CIPP model to consider everything in balance:
  • Context evaluation: the geographical position of a school, the available budget, the legal base, etc.
  • Input evaluation: what the school actually uses as resources, its programme, its policies, the number and type of staff members, etc.
  • Process evaluation: the way a programme is implemented, the strategies being used, the evaluation approach, the professional development of the staff, etc.
  • Product evaluation: the effects, such as goal attainment, throughput, return on investment, etc.

  26. Meso level: evaluation at school level.
  • Comparing schools with the CIPP model can as such imply that:
  • A school with a lot of migrants outperforms a school with dominantly upper-class children.
  • A school can be good in attaining certain goals, but can be less qualified in attaining other goals.
  • A school can be criticized as to its policies.
  • One will consider the geographical location of a school when discussing results (e.g., an unsafe neighbourhood).
  • We will also look at what the learners do later when they go to another school (e.g., success at university).
  • Schools are assessed by the inspection on the basis of the CIPP model.

  27. Meso level: evaluation at school level. The inspection reports are public.

  28. Macro level: school effectiveness

  29. Macro level: school effectiveness. Read the following description: “The aim of school effectiveness research is to describe and explain the differences between schools on the basis of specific criteria. This research explores the differences in performance on the basis of differences in those responsible for teaching, the learners, the classes, the school.” You can see that, as in the CIPP model, explanations are sought at the level of all schools in the educational system.

  30. Macro level: school effectiveness. This development started from very critical reports as to the value-added of schools:
  • Coleman report (1966, chapter 1): “Schools have little effect on students’ achievement that is independent of their family background and social context.”
  • Plowden report (1967, p. 35): “Differences between parents will explain more of the variation in children than differences between schools. (…) Parental factors, in fact, accounted for 58% of the variance in student achievement in this study.”
  • Schools want, in contrast to these reports, proof that they make a difference and contribute to learner performance.

  31. Macro level: school effectiveness. A central critique on the Coleman and Plowden reports is that they neglect the complex interplay that helps to explain differences between schools; see the CIPP model. Instead of simply administering tests and comparing results, we have to look, next to “product effects”, at the processes and variables that are linked to these results. This is labelled with the concept of performance indicators.

  32. Macro level: performance indicators.
  • Performance indicators are: “statistical data, numbers, costs or any other information that measures and clarifies the outcomes of an institution in line with preset goals.”
  • You can notice that the emphasis in performance indicators is on the description and explanation of differences in performance.
  • One of the best known performance indicator studies is the three-yearly PISA study: the Programme for International Student Assessment. E.g., in PISA 2006, the performance of schools in 54 countries was compared.

  33. Macro level: performance indicators. Results of PISA 2006 show, for example, the high performance of Flemish schools for sciences, mathematics, and reading literacy.

  34. PISA results are not only described, they are also explained. In this graphic, one sees how the PISA results are associated with the socio-economic status (SES) of the learners. The higher the status, the higher the results. SES is determined by the educational level of the parents, their income, their possession of cultural goods (e.g., books), etc.
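
As an aside (not from the slides), such an association is typically summarized as a correlation or a regression slope between an SES index and test scores. A minimal sketch with invented data:

```python
import numpy as np

# Invented data: an SES index (composite of parental education, income,
# cultural possessions) and PISA-like scores for eight learners.
ses = np.array([-1.2, -0.8, -0.3, 0.0, 0.4, 0.7, 1.1, 1.5])
score = np.array([410, 435, 455, 470, 500, 505, 540, 560])

r = np.corrcoef(ses, score)[0, 1]             # Pearson correlation
slope, intercept = np.polyfit(ses, score, 1)  # simple regression line

print(f"r = {r:.2f}; fitted line: score = {intercept:.0f} + {slope:.0f} * SES")
```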

  35. Dimension 2: Functions of evaluation

  36. Dimension 2: Functions of evaluation. Why do we evaluate? There might be different reasons:
  • Formative evaluation: to see where one is in the learning process and how we can redirect the learning process.
  • Summative evaluation: to determine the final attainment of the goals.
  • Prediction function: to predict future performance (e.g., success in higher education).
  • Selection function: to see whether one is fit for a job or task.

  37. Dimension 2: Functions of evaluation. Abroad, there is a lot of attention for the selection function; see the emphasis on entrance exams. In this example, one sees a lucky candidate (and his mother) who succeeded in the entrance exam for a Chinese university.

  38. Dimension 2: Functions of evaluation. Earlier, there was a major emphasis on summative evaluation. Nowadays this emphasis has shifted towards formative evaluation. Why?
  • Does one learn from evaluative feedback? This is also called consequential validity.
  • Do the evaluation results not imply that the teacher has to redirect the instruction, the support, the learning materials, etc.?
  • Does a learner already reach a preliminary attainment level?

  39. Dimension 3: Who is responsible?

  40. Dimension 3: Who is responsible? Traditionally, the teacher is responsible for the evaluation. But there are new developments:
  • The learner him/herself carries out the evaluation: self-assessment.
  • The learner and peers carry out the evaluation together: peer assessment.
  • An external responsible person carries out the evaluation (e.g., another teacher).
  • An external company carries out the evaluation: assessment centres.
  • …

  41. Dimension 3: Who is responsible? New development: self-assessment.
  • Self-assessment is seen as a type of evaluation that aims at fostering the learning process (assessment-as-learning): a formative evaluation function.
  • Two main steps are to be taken:
  • Initial training to develop criteria and tools, and to discuss the value of what is being measured.
  • Next, usage of the tools/instruments and developing a personal opinion. Scoring is not an issue here.
  • Very useful technique: rubrics (see further).

  42. Dimension 3: Who is responsible? Assessment centres: an external company carries out the evaluation, mostly with a selection function.
  • “A standardized procedure to assess complex behaviour on the basis of multiple information sources. The behaviour is assessed in simulated contexts. Multiple persons evaluate and come to a shared vision.”
  • Different evaluators are involved and guarantee a 360° approach to the evaluation.
  • This technique fulfils the selection function, e.g., when screening candidates for a job.

  43. Dimension 4: When to evaluate?

  44. Dimension 4: When to evaluate? There is a shift in the moment at which the evaluation is being set up: towards “prior to” and “during” the learning process, serving a formative evaluation function:
  • Prior: prior knowledge testing
  • During: progress testing, portfolio evaluation
  • After: final evaluation

  45. Dimension 5: What technique?

  46. Dimension 5: What technique? Next to traditional evaluation tests with multiple choice questions, open answer questions, fill-in questions, sort questions, … we observe a series of new techniques. Examples:
  • Rubrics: attention is paid to criteria and indicators.
  • Portfolios: a file with letters, information, illustrations, products, … as the information base for the evaluation.
  • …

  47. Dimension 5: Technique rubrics. Rubrics:
  • Define clear criteria: a concrete element of a complex learning objective that is being measured, valued and scored.
  • Determine for each criterion a number of quality indicators: indicators exemplify the level at which a certain criterion is being met, answered, attained.
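
Not part of the original slides: a rubric can be thought of as a set of criteria, each with ordered quality indicators, against which observed work is valued and scored. A minimal Python sketch; the criteria and indicator wordings are invented, only the “mixing colours” topic comes from the next slide:

```python
# Hypothetical rubric for the "mixing colours" example: each criterion maps to
# ordered quality indicators (first entry = lowest level).
mixing_colours_rubric = {
    "chooses primary colours": [
        "picks colours at random",
        "picks two primary colours with help",
        "picks the right primary colours independently",
    ],
    "mixes to the target colour": [
        "the mixed colour does not resemble the target",
        "the mixed colour approaches the target",
        "the mixed colour matches the target",
    ],
}

def score_with_rubric(rubric, observed_levels):
    """Sum the attained level (1-based) per criterion; higher is better."""
    return sum(levels.index(observed_levels[criterion]) + 1
               for criterion, levels in rubric.items())

observed = {
    "chooses primary colours": "picks the right primary colours independently",
    "mixes to the target colour": "the mixed colour approaches the target",
}
print(score_with_rubric(mixing_colours_rubric, observed))  # 3 + 2 = 5
```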

  48. Dimension 5: Technique rubrics. Example rubric: “mixing colours”, a table of criteria by performance indicators. In next steps of the learning process, we can add criteria and/or performance indicators to the rubric.

  49. Dimension 5: Technique rubrics. Example rubric: “Writing a historical fiction story”.

  50. Dimension 5: Technique portfolio. Read this description of a portfolio: A portfolio is a file with letters, information, illustrations, products, … that is used as an information base for the evaluation.
