
Mathematical Model for the Law of Comparative Judgment in Print Sample Evaluation



Presentation Transcript


  1. Mathematical Model for the Law of Comparative Judgment in Print Sample Evaluation. Mai Zhou, Dept. of Statistics, University of Kentucky; Luke C. Cui, Lexmark International Inc.

  2. The Problem: When evaluating several print samples, pair-wise comparison experiments are often used. Two print samples at a time are judged by a human subject to determine which print sample is “better”. This is repeated with different pairs and different subjects. The resulting data will look like:

       /   5   4  37   6
       /   /   7  45  28
       /   /   /  46  40
       /   /   /   /   4
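One way to hold such a table in code is a mapping from each pair to a count. The sketch below is only illustrative. It assumes that entry (i, k) is the number of times sample i was preferred over sample k, and that every pair was judged 50 times; the slide states neither, and the names `wins` and `N_JUDGMENTS` are hypothetical.

```python
# Upper-triangular counts from the slide, keyed by (i, k) with i < k
# (samples are numbered from 0 here).
# ASSUMPTION: wins[(i, k)] = number of times sample i was preferred over k.
wins = {
    (0, 1): 5, (0, 2): 4, (0, 3): 37, (0, 4): 6,
    (1, 2): 7, (1, 3): 45, (1, 4): 28,
    (2, 3): 46, (2, 4): 40,
    (3, 4): 4,
}

# ASSUMPTION: each pair was judged this many times (not given on the slide).
N_JUDGMENTS = 50


def preference_proportions(wins, n):
    """Observed proportion of judgments favouring i over k, per pair."""
    return {pair: w / n for pair, w in wins.items()}


def total_wins(wins, n, n_samples):
    """Total number of comparisons won by each sample across all pairs."""
    totals = [0] * n_samples
    for (i, k), w in wins.items():
        totals[i] += w        # times i beat k
        totals[k] += n - w    # times k beat i
    return totals


props = preference_proportions(wins, N_JUDGMENTS)
totals = total_wins(wins, N_JUDGMENTS, 5)
```

Under these assumptions the raw win totals already suggest a rough ordering of the samples, which the model-based analysis then refines and attaches a margin of error to.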

  3. How to summarize the data; order the print samples in terms of “strength”; state the margin of error in the analysis/conclusion; and predict the outcome of future comparisons.

  4. Outline of talk: • Introduction to the Thurstone/Mosteller model • New model, theoretical formulation: Var-Cor modeling, maximum likelihood estimation, likelihood ratio confidence interval • New model, application to experimental data • Comparisons with the classical model: how good is the fit? • Discussion

  5. For pairwise comparisons of stimuli i and k, the observable outcomes are the signs of $X_i - X_k$, and the outcomes from different pairs are independent (but within the pair, $X_i$ and $X_k$ may or may not be independent). Assume $X_i - X_k \sim N(\mu_i - \mu_k, \sigma_{ik}^2)$, where N( , ) denotes the normal distribution.

  6. If we observe the outcomes of many pairs, the log likelihood function is $\sum_{i<k} [ W_{ik} \log p_{ik} + L_{ik} \log(1 - p_{ik}) ]$, where $p_{ik} = \Phi((\mu_i - \mu_k)/\sigma_{ik})$ and $\Phi$ is the cdf of the standard normal distribution (available in many software packages).

  7. Here $W_{ik}$ (or $L_{ik}$) is the number of times stimulus i is deemed better (or worse) than stimulus k in the pair-wise comparisons. The classical model assumes equal variances, $\sigma_{ik} = \sigma$ for every pair. The new model we propose instead lets the variance $\sigma_{ik}^2$ depend on the pair.
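The log likelihood can be written down directly in code. The following is a minimal sketch for the equal-variance (classical) case, assuming the preference probability for a pair is the standard normal cdf of the scaled strength difference and that every pair is judged the same number of times `n`. The function names and the three-sample counts are hypothetical.

```python
import math


def norm_cdf(x):
    """Standard normal cdf, written with erfc so the tails stay accurate."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def llk(mu, wins, n, sigma=1.0):
    """Log likelihood of the equal-variance (classical) model.

    mu    : list of strengths, one per sample
    wins  : dict mapping (i, k) to the number of times i beat k
    n     : judgments per pair (ASSUMED equal for all pairs)
    sigma : common standard deviation of the difference (fixes the scale)
    """
    total = 0.0
    for (i, k), w in wins.items():
        d = (mu[i] - mu[k]) / sigma
        # P(i beats k) = Phi(d); P(k beats i) = Phi(-d)
        total += w * math.log(norm_cdf(d)) + (n - w) * math.log(norm_cdf(-d))
    return total


# Hypothetical counts for three samples, 50 judgments per pair.
example_wins = {(0, 1): 10, (0, 2): 5, (1, 2): 20}
flat = llk([0.0, 0.0, 0.0], example_wins, 50)      # all strengths equal
better = llk([0.0, 0.8, 1.2], example_wins, 50)    # strengths that track the counts
```

Only differences of strengths enter the likelihood, so one strength (here the first) is usually pinned at zero before maximizing.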

  8. This is because the human perceptual process is highly adaptive and is at its best when used as a null tester, i.e., it is more sensitive for closely matched stimuli. • Thus the variances should be related to how closely the strengths are matched, e.g.
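The transcript does not preserve the variance formula from this slide. To show how a pair-dependent variance enters the preference probability, here is a purely hypothetical rule, not the authors' model: the standard deviation grows with the gap between the two strengths, controlled by a single extra parameter `c`. This is one possible reading of "more sensitive for closely matched stimuli".

```python
import math


def norm_cdf(x):
    """Standard normal cdf."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def sigma_pair(mu_i, mu_k, c):
    """HYPOTHETICAL variance rule: spread grows with the strength gap.

    c = 0 recovers the equal-variance classical model.
    """
    return math.sqrt(1.0 + c * (mu_i - mu_k) ** 2)


def prob_prefer(mu_i, mu_k, c=0.0):
    """P(i is judged better than k) under the hypothetical rule."""
    return norm_cdf((mu_i - mu_k) / sigma_pair(mu_i, mu_k, c))


classical = prob_prefer(2.0, 0.0, c=0.0)   # Phi(2)
adjusted = prob_prefer(2.0, 0.0, c=0.5)    # Phi(2 / sqrt(3))
# For a closely matched pair the two models nearly coincide.
close_pair_shift = abs(prob_prefer(0.1, 0.0, 0.5) - prob_prefer(0.1, 0.0, 0.0))
```

With a rule of this shape, well-separated pairs are predicted less extremely than in the classical model, while closely matched pairs are almost unchanged.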

  9. Computation: Use software such as Splus (commercial), R (GNU), Mathcad (commercial), Matlab (commercial), or SAS (commercial). 1. Define the log likelihood function llk() as a function of the parameters.

  10. 2. Maximize llk(), or equivalently minimize the negative of llk(), using the optimization functions supplied. In R the optimization functions are nlm() and optim(). In SAS/IML we could use the function nlptr().
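The slide relies on packaged optimizers. As a dependency-free stand-in, the sketch below maximizes the equal-variance log likelihood by coordinate ascent with a golden-section line search. This is a swapped-in technique, not what nlm(), optim(), or nlptr() do internally. The data are the counts from slide 2 under the assumptions that entry (i, k) counts preferences for i over k and that each pair was judged 50 times; the first strength is pinned at 0 and sigma at 1 for identifiability.

```python
import math


def norm_cdf(x):
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def llk(mu, wins, n):
    """Equal-variance log likelihood with sigma fixed at 1."""
    total = 0.0
    for (i, k), w in wins.items():
        d = mu[i] - mu[k]
        total += w * math.log(norm_cdf(d)) + (n - w) * math.log(norm_cdf(-d))
    return total


def golden_max(f, lo, hi, tol=1e-6):
    """Maximize a unimodal function f on [lo, hi] by golden-section search."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        x1 = b - ratio * (b - a)
        x2 = a + ratio * (b - a)
        if f(x1) < f(x2):
            a = x1      # the maximum lies to the right of x1
        else:
            b = x2      # the maximum lies to the left of x2
    return 0.5 * (a + b)


def fit(wins, n, n_samples, sweeps=50):
    """Coordinate ascent over mu[1:], with mu[0] pinned at 0."""
    mu = [0.0] * n_samples
    for _ in range(sweeps):
        for j in range(1, n_samples):
            def along_j(x, j=j):
                return llk(mu[:j] + [x] + mu[j + 1:], wins, n)
            mu[j] = golden_max(along_j, -6.0, 6.0)
    return mu


# ASSUMED interpretation of the slide-2 table (see lead-in).
wins = {
    (0, 1): 5, (0, 2): 4, (0, 3): 37, (0, 4): 6,
    (1, 2): 7, (1, 3): 45, (1, 4): 28,
    (2, 3): 46, (2, 4): 40,
    (3, 4): 4,
}
mu_hat = fit(wins, 50, 5)
max1 = llk(mu_hat, wins, 50)
```

Coordinate ascent is adequate here because this log likelihood is concave in the strengths; for the larger nine-sample problem a quasi-Newton routine such as those named on the slide would normally be preferred.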

  11. The parameter values that achieve the maximum (max1) are the estimates of the parameters. • A confidence interval for a parameter can be obtained by temporarily fixing the value of that parameter at a trial value and maximizing over the remaining parameters. Suppose this achieves the maximum value max2. • Those trial values for which max1 - max2 < 3.84/2 form the 95% confidence interval for the parameter (3.84 is the 0.95 quantile of the chi-square distribution with one degree of freedom).
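The likelihood ratio interval can be sketched as a grid scan. The code below fixes one strength at each trial value, re-maximizes over the others, and keeps the trial values with max1 - max2 < 3.84/2. It uses the equal-variance model, a simple coordinate-ascent maximizer in place of a packaged optimizer, and the slide-2 counts under the assumed reading (entry (i, k) counts preferences for i over k, 50 judgments per pair). The grid width and step are arbitrary choices.

```python
import math


def norm_cdf(x):
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def llk(mu, wins, n):
    total = 0.0
    for (i, k), w in wins.items():
        d = mu[i] - mu[k]
        total += w * math.log(norm_cdf(d)) + (n - w) * math.log(norm_cdf(-d))
    return total


def golden_max(f, lo, hi, tol=1e-5):
    """Golden-section search for the maximizer of a unimodal f on [lo, hi]."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        x1, x2 = b - ratio * (b - a), a + ratio * (b - a)
        if f(x1) < f(x2):
            a = x1
        else:
            b = x2
    return 0.5 * (a + b)


def maximize(wins, n, n_samples, fixed, sweeps=20):
    """Coordinate ascent over the strengths not listed in `fixed`.

    fixed : dict {index: value}; index 0 is always pinned at 0.
    Returns the strengths and the maximized log likelihood.
    """
    mu = [0.0] * n_samples
    for j, value in fixed.items():
        mu[j] = value
    free = [j for j in range(1, n_samples) if j not in fixed]
    for _ in range(sweeps):
        for j in free:
            def along_j(x, j=j):
                return llk(mu[:j] + [x] + mu[j + 1:], wins, n)
            mu[j] = golden_max(along_j, -6.0, 6.0)
    return mu, llk(mu, wins, n)


def profile_interval(wins, n, n_samples, j, half_width=1.5, step=0.1):
    """Grid-based 95% likelihood ratio interval for strength j."""
    mu_hat, max1 = maximize(wins, n, n_samples, fixed={})
    steps = int(round(half_width / step))
    inside = []
    for s in range(-steps, steps + 1):
        value = mu_hat[j] + s * step
        _, max2 = maximize(wins, n, n_samples, fixed={j: value})
        if max1 - max2 < 3.84 / 2.0:
            inside.append(value)
    return mu_hat[j], min(inside), max(inside)


# ASSUMED interpretation of the slide-2 table (see lead-in).
wins = {
    (0, 1): 5, (0, 2): 4, (0, 3): 37, (0, 4): 6,
    (1, 2): 7, (1, 3): 45, (1, 4): 28,
    (2, 3): 46, (2, 4): 40,
    (3, 4): 4,
}
estimate, lower, upper = profile_interval(wins, 50, 5, j=2)
```

The endpoints are only accurate to the grid step; a root finder on max1 - max2 - 1.92 would sharpen them.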

  12. Example: Colorfulness data • Nine print samples were compared. • Pairwise experiment, 50 subjects

  13. Models fitted are: 1. Classic model with equal variances. 2. New model

  14. Models fitted are: 2. New model

  15. Differences: (predicted – observed) Model 1

  16. Differences: (predicted – observed) Model 2, with one more parameter.

  17. Differences: (predicted – observed) Model 1 vs 2
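The tables behind these slides are not preserved in the transcript. As an illustration of the quantity being compared, the sketch below computes predicted minus observed win counts for each pair under the equal-variance model. The counts and strengths are hypothetical and are not the colorfulness data.

```python
import math


def norm_cdf(x):
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def differences(mu, wins, n):
    """Predicted minus observed win counts for each pair (equal-variance model)."""
    out = {}
    for (i, k), observed in wins.items():
        predicted = n * norm_cdf(mu[i] - mu[k])
        out[(i, k)] = predicted - observed
    return out


# HYPOTHETICAL counts (50 judgments per pair) and fitted strengths.
wins = {(0, 1): 10, (0, 2): 5, (1, 2): 20}
mu = [0.0, 0.8, 1.2]

diffs = differences(mu, wins, 50)
# Pair with the largest absolute misfit.
worst = max(diffs, key=lambda pair: abs(diffs[pair]))
```

Comparing the size and pattern of these differences across two fitted models is one simple way to judge which model fits better.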

  18. We also fit the Bradley-Terry model to the data (using SAS); the fit is similar to that of the classic model.
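For readers without SAS, a Bradley-Terry fit can be sketched with the classical fixed-point (MM) iteration. This is a stand-in for the SAS procedure used in the talk, not a reproduction of it. It assumes a complete design with the same number of judgments `n` for every pair, and it is run here on the slide-2 counts under the assumed reading (entry (i, k) counts preferences for i over k, 50 judgments per pair).

```python
def fit_bradley_terry(wins, n, n_samples, iterations=500):
    """MM iteration for Bradley-Terry strengths p.

    Model: P(i beats k) = p[i] / (p[i] + p[k]).
    ASSUMES every pair was compared n times and every sample has
    at least one win and one loss, so the estimates are finite.
    """
    total = [0.0] * n_samples
    for (i, k), w in wins.items():
        total[i] += w
        total[k] += n - w
    p = [1.0] * n_samples
    for _ in range(iterations):
        new_p = []
        for i in range(n_samples):
            denom = sum(n / (p[i] + p[k]) for k in range(n_samples) if k != i)
            new_p.append(total[i] / denom)
        scale = sum(new_p)
        p = [x / scale for x in new_p]   # normalize so strengths sum to 1
    return p


# ASSUMED interpretation of the slide-2 table (see lead-in).
wins = {
    (0, 1): 5, (0, 2): 4, (0, 3): 37, (0, 4): 6,
    (1, 2): 7, (1, 3): 45, (1, 4): 28,
    (2, 3): 46, (2, 4): 40,
    (3, 4): 4,
}
strengths = fit_bradley_terry(wins, 50, 5)
ranking = sorted(range(5), key=lambda i: strengths[i], reverse=True)
```

The Bradley-Terry model replaces the normal cdf with a logistic form, which is why its fit tends to track the equal-variance classical model closely.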

  19. References. Acknowledgements: We would like to thank Dr. Shaun Love at Lexmark International Inc. for helpful discussions. • 1. Engeldrum, P. G. Psychometric Scaling: A Toolkit for Imaging Systems Development, Imcotek Press. (2000) • 2. Torgerson, W. S. Theory and Methods of Scaling, John Wiley & Sons, Inc. (1958) • 3. Bradley, R. A. and Terry, M. E. "Rank analysis of incomplete block designs. I. The method of paired comparisons." Biometrika 39, 324-345. (1952) • 4. Hall, P. and La Scala, B. "Methodology and algorithms of empirical likelihood." International Statistical Review 58, 109-127. (1990) R: http://cran.us.r-project.org Updated manuscript: http://www.ms.uky.edu/~mai/research/
