Model Selection and Occam's Razor: A Two-Edged Sword
Christopher S. Campbell, Michael M. Cohen, Tony Rodriguez, and Dominic W. Massaro
Perceptual Science Laboratory, University of California, Santa Cruz

"Everything should be as simple as possible, but not simpler." - Albert Einstein

Abstract
A Bayes method was proposed as an improved type of model selection (Myung & Pitt, 1997) because it putatively penalizes complex models (such as the FLMP) that presumably capture idiosyncratic variance in the data. By penalizing these complex models, simpler models (such as the LIM) could be compared to complex models on an even playing field. When tested for model recovery, however, the Bayes method was no better than goodness of fit (RMSD) for our prototypical design (the expanded factorial). The Bayes method also showed a bias toward the simpler model with noisy data. We concluded that the RMSD provides a good measure of model performance without requiring fairly expensive computational resources.

Bayes Method
Bayesian Model Selection attempts to take model complexity into account when evaluating the appropriateness of a model. Complex models, because they are sensitive to small changes in parameter values, are penalized in favor of simpler models. As shown in Figure 1, the Bayes method accomplishes this by evaluating the marginal likelihood for a given model across all possible parameters. Although a complex model might fit the data better over a narrow range of parameter values, the Bayes method strives for an overall goodness of fit averaged over all possible parameter values. In Figure 1, the cumulative likelihood function is greater for the simple model than for the complex model, so the simple model is determined to be the better model. This overall goodness of fit is calculated as the marginal likelihood, defined as the likelihood of the data averaged over the prior distribution of the model's parameters:

P(D | M) = ∫ P(D | θ, M) p(θ | M) dθ

(A Monte Carlo sketch of this calculation appears below.)

Figure 1. Maximum likelihood functions for complex and simple models. Taken from Myung and Pitt (1997, Figure 4).

1. Replication and Extension
The Bayes method was used to extend the analysis of Myung and Pitt (1997) using the same 2 x 8 factorial design and the same three data sets. We tested the linear model (LIM) and the fuzzy logical model (FLMP) and replaced the TSD model with a weighted-averaging model (WTAV), because previous experiments contrasted the FLMP with the WTAV. Two additional data sets (Set 4 and Set 5) were chosen to provide a broader range of response patterns.

Method
• Each set of data was generated from 10 parameters (2 + 8), except for the weighted-averaging model, which had an additional weight parameter (.2426). Figure 2 shows the five sets of response patterns for the three models.
• Monte Carlo simulation was used to generate 100 subjects, assuming 20 independent trials for each of the (2 x 8) = 16 conditions (see the simulation sketch below).
• The FLMP, LIM, and WTAV models were fit to the five simulated data sets using both RMSD and the Bayes method. The Bayes method was calculated with equal prior probabilities and evaluated with 500,000 iterations.

Results
• The results agreed with Myung and Pitt (1997) in showing better model recovery with the Bayes method than with RMSD for the 2 x 8 factorial design (the Bayes method achieved 10 hits and 2 false alarms vs. 4 hits and 2 false alarms for RMSD). However, the results were highly dependent on the parameter values chosen: the Bayes method favored the WTAV for the Set 4 data and favored the FLMP for the Set 5 data.
• In addition, because similar parameter values can yield very different response patterns for different models, it is not always prudent to compare models that use the same parameter values. Instead, parameter values obtained from actual experiments should be used, because these provide more comparable response patterns across the different models.
• We extended these results by showing that data sets can be created so that any model can fit the data generated by another model.

Figure 2. Noiseless response patterns generated from five different sets of parameters with three models: FLMP, LIM, and WTAV.
Table 1. Summary of marginal likelihood approximations and percentage of model wins. Note: Highlighted cells indicate that the model gave a significantly better fit of the simulated results (p < .05).
Table 2. Summary of RMSD and percentage of model wins. Note: Highlighted cells indicate that the model gave a significantly better fit of the simulated results (p < .05).

2. Extension to 5 x 5 Expanded Factorial Design
We evaluated the RMSD and Bayes methods of model selection for our prototypical design, namely the symmetrical expanded factorial design, which has been used in many recent experiments.

Method
• One set of data was calculated from 10 parameters (5 + 5) taken from the average parameter values of 82 subjects (Massaro, 1998).
• Monte Carlo simulation was used to generate 100 simulated subjects for the average subject, assuming 24 independent trials for each of the 35 conditions.
• The FLMP and WTAV models were fit to the simulated data set using both RMSD and the Bayes method. The Bayes method was calculated with equal prior probabilities and evaluated with 500,000 iterations.

Results
• Both the Bayes method and the RMSD method recover the correct model for the symmetrical, expanded factorial design (see Table 3).
• In general, the RMSD method is comparable to the Bayes method for powerful designs (symmetrical and expanded factorial).

Table 3. Summary of RMSD, marginal likelihood approximations, and percentage of model wins. Note: Highlighted cells indicate that the model gave a significantly better fit of the simulated results (p < .05).

3. Model Selection with Noisy Data
The robustness of the model selection methods was evaluated under conditions of increasing noise for the 5 x 5 expanded factorial design. Six levels of Gaussian noise were added to the outcomes (see the noise sketch below).

Results
• Figure 3 shows that both methods gave poorer model fits with increasing noise.
• Contrary to previous claims, the FLMP does not have a magical ability to describe noisy results.
• In fact, the Bayes method appears to be biased toward the simpler model as the data sets become noisier.

Figure 3. Log marginal likelihood (top panel) and RMSD (bottom panel) as a function of seven levels of Gaussian noise.

Conclusion
We extended the Myung and Pitt (1997) comparison of Bayesian Model Selection (BMS) and root mean square deviation (RMSD) methods of model selection. Their analysis failed to take into account 1) the overall goodness of fit, as opposed to simply which model gave the better description of the results, 2) whether a model gave a statistically better fit than another model for a group of participants, and 3) the fact that data sets generated from different models produce very different data configurations and are therefore not directly comparable. Our re-evaluation of the BMS and RMSD methods in symmetrical expanded factorial designs refuted the conclusions of Myung and Pitt (1997) by showing that 1) neither BMS nor RMSD is an infallible method of model selection, 2) RMSD and BMS both show that the WTAV fits FLMP data about as well as the FLMP fits WTAV data, and 3) BMS is biased toward less complex models in the presence of noise, whereas RMSD is not. We concluded that RMSD provides nearly as good a measure of fit as BMS without requiring unsubstantiated assumptions in model testing or fairly expensive computational resources.
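To make the simulation procedure of Section 1 concrete, here is a minimal Python sketch of the three integration models and of the Monte Carlo generation of simulated subjects. The model equations are the standard forms of the FLMP, LIM, and weighted-averaging models; the parameter values, function names, and array layout are illustrative assumptions rather than the authors' actual code.

```python
import numpy as np

def flmp(a, v):
    """Fuzzy Logical Model of Perception: multiplicative integration of the
    auditory (a) and visual (v) feature values, normalized over both alternatives."""
    num = a[:, None] * v[None, :]
    return num / (num + (1 - a[:, None]) * (1 - v[None, :]))

def lim(a, v):
    """Linear Integration Model: the simple average of the two sources."""
    return (a[:, None] + v[None, :]) / 2.0

def wtav(a, v, w=0.2426):
    """Weighted-averaging model, with one additional weight parameter."""
    return w * a[:, None] + (1 - w) * v[None, :]

def simulate_subjects(true_probs, n_subjects=100, n_trials=20, rng=None):
    """Draw binomial response counts for every cell of the design and
    return them as response proportions (n_subjects x design shape)."""
    rng = np.random.default_rng() if rng is None else rng
    counts = rng.binomial(n_trials, true_probs,
                          size=(n_subjects,) + true_probs.shape)
    return counts / n_trials

# Hypothetical parameter set for the 2 x 8 factorial design
# (2 auditory + 8 visual parameters; not the values used on the poster).
a = np.array([0.15, 0.85])
v = np.linspace(0.1, 0.9, 8)
data_flmp = simulate_subjects(flmp(a, v))   # 100 subjects x 2 x 8 proportions
```

The same `simulate_subjects` routine applies to the 5 x 5 expanded factorial design of Section 2, with 24 trials per condition and a 35-cell prediction array.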
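The two selection criteria compared on the poster can be sketched in the same style: RMSD between a model's predictions and a subject's observed proportions, and a Monte Carlo approximation of the marginal likelihood that averages the binomial likelihood of the observed counts over parameters sampled from a uniform prior. The uniform prior, the binomial likelihood, and the flattened-parameter interface are assumptions made for this sketch; the poster only states that equal prior probabilities and 500,000 iterations were used.

```python
import numpy as np
from scipy.stats import binom
from scipy.special import logsumexp

def rmsd(predicted, observed):
    """Root mean square deviation between predicted and observed proportions."""
    return np.sqrt(np.mean((predicted - observed) ** 2))

def log_marginal_likelihood(model, observed_counts, n_trials,
                            n_params, n_samples=500_000, rng=None):
    """Monte Carlo approximation of the marginal likelihood
    P(D | M) = integral of P(D | theta, M) p(theta | M) dtheta,
    with an independent uniform prior on [0, 1] for each parameter.
    `model` maps a parameter vector to predicted response probabilities
    with the same shape as `observed_counts`."""
    rng = np.random.default_rng() if rng is None else rng
    log_liks = np.empty(n_samples)
    for s in range(n_samples):
        theta = rng.uniform(0.0, 1.0, size=n_params)       # sample from the prior
        p = np.clip(model(theta), 1e-6, 1 - 1e-6)           # predicted probabilities
        log_liks[s] = binom.logpmf(observed_counts, n_trials, p).sum()
    # log of the average likelihood across prior samples
    return logsumexp(log_liks) - np.log(n_samples)

# Example (hypothetical): evaluate the FLMP from the previous sketch on one
# simulated subject's response counts; theta holds the 2 auditory followed by
# the 8 visual parameters.
# flmp_flat = lambda theta: flmp(theta[:2], theta[2:])
# log_ml = log_marginal_likelihood(flmp_flat, counts_for_one_subject, 20, 10)
```

The unvectorized loop is easy to read but slow at 500,000 samples; a vectorized or importance-sampled version would be preferable in practice, and the comparison between models then rests on which obtains the higher log marginal likelihood (or the lower fitted RMSD).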

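The noise manipulation of Section 3 can also be sketched briefly. The assumption here is that zero-mean Gaussian noise is added directly to the simulated response proportions and the result is clipped back into [0, 1]; the specific noise magnitudes are hypothetical, since the poster states only that six levels of Gaussian noise were added to the outcomes.

```python
import numpy as np

def add_gaussian_noise(proportions, sigma, rng=None):
    """Add zero-mean Gaussian noise to simulated response proportions and
    clip the result back into the valid [0, 1] range."""
    rng = np.random.default_rng() if rng is None else rng
    noisy = proportions + rng.normal(0.0, sigma, size=proportions.shape)
    return np.clip(noisy, 0.0, 1.0)

# Hypothetical noise levels (standard deviations of the added Gaussian noise).
noise_levels = [0.02, 0.05, 0.10, 0.15, 0.20, 0.25]
```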