Prediction and Change Detection Study in Non-Stationary Data Sequences

Prediction and Change Detection • Mark Steyvers, Scott Brown, Mike Yi • University of California, Irvine • This work is supported by a grant from the US Air Force Office of Scientific Research (AFOSR grant number FA9550-04-1-0317)

Overview • Prediction in non-stationary time-series data • statistical properties changing over time • Example: stock market, traffic, weather • Accurate prediction requires detection of change • How well can people predict future outcomes? • What are the individual differences?

Previous Work • Much work on perception of stationary random sequences: • Gambler’s fallacy • Hot hand belief (e.g. Gilovich et al.) • Shows how people often perceive changes in arguably stationary sequences: overfitting

Our Approach • Non-stationary random sequences: • Distribution changes over time at random points • Allows for perception of • too little structure: underfitting • too much structure: overfitting

Basic Task • Given a sequence of random numbers, predict the next one

Experiment 1 • Where will the next blue square arrive on the right side? • 12 possible locations, numbered 1 to 12

Experiment 1 • 15 blocks of 100 trials • 21 subjects: all get same sequence • Window shows history of 30 trials • Each trial is subject initiated • Points are given for correct or near-to-correct predictions.

Sequence Generation • Locations are drawn from a binomial distribution of size 11, with probability of success θ drawn from [0,1]. • At each time step there is a 10% chance that θ changes to a new random value in [0,1]. • Example sequence (figure): successive segments over time with θ=.12, θ=.95, θ=.46, θ=.42, θ=.92, θ=.36
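The generation procedure on this slide can be sketched in a few lines of Python. This is an illustrative sketch, not the experiment code: it assumes θ is drawn uniformly from [0,1] (the slide only says "drawn from [0,1]"), and it assumes the binomial outcome 0..11 is shifted by one to give locations 1..12. The function name `generate_sequence` and its parameters are hypothetical.

```python
import random

def generate_sequence(n_trials=100, n=11, change_prob=0.10, seed=None):
    """Return (locations, thetas) for one block of a non-stationary sequence."""
    rng = random.Random(seed)
    theta = rng.random()  # assumption: theta ~ Uniform[0, 1]
    locations, thetas = [], []
    for _ in range(n_trials):
        # With probability change_prob, theta jumps to a fresh random value.
        if rng.random() < change_prob:
            theta = rng.random()
        # Binomial(n, theta) draw as a sum of n Bernoulli trials: value in 0..n.
        k = sum(rng.random() < theta for _ in range(n))
        locations.append(k + 1)  # assumption: shift 0..11 to locations 1..12
        thetas.append(theta)
    return locations, thetas
```

Calling `generate_sequence(100, seed=1)` gives one block of 100 locations together with the θ value in force on each trial, which makes the hidden change points easy to inspect.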

Optimal Strategy • Optimal strategy: detect the change points for θ, then identify the mode within each section • A Bayesian model formalizes this strategy (Steyvers & Brown, NIPS, in press)
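To make the idea of "detect change points, then predict the mode" concrete, here is a generic Bayesian change-point filter. It is not claimed to be the model of Steyvers & Brown; it is a standard run-length style filter with a Beta-Binomial observation model, swapped in for illustration. Assumptions: a Beta(1,1) prior on θ, a known 10% change rate, and outcomes coded 0..11 (subtract 1 from locations coded 1..12). All function names are hypothetical.

```python
from math import comb, exp, lgamma

def log_beta(a, b):
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def beta_binom_pmf(k, n, a, b):
    """P(k successes out of n) when theta ~ Beta(a, b)."""
    return comb(n, k) * exp(log_beta(a + k, b + n - k) - log_beta(a, b))

def bayes_predictions(outcomes, n=11, hazard=0.10):
    """Predict each outcome (coded 0..n) before seeing it."""
    # Each hypothesis is (weight, a, b): Beta counts since the last change.
    hyps = [(1.0, 1.0, 1.0)]  # assumption: Beta(1, 1) prior on theta
    preds = []
    for k in outcomes:
        # Change step: with probability `hazard`, theta restarts from the prior.
        hyps = [(hazard, 1.0, 1.0)] + [((1 - hazard) * w, a, b) for w, a, b in hyps]
        # Predictive distribution for the next outcome, mixed over hypotheses.
        pred_dist = [sum(w * beta_binom_pmf(j, n, a, b) for w, a, b in hyps)
                     for j in range(n + 1)]
        preds.append(max(range(n + 1), key=pred_dist.__getitem__))  # mode
        # Observation step: reweight by likelihood, update counts, renormalize.
        hyps = [(w * beta_binom_pmf(k, n, a, b), a + k, b + n - k)
                for w, a, b in hyps]
        total = sum(w for w, _, _ in hyps)
        hyps = [(w / total, a, b) for w, a, b in hyps]
    return preds
```

The number of hypotheses grows by one per trial, which is manageable for blocks of 100 trials; longer sequences would need pruning of low-weight hypotheses.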

Figure (sequence from block 5): observed sequence and prediction • Panels: Optimal Bayesian Solution; Subject 4 (change detection too slow); Subject 12 (change detection too fast)

Tradeoffs • Detecting the change too slowly will result in lower accuracy and less variability in predictions than an optimal observer. • Detecting the change very quickly will result in false detections, leading to lower accuracy and higher variability in predictions.

Average Error vs. Movement • Figure: one point per subject • Annotations: "Relatively few changes" and "Relatively many changes"
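The slides do not define the two plotted quantities, so the following sketch is a guess at plausible definitions: average error as the mean absolute distance between prediction and outcome, and movement as the mean absolute change between successive predictions. Both function names are hypothetical.

```python
def average_error(predictions, outcomes):
    """Mean absolute distance between each prediction and its outcome."""
    return sum(abs(p - o) for p, o in zip(predictions, outcomes)) / len(outcomes)

def average_movement(predictions):
    """Mean absolute change between successive predictions."""
    steps = [abs(b - a) for a, b in zip(predictions, predictions[1:])]
    return sum(steps) / len(steps)
```

Under these definitions, a subject who rarely changes a prediction has low movement, and a subject who chases every outcome has high movement; both extremes would be expected to raise average error relative to an optimal observer.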

A simple model • Make the new prediction some fraction α of the way between the old prediction and the most recent outcome. • The fraction α is a linear function of the error made on the last trial • Two free parameters: A, B • A<B: bigger jumps with higher error • A=B: constant smoothing • Figure: α on a 0 to 1 axis, with lines labeled by A and B
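A minimal sketch of this update rule follows. The slide says only that α is linear in the last error, so the exact form here is an assumption: α rises from A at zero error to B at the maximum possible error (taken as 11). The starting prediction of 6.0 and the names `alpha_for_error` and `simple_model_predictions` are also hypothetical.

```python
def alpha_for_error(error, A, B, max_error=11):
    """Assumed linear map: alpha = A at zero error, B at max_error."""
    frac = min(abs(error) / max_error, 1.0)
    return A + (B - A) * frac

def simple_model_predictions(outcomes, A, B, start=6.0, max_error=11):
    """Return the model's prediction for each trial, made before the outcome."""
    pred = start  # assumption: arbitrary initial prediction
    preds = []
    for outcome in outcomes:
        preds.append(pred)
        error = outcome - pred
        alpha = alpha_for_error(error, A, B, max_error)
        # Move a fraction alpha of the way from the old prediction to the outcome.
        pred = pred + alpha * error
    return preds
```

With A=B the rule reduces to constant exponential smoothing; with A<B small errors barely move the prediction while large errors trigger a jump toward the new outcome. Predictions are left as real numbers here; rounding to the nearest location would be a further modeling choice.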

Sweeping the parameter space • Figure: subjects plotted together with model results across the parameter space

Best fitting parameters for individual subjects • Figure: A parameter and B parameter for each subject, with α shown on a 0 to 1 scale • A=B: α ≈ constant (bad strategy: no jumps) • A<B: jumps with large errors (good strategy)

Effect of A and B parameters • Figure: subjects compared with the model for A ≈ B and the model for A << B

Model misses some trends in data… • False perception of motion: if successive blocks go up, then extrapolate the trend • Figure (subject 12, block 3): observed sequence and prediction

Experiment 2: two-dimensional prediction • Touch screen monitor • 1500 trials • Self-paced • Same sequence for all subjects • Figure: observed data and prediction

Average Error vs. Movement • Figure: one point per subject

Average Error vs. Movement • Figure: subjects compared with the model

Conclusion • Individual differences • Overfitters: hypotheses too complex • Underfitters: hypotheses too simple • Relation to perception of real-world phenomena? • Relation to personality characteristics?

Best fitting parameters for individual subjects • Figure: α on a 0 to 1 scale, with regions labeled A=B and A<B

Responses across subjects • Figure (sequence from block 5): observed sequence and the number of subjects making each prediction
