Role of Norepinephrine in Learning and Plasticity (Part II) a Computational Approach

Role of Norepinephrine in Learning and Plasticity (Part II): a Computational Approach • Valance Wang • TNU, ETH Zurich

Role of NE in probabilistic inference (Yu and Dayan, 2005) • Probabilistic inference approach • Pupil dilation as an indicator of phasic NE activity (Preuschoff et al., 2011) • Statistical approach • Neural gain, attentional modulation and probabilistic learning (Eldar et al., 2013) • Neural network approach


The weather forecast predicts that today will be cold and rainy. However, today is hot and sunny. The forecast is wrong. Why is the forecast wrong? • Due to inevitable stochasticity in the weather, i.e. a low-probability event has occurred. • Due to the onset of El Nino, i.e. the assumed context is wrong. • How is this question related to a probabilistic learning framework? Given the framework, how can we answer it?

A Hidden Markov Model • Cue identity level • Cue-target level Yu and Dayan, 2005


Cue identity level • Cue-target level • Remark: This task is more general than reversal learning. In reversal learning, either Cue 1 or Cue 2 signals the target, so the two are dependent. Yu and Dayan, 2005

When the subject receives error feedback, is it due to a low-probability event, or has the cue identity changed? • How to solve this task? • The exact solution: ideal learner algorithm • Iterative update • Remark: identical to forward belief propagation in an HMM • But computationally and representationally expensive to solve Yu and Dayan, 2005
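The iterative update mentioned above can be illustrated with a generic HMM forward step. This is only a minimal sketch of forward belief propagation over a small set of hidden contexts, not the ideal learner algorithm of Yu and Dayan (2005) itself; the function name `forward_update`, the two-context transition matrix, and the likelihood values are assumptions chosen for illustration.

```python
def forward_update(belief, transition, likelihood):
    """One step of HMM forward filtering (illustrative sketch).

    belief[i]        : P(context = i | observations so far)
    transition[i][j] : P(next context = j | current context = i)
    likelihood[j]    : P(new observation | context = j)
    Returns the normalized posterior over contexts after the new observation.
    """
    n = len(belief)
    # Predict: propagate the current belief through the context dynamics.
    predicted = [sum(belief[i] * transition[i][j] for i in range(n))
                 for j in range(n)]
    # Correct: weight each context by how well it explains the observation.
    unnormalized = [predicted[j] * likelihood[j] for j in range(n)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Hypothetical example: two contexts that tend to persist (p = 0.9),
# and an observation that is four times more likely under context 0.
transition = [[0.9, 0.1], [0.1, 0.9]]
likelihood = [0.8, 0.2]
belief = forward_update([0.5, 0.5], transition, likelihood)
belief_after_two = forward_update(belief, transition, likelihood)
```

The cost of the exact solution comes from carrying a full belief vector like this over every possible context, which is what the heuristic approach avoids.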

But animals/humans can solve it! • We use heuristics: • Representation: • In natural environments, contexts tend to persist over time. Thus we maintain only one or a few working hypotheses at a given time. • Computation: • Ach and NE signal statistical irregularity Yu and Dayan, 2005

Uncertainty about the behavioral context should • suppress the use of assumed cues for making inferences (top-down) • but boost learning about the lesser-known predictive relationships (bottom-up) • Evidence: • Across primary sensory cortex, Ach and NE selectively suppress intracortical and feedback synaptic transmission, while sparing or boosting thalamo-cortical processing • Ach and NE play a role in experience-dependent plasticity in the neocortex and the hippocampus • Ach and NE depletion suppresses experience-dependent plasticity • Experimental increase of Ach and NE induces cortical re-organization when paired with sensory stimulation Yu and Dayan, 2005

Forms of uncertainty: • Expected uncertainty: due to low-probability events, e.g. natural stochasticity in the weather • Ach • Unexpected uncertainty: due to gross changes in the environment that strongly violate top-down expectations, e.g. El Nino • NE Yu and Dayan, 2005

Ach • Probabilistic cueing paradigm: P(Target|Cue) = Bernoulli(γ) • Validity effect (no. valid trials - no. invalid trials) varies inversely with the level of Ach • in rodents and primates with pharmacological and surgical manipulations of Ach release • in Alzheimer’s patients with characteristic cholinergic depletion • in smokers after nicotine (Ach) use Yu and Dayan, 2005

NE • Attention-shift paradigm: • the cues that indicate which route to take suddenly change from spatial cues to visual cues • In rats’ maze navigation, boosting NE with the drug idazoxan accelerates detection of the change in the cue-target relationship and learning of the new cues • Cortical noradrenergic (but not cholinergic) lesions impair the shift of attention from one type of cue to another Yu and Dayan, 2005

The exact solution - ideal learner algorithm • The approximate solution • Infer only the most likely cue identity • Reduces the computation to only ~ 3 variables • Operations: addition, multiplication Yu and Dayan, 2005

Yu and Dayan, 2005

Interaction between Ach and NE: • The context should be assumed to have changed if the NE signal exceeds a threshold that depends on Ach • If the cue invalidity is low, then a single mismatched cue-target sample signals context change. If the cue invalidity is high, then more mismatched samples are needed to signal context change.

Modeling of Pharmacological Manipulation • Probabilistic cueing paradigm (Ach) • Nicotine: +Ach • Scopolamine: -Ach • [Figure panels: experiment, model]

Modeling of Pharmacological Manipulation • Attention-shift paradigm (NE) • [Figure panels: experiment, model]

Role of NE in probabilistic inference (Yu and Dayan, 2005) • Probabilistic inference approach • Ach signals expected uncertainty, NE signals unexpected uncertainty • Pupil dilation as an indicator of phasic NE activity (Preuschoff et al., 2011) • Statistical approach • Neural gain, attentional modulation and probabilistic learning (Eldar et al., 2013) • Neural network approach

Why do poker players wear sunglasses during the game? • To prevent opponents from reading their minds. In particular, they need to hide their pupils. • What does pupil dilation (phasic response) reveal about a player's cards?


Cards: 10 6 8 9 3 5 4 2 7 1 • Auditory gambling task (constant low illumination) • Bet: which card will be higher? • Card 1 and card 2 are sampled without replacement • You bet on the first card. Your first card is 10. What is the probability that you will win?
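The question on this slide can be answered by counting the remaining cards. The sketch below assumes a deck of ten cards numbered 1 to 10, as listed above, with the second card drawn without replacement; the function name `win_probability` and its parameters are illustrative, not taken from Preuschoff et al. (2011).

```python
from fractions import Fraction


def win_probability(first_card, bet_first_higher=True, deck_size=10):
    """P(win) once the first card is known (illustrative sketch).

    Assumes cards 1..deck_size and a second card drawn without replacement,
    so deck_size - 1 cards remain and first_card - 1 of them are lower.
    """
    p_first_higher = Fraction(first_card - 1, deck_size - 1)
    return p_first_higher if bet_first_higher else 1 - p_first_higher


p_ten = win_probability(10)    # bet on the first card, first card is 10
p_eight = win_probability(8)   # bet on the first card, first card is 8
```

With a first card of 10, all nine remaining cards are lower, so a bet on the first card wins with certainty.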

Some concepts

Trial timeline: bet, first card, second card • To dissociate unexpected uncertainty (risk prediction error) from expected uncertainty (risk): • “Your first card is 8.” • The subject perceives a low risk (low variance). • “Your second card is 10.” • The outcome is now settled. • The subject perceives that this result is surprising (deviation from expected risk).
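The two quantities in this example can be made concrete with a short calculation. The sketch below rests on several assumptions that are not stated on the slide: the subject bets that the first card is higher, the payoff is +1 for a win and -1 for a loss, risk is the variance of the payoff, and risk prediction error is the squared reward prediction error minus the risk. The function names are hypothetical.

```python
from fractions import Fraction


def outcome_stats(first_card, deck_size=10):
    """Win probability, expected payoff and risk after the first card.

    Assumes a bet that the first card is higher and a payoff of +1 / -1.
    """
    p = Fraction(first_card - 1, deck_size - 1)
    expected = p * 1 + (1 - p) * (-1)
    # Risk is taken to be the variance of the payoff around its expectation.
    risk = p * (1 - expected) ** 2 + (1 - p) * (-1 - expected) ** 2
    return p, expected, risk


def risk_prediction_error(first_card, second_card, deck_size=10):
    """Squared reward prediction error minus the expected risk (assumed definition)."""
    _, expected, risk = outcome_stats(first_card, deck_size)
    reward = 1 if first_card > second_card else -1
    return (reward - expected) ** 2 - risk


p, expected, risk = outcome_stats(8)
surprising = risk_prediction_error(8, 10)   # unlikely loss
unsurprising = risk_prediction_error(8, 3)  # likely win
```

Under these assumptions the unlikely loss (8 followed by 10) gives a positive risk prediction error, while the likely win (8 followed by 3) gives a negative one.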

Trial timeline: bet, first card, second card • Statistical model: pointwise t-test

Trial timeline: bet, first card, second card • Statistical model: • Multiple linear regression • y is pupil dilation • x1 is probability of winning • x2 is risk
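A multiple linear regression of this form can be fitted by ordinary least squares. The sketch below uses synthetic data only: the coefficient values in `true_beta`, the noise level, and the +1 / -1 payoff used to compute risk are arbitrary assumptions for illustration and are not results from Preuschoff et al. (2011).

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic trials: first card drawn from 1..10 (illustrative data only).
cards = rng.integers(1, 11, size=200)
p_win = (cards - 1) / 9.0              # x1: probability of winning
risk = 4.0 * p_win * (1.0 - p_win)     # x2: payoff variance for a +1 / -1 bet

# Arbitrary coefficients used to generate fake "pupil dilation" values.
true_beta = np.array([0.2, 0.5, -0.3])  # intercept, x1 weight, x2 weight
X = np.column_stack([np.ones_like(p_win), p_win, risk])
y = X @ true_beta + rng.normal(0.0, 0.01, size=cards.size)

# Ordinary least squares fit of y = b0 + b1 * x1 + b2 * x2.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
```

Because risk is a nonlinear function of the win probability, the two regressors are not collinear and both coefficients can be estimated from the same trials.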


Role of NE in probabilistic inference (Yu and Dayan, 2005) • Probabilistic inference approach • Ach signals expected uncertainty, NE signals unexpected uncertainty • Pupil dilation as an indicator of phasic NE activity (Preuschoff et al., 2011) • Statistical approach • Pupil dilation is anti-correlated with risk and correlated with risk prediction error • Neural gain, attentional modulation and probabilistic learning (Eldar et al., 2013) • Neural network approach

Probabilistic inference • Prior predisposition: • e.g. learning style: some people prefer to attend to concrete visual details, while others attend to abstract semantic concepts • measured by the Index of Learning Styles questionnaire • Attentional modulation is mediated by neural gain • High neural gain focuses attention and learning on the dimension that one is predisposed to attend to • Low gain broadens attention

Probabilistic learning task • Instruction: some property of the stimulus predicts the reward • Unknown to subjects: • Stimulus feature [x1 x2]’ • Visual feature ( x1 ): bright background, gray image, etc. • Semantic feature ( x2 ): food, sea-related, etc. • 18 games. 1 game = 5 trials rewarding the visual feature + 5 trials rewarding the semantic feature

Neural gain parameterizes neural activity • Single neuron • Effect of high gain: binary-like activation • Multi-layer perceptron • Three layers • Mutually inhibitory (winner-take-all topology) • Prior predisposition: as biased input weight • Effect of high gain: winner-take-all
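The single-neuron effect can be sketched with a sigmoid whose slope is scaled by a gain parameter. The logistic form and the name `activation` are assumptions made for illustration; the slide does not specify the exact activation function used by Eldar et al. (2013).

```python
import math


def activation(net_input, gain):
    """Logistic activation with a gain parameter (assumed functional form).

    Higher gain steepens the curve around zero, so the output approaches
    a binary step as the gain grows.
    """
    return 1.0 / (1.0 + math.exp(-gain * net_input))


low_gain = activation(0.5, gain=1.0)        # graded response
high_gain = activation(0.5, gain=20.0)      # close to fully on
high_gain_off = activation(-0.5, gain=20.0)  # close to fully off
```

With low gain, the same input of 0.5 yields an intermediate output; with high gain, inputs on either side of zero are pushed toward 1 or 0, which is the binary-like behaviour described above.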

Neural gain parameterizes neural activity • Recurrent neural network • 1000 nodes, all-to-all connections, uniformly random weights in [-0.01, 0.01] • Effect of high gain: • High positive and negative correlation values • High functional clustering

Neural gain is indexed by baseline (tonic) pupil diameter • High baseline pupil diameter is associated with more extreme fMRI BOLD signals • Baseline pupil diameter and neural functional clustering • High baseline pupil diameter indicates high gain, and thus results in high clustering

Prior predisposition contributes to biased task performance (linear regression) • High neural gain also contributes to biased task performance (black dots)

Role of NE in probabilistic inference (Yu and Dayan, 2005) • Probabilistic inference approach • Ach signals expected uncertainty, NE signals unexpected uncertainty • Pupil dilation as an indicator of phasic NE activity (Preuschoff et al., 2011) • Statistical approach • Pupil dilation is anti-correlated with risk and correlated with risk prediction error • Neural gain, attentional modulation and probabilistic learning (Eldar et al., 2013) • Neural network approach • Neural gain parameterizes clustered neural activity. High gain (as indexed by baseline pupil diameter) correlates with high clustering. Both prior predisposition and neural gain contribute to biased task performance.

Thank you!

The approximate solution Yu and Dayan, 2005

The weather forecast predicts that today will be cold and rainy. However, today is hot and sunny. The forecast is wrong. How shall we infer why the forecast is wrong? • Due to inevitable stochasticity in the weather, i.e. a low-probability event has occurred. • Due to the onset of El Nino, i.e. the assumed model is wrong. • [Diagram labels: Model A, Model B, Event; Structure level, Parameter level]
