
Instrumental Conditioning: Motivational Mechanisms


Presentation Transcript


  1. Instrumental Conditioning: Motivational Mechanisms

  2. Contingency-Shaped Behaviour • Uses three-term contingency • Reinforcement schedule (e.g., FR10) imposes contingency • Seen in non-humans and humans
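
As an illustrative aside (not from the slides), the way a fixed-ratio schedule such as FR10 imposes a response-reinforcer contingency can be sketched in a few lines of Python; the class and names below are hypothetical.

```python
# Hypothetical sketch of a fixed-ratio (FR) contingency: the reinforcer is
# delivered only after every `ratio` responses, so the outcome depends entirely
# on the organism's behaviour (three-term contingency: S -> R -> O).

class FixedRatioSchedule:
    def __init__(self, ratio: int = 10):      # FR10 by default
        self.ratio = ratio
        self.count = 0                         # responses since last reinforcer

    def respond(self) -> bool:
        """Register one response; return True if it earns the reinforcer."""
        self.count += 1
        if self.count >= self.ratio:
            self.count = 0
            return True                        # contingency satisfied -> outcome
        return False

# Example: only every 10th response is reinforced.
schedule = FixedRatioSchedule(ratio=10)
reinforced = [schedule.respond() for _ in range(30)]
print(reinforced.count(True))                  # -> 3
```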

  3. Rule Governed Behaviour • Particularly in humans • Behaviour can be varied and unpredictable • Invent rules or use (in)appropriate rules across conditions (e.g., language) • Age-dependent, primary vs. secondary reinforcers, experience

  4. Role of Response in Operant Conditioning • Thorndike • Performance of the response is necessary • Tolman • Formation of an expectation • McNamara, Long & Wike (1956) • Maze • Rats either ran the maze or rode through it in a cart • Forming the association is what is needed

  5. Role of the Reinforcer • Is reinforcement necessary for operant conditioning? • Tolman & Honzik (1930) • Latent learning • Not necessary for learning • Necessary for performance

  6. [Figure: Day 11 results. Average errors (y-axis) across days (x-axis) for three groups: food, no food, and no food until day 11.]

  7. Associative Structure in Instrumental Conditioning • Basic forms of association • S = stimulus, R = response, O = outcome • S-R • Thorndike, Law of Effect • Role of reinforcer: stamps in S-R association • No R-O association acquired

  8. Hull and Spence • Law of Effect, plus a classical conditioning process • Stimulus evokes response via Thorndike’s S-R association • Also, S-O association creates expectancy of reward • Two-process approach • Classical and instrumental are different

  9. One-Process or Two-Processes? • Are instrumental and classical the same (one process) or different (two processes)? • Omission control procedure • US presentation depends on non-occurrence of CR • No CR, then CS ---> US • CR, then CS ---> no US
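
A minimal sketch (my addition, with hypothetical function names) of the omission contingency just described: the US is scheduled on every trial but is cancelled whenever a CR occurs, whereas a yoked classical procedure delivers the US regardless of responding.

```python
# Hypothetical sketch of the omission control contingency: the CS is presented
# on every trial, but the US follows only if the subject does NOT make the
# conditioned response (CR) during the CS.

def omission_trial(cr_occurred: bool) -> bool:
    """Return True if the US is delivered on this trial."""
    return not cr_occurred        # CR -> US omitted; no CR -> CS followed by US

def classical_trial(cr_occurred: bool) -> bool:
    """Classical comparison condition: the US is delivered regardless of responding."""
    return True

# A CR prevents the US only under the omission contingency.
print(omission_trial(cr_occurred=True))    # False (US omitted)
print(omission_trial(cr_occurred=False))   # True  (US delivered)
print(classical_trial(cr_occurred=True))   # True  (US delivered anyway)
```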

  10. Omission Control • [Figure: two trial timelines showing CS, CR, and US. On a trial with a CR, the US is omitted; on a trial without a CR, the CS is followed by the US.]

  11. Gormezano & Coleman (1973) • Eyeblink with rabbits • US = shock, CS = tone • Classical group: 5 mA shock on each trial, regardless of response • Omission group: making an eyeblink CR to the CS prevents delivery of the US

  12. One-process prediction: • CR acquisition faster and stronger for the Omission group • Reinforcement for the CR is shock avoidance • In the Classical group the CR will be present only because it somehow reduces shock aversiveness • BUT… the actual result: CR acquisition was slower in the Omission group • Consistent with classical-conditioning extinction (not all CSs are followed by the US) • Supports two-process theory

  13. Classical in Instrumental • Classical conditioning process provides motivation • Stimulus substitution • S acquires properties of O • rg = fractional anticipatory goal response • Response leads to feedback • sg = sensory feedback • rg-sg constitutes expectancy of reward

  14. rg-sg Timecourse • [Figure: timeline of S, R, and O within a trial] • Through stimulus substitution, S elicits rg-sg, giving a motivational expectation of reward

  15. Prediction • According to rg-sg theory, the CR should occur before the operant response; but it doesn't always • Dog lever pressing on FR33 ---> post-reinforcement pause (PRP) • Lever pressing is low early in the ratio and then rises, but salivation appears only later • [Figure: magnitude of lever pressing and salivation plotted against time from the start of the trial]

  16. Modern Two-Process Theory • Classical conditioning within instrumental conditioning • Neutral stimulus ---> comes to elicit motivation • Central Emotional State (CES) • CES is a characteristic of the nervous system (a “mood”) • CES won't produce only one response • This makes it harder to predict exactly which response will change

  17. Prediction • Rate of operant response modified by presentation of CS • CES develops to motivate operant response • CS from classical conditioning also elicits CES • Therefore, giving CS during instrumental conditioning should alter CES that motivates instrumental response

  18. “Explicit” Predictions • Emotional states elicited by a CS depend on the US • Appetitive US (e.g., food): CS+ ---> hope; CS- ---> disappointment • Aversive US (e.g., shock): CS+ ---> fear; CS- ---> relief

  19. Behavioural predictions (CSs trained with an aversive US) • Positive-reinforcement schedule: CS+ (fear) ---> responding decreases; CS- (relief) ---> responding increases • Negative-reinforcement schedule: CS+ (fear) ---> responding increases; CS- (relief) ---> responding decreases
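
Purely as an illustration, the prediction table above can be encoded in a small lookup structure; the dictionary below is hypothetical and simply captures the four cells for CSs trained with an aversive US.

```python
# Hypothetical encoding of the behavioural predictions above: the direction of
# change in the instrumental response depends on whether the CS-evoked emotion
# matches or opposes the CES motivating the operant baseline.

PREDICTED_CHANGE = {
    # (instrumental schedule, CS from aversive classical conditioning): change
    ("positive reinforcement", "CS+ (fear)"):   "decrease",
    ("positive reinforcement", "CS- (relief)"): "increase",
    ("negative reinforcement", "CS+ (fear)"):   "increase",
    ("negative reinforcement", "CS- (relief)"): "decrease",
}

print(PREDICTED_CHANGE[("negative reinforcement", "CS+ (fear)")])  # increase
```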

  20. R-O and S(R-O) • Earlier interpretations included no response-reinforcer association • Intuitively, though, that is the obvious explanation: perform the response to get the reinforcer

  21. Colwill & Rescorla (1986) • R-O association • Devalue the reinforcer post-conditioning • Does the operant response decrease? • Push bar right or left for different reinforcers • Food or sucrose • [Figure: testing of reinforcers. Mean responses/min (y-axis) across blocks of extinction trials (x-axis) for the normal vs. devalued reinforcer.]

  22. Interpretation • Can’t be S-R • No reinforcer in this model • Can’t be S-O • Two responses, same stimuli (the bar), but only one response affected • Conclusion • Each response associated with its own reinforcer • R-O association

  23. Hierarchical S-(R-O) • R-O model lacks stimulus component • Stimulus required to activate association • Really, Skinner’s (1938) three term contingency • Old idea; recent empirical testing

  24. Colwill & Delamater (1995) • Rats trained on pairs of S+ • Biconditional discrimination problem • Two stimuli • Two responses • One reinforcer • Match the correct response to the stimulus to be reinforced • Training, reinforcer devaluation, testing

  25. Training • Tone: lever --> food; chain --> nothing • Noise: chain --> food; lever --> nothing • Light: poke --> sucrose; handle --> nothing • Flash: handle --> sucrose; poke --> nothing • Aversion conditioning • Testing: marked reduction in previously reinforced response • Tone: lever press vs. chain • Noise: chain vs. lever • Light: poke vs. handle • Flash: handle vs. poke

  26. Analysis • Can’t be S-O • Each stimulus associated with same reinforcer • Can’t be R-O • Each response reinforced with same outcome • Can’t be S-R • Due to devaluation of outcome • Each S activates a corresponding R-O association
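
As a hedged illustration of the hierarchical S-(R-O) conclusion, the training design above can be written as a mapping in which each stimulus activates its own response-outcome pair, and devaluing an outcome selectively removes responding under the stimuli whose R-O involves that outcome. The data structure and the choice of devalued outcome below are hypothetical.

```python
# Hypothetical sketch of the S-(R-O) structure: each discriminative stimulus
# activates its own response-outcome association, and devaluing an outcome
# selectively weakens the response linked to it under that stimulus.

S_RO = {
    # stimulus: (reinforced response, outcome) -- from the training design above
    "tone":  ("lever",  "food"),
    "noise": ("chain",  "food"),
    "light": ("poke",   "sucrose"),
    "flash": ("handle", "sucrose"),
}

devalued_outcomes = {"sucrose"}   # e.g., after aversion conditioning

def predicted_response(stimulus: str) -> str:
    """Respond only if the outcome signalled by the activated R-O is still valued."""
    response, outcome = S_RO[stimulus]
    return ("withhold " if outcome in devalued_outcomes else "perform ") + response

for s in S_RO:
    print(s, "->", predicted_response(s))
# tone -> perform lever, noise -> perform chain,
# light -> withhold poke, flash -> withhold handle
```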

  27. Reinforcer Prediction, A Priori • Simple definition • A stimulus that increases the future probability of a behaviour • Circular explanation • Would be nice if we could predict beforehand

  28. Need Reduction Approach • Primary reinforcers reduce biological needs • Biological needs: e.g., food, water • Not biological needs: e.g., sex, saccharin • Undetectable biological needs: e.g., trace elements, vitamins

  29. Drive Reduction • Clark Hull • Homeostasis • Drive systems • Strong stimuli are aversive • A reduction in stimulation is the reinforcer • Drive is reduced • Problems • Objective measurement of stimulus intensity is difficult • Some reinforcers involve stimulation that doesn't change, or even increases!

  30. Trans-situationality • A stimulus that is a reinforcer in one situation will be a reinforcer in others • Subsets of behaviour • Reinforcing behaviours • Reinforceable behaviours • Often works with primary reinforcers • Problems with other stimuli

  31. Primary and Incentive Motivation • Where does motivation to respond come from? • Primary: biological drive state • Incentive: from reinforcer itself

  32. But… Consider: • What if we treat a reinforcer not as a stimulus or an event, but as a behaviour in and of itself? • Fred Sheffield (1950s) • Consummatory-response theory • E.g., it is not the food, but the eating of the food, that is the reinforcer • E.g., saccharin has no nutritional value and can't reduce drive, but is reinforcing because it is consumed

  33. Premack’s Principle • Reinforcing responses occur more than the responses they reinforce • H = high probability behaviour • L = low probability behaviour • If L ---> H, then H reinforces L • But, if H ---> L, H does not reinforce L • “Differential probability principle” • No fundamental distinction between reinforcers and operant responses
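
A minimal sketch of the differential-probability principle, assuming hypothetical baseline proportions of session time spent in each activity (the numbers and names are illustrative only).

```python
# Hypothetical sketch of Premack's differential-probability principle:
# activity H can reinforce activity L only if H's baseline probability
# (its share of a free-access session) exceeds L's.

baseline = {"eat_candy": 0.60, "play_pinball": 0.25, "sit_quietly": 0.15}

def can_reinforce(reinforcer: str, operant: str, p=baseline) -> bool:
    """True if making `reinforcer` contingent on `operant` should raise the operant."""
    return p[reinforcer] > p[operant]

print(can_reinforce("eat_candy", "play_pinball"))   # True  (H reinforces L)
print(can_reinforce("play_pinball", "eat_candy"))   # False (L cannot reinforce H)
```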

  34. Premack (1965) • Two alternatives: eat candy, play pinball • Phase I: determine each child's baseline probability of each behaviour • Gr1: eating candy more probable than playing pinball • Gr2: playing pinball more probable than eating candy • Phase II (testing) • T1: play pinball (operant) to eat candy (reinforcer) • Only Gr1 kids increased the operant • T2: eat candy (operant) to play pinball (reinforcer) • Only Gr2 kids increased the operant

  35. Premack in Brief • Any activity could be a reinforcer if it is more probable (“preferred”) than the operant response

  36. Response Deprivation Hypothesis • Restriction of access to the reinforcer response • Theory: • Impose response deprivation • Now even low-probability responses can reinforce higher-probability responses • Instrumental procedures withhold the reinforcer until the response is made; in essence, the subject is deprived of access to the reinforcer • So the reinforcing effect is produced by the operant contingency itself
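
A sketch of the response-deprivation condition, using hypothetical baseline durations and a hypothetical schedule; it illustrates how a lower-probability activity can come to reinforce a higher-probability one once the contingency restricts it below its baseline level.

```python
# Hypothetical sketch of the response-deprivation condition: a schedule
# "perform I units of the operant to earn C units of the contingent activity"
# is predicted to reinforce the operant when performing it at baseline would
# yield LESS of the contingent activity than that activity's own baseline.

def response_deprived(i_required, c_earned, operant_baseline, contingent_baseline):
    """True if the contingency restricts the contingent activity below baseline."""
    earned_at_baseline = (operant_baseline / i_required) * c_earned
    return earned_at_baseline < contingent_baseline

# Hypothetical baselines: 30 min drinking (high probability), 10 min running
# (low probability).  Schedule: 1 min of drinking earns 0.2 min of running.
# Running is pushed below its 10-min baseline, so even this low-probability
# activity is predicted to reinforce drinking.
print(response_deprived(i_required=1, c_earned=0.2,
                        operant_baseline=30, contingent_baseline=10))   # True
```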

  37. Behavioural Regulation • Physiological homeostasis • Analogous process in behavioural regulation • Preferred/optimal distribution of activities • Stressors move organism away from optimum behavioural state • Respond in ways to return to ideal state

  38. Behavioural Bliss Point • Unconstrained condition: distribute activities in a way that is preferred • Behavioural bliss point (BBP) • Relative frequency of all behaviours in unconstrained condition • Across conditions • BBP shifts • Within condition • BBP stable across time

  39. Imposing a Contingency • Puts pressure on BBP • Act to defend challenges to BBP • But requirements of contingency (may) make achieving BBP impossible • Compromise required • Redistribute responses so as to get as close to BBP as possible

  40. Minimum Deviation Model • Behavioural regulation • Due to imposed contingency: • Redistribute behaviour • Minimize deviation of responses from BBP • Get as close as you can
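
A worked sketch of the minimum-deviation idea (my illustration, with hypothetical numbers): the contingency is treated as a line through the origin, and the "closest feasible point to the bliss point" is found by orthogonal projection onto that line.

```python
# Hypothetical sketch of minimum deviation: the contingency forces behaviour
# onto a line (e.g., each minute of running earns k minutes of drinking), and
# the organism is assumed to pick the point on that line closest (Euclidean
# distance) to its behavioural bliss point.

import math

def minimum_deviation(bliss_run, bliss_drink, k):
    """Return (running, drinking) on the line drinking = k * running
    that is closest to the bliss point (bliss_run, bliss_drink)."""
    run = (bliss_run + k * bliss_drink) / (1 + k ** 2)   # orthogonal projection
    return run, k * run

# Hypothetical bliss point: 10 min running, 30 min drinking; schedule k = 1
# (equal time).  The compromise raises running and lowers drinking.
run, drink = minimum_deviation(10, 30, k=1)
print(round(run, 1), round(drink, 1))                    # 20.0 20.0
print(round(math.dist((run, drink), (10, 30)), 1))       # deviation from bliss point
```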

  41. [Figure: allocation of time drinking (y-axis, 10–40) against time running (x-axis, 10–40) under restricted-running and restricted-drinking contingencies.]

  42. Strengths of BBP Theory • Reinforcers are not special stimuli or responses • No fundamental difference between operant and reinforcer • Explains the reallocation of behaviour under a contingency • Fits with cognitive findings on cost/benefit optimization
