
Presentation Transcript


  1. Attendee questionnaire • Name • Affiliation/status • Area of study/research • For each of these subjects: • Linguistics (Optimality Theory) • Computation (connectionism/neural networks) • Philosophy (symbolic/connectionist debate) • Psychology (infant phonology) please indicate your relative level of • interest (for these lectures) [1 = least, 5 = most] • background [1 = none, 5 = expert] Thank you

  2. Optimality in Cognition and Grammar Paul Smolensky Cognitive Science Department, Johns Hopkins University Plan of lectures • Cognitive architecture • Symbols and neurons • Symbols in neural networks • Optimization in neural networks • Optimization in grammar I: HG → OT • Optimization in grammar II: OT • OT in neural networks

  3. Cognitive architecture • Central dogma of cognitive science: Cognition is computation • But what type of computation? • What exactly is computation, and what work must it do in cognitive science?

  4. Computation • Functions, cognitive • Pixels → objects → locations [low- to high-level vision] • Sound stream → word string [phonetics + …] • Word string → parse tree [syntax] • Underlying form → surface form [phonology] • petit copain: /pətit + kopɛ̃/ → [pə.ti.ko.pɛ̃] • petit ami: /pətit + ami/ → [pə.ti.ta.mi] • Reduction of complex procedures for evaluating functions to combinations of primitive operations • Computational architecture: • Operations: primitives + combinators • Data

  5. Symbolic Computation • Computational architecture: • Operations: primitives + combinators • Data • The Pure Symbolic Architecture (PSA) • Data: strings, (binary) trees, graphs, … • Operations • Primitives • Concatenate (string, tree) = cons • First-member(string); left-subtree(tree) = ex0 • Combinators • Composition: f(x) =def g(h(x)) • IF(x = A) THEN … ELSE …
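A minimal sketch of these primitives and combinators, using nested Python tuples as an illustrative encoding of strings and binary trees (the names cons, ex0, ex1 follow the slide; the tuple encoding is an assumption):

```python
# Illustrative encoding: a string or binary tree is a nested pair (left, right).
def cons(left, right):
    """Concatenate / build a binary tree from two constituents."""
    return (left, right)

def ex0(s):
    """First member of a string / left subtree of a tree."""
    return s[0]

def ex1(s):
    """Rest of a string / right subtree of a tree."""
    return s[1]

def compose(g, h):
    """Combinator: f(x) =def g(h(x))."""
    return lambda x: g(h(x))

def branch(pred, then_f, else_f):
    """Combinator: IF pred(x) THEN then_f(x) ELSE else_f(x)."""
    return lambda x: then_f(x) if pred(x) else else_f(x)
```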

  6. ƒPassive: Passive → LF • Few leaders are admired by George → admire(George, few leaders) • ƒ(s) = cons(ex1(ex0(ex1(s))), cons(ex1(ex1(ex1(s))), ex0(s))) [Figure: the passive parse tree (Aux, V, by, Agent, Patient) and the corresponding logical-form tree] • But for cognition, need a reduction to a very different computational architecture
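Using the same tuple encoding, the slide's ƒ can be run directly. The bracketing of the input sentence below is an assumption, chosen so that the slide's formula returns the logical form admire(George, few leaders):

```python
cons = lambda left, right: (left, right)   # as in the previous sketch
ex0  = lambda s: s[0]
ex1  = lambda s: s[1]

def f_passive(s):
    # f(s) = cons(ex1(ex0(ex1(s))), cons(ex1(ex1(ex1(s))), ex0(s)))
    return cons(ex1(ex0(ex1(s))),
                cons(ex1(ex1(ex1(s))), ex0(s)))

# Assumed bracketing: [Patient [[Aux V] [by Agent]]]
s = ('few leaders', (('are', 'admired'), ('by', 'George')))
print(f_passive(s))   # ('admired', ('George', 'few leaders')) ≈ admire(George, few leaders)
```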

  7. The cognitive architecture: The connectionist hypothesis • PDP computation at the lowest computational level of the mind/brain • Representations: Distributed activation patterns • Primitive operations (e.g.) • Multiplication of activations by synaptic weights • Summation of weighted activation values • Non-linear transfer functions • Combination: Massive parallelism
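A minimal sketch of these primitive operations — weighted summation of activations followed by a non-linear transfer function, applied to all units in parallel; the weights and sizes below are illustrative:

```python
import numpy as np

def layer_update(activations, weights, transfer=np.tanh):
    """Each output unit: transfer( sum_i w[j, i] * a[i] ), computed in parallel."""
    return transfer(weights @ activations)

a_in = np.array([0.2, -0.5, 0.9])            # a distributed activation pattern
W = np.array([[0.1, -0.3, 0.8],              # synaptic weights (illustrative values)
              [0.4,  0.2, -0.6]])
print(layer_update(a_in, W))                  # activations of the next layer
```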

  8. Criticism of PDP (e.g., from neuroscientists): “Much too simple” • Misguided — this rests on a confusion between two questions • The relevant complaint would be “Much too complex”: the target of the computational reduction must be within the scope of neural computation

  9. The cognitive question for neuroscience: What is the function of each component of the nervous system? Our question is quite different.

  10. The neural question for cognitive science How are complex cognitive functions computed by a mass of numerical processors like neurons—each very simple, slow, and imprecise relative to the components that have traditionally been used to construct powerful, general-purpose computational systems? How does the structure arise that enables such a medium to achieve cognitive computation?

  11. The ICS Hypothesis The Integrated Connectionist/Symbolic Cognitive Architecture (ICS) • In higher cognitive domains, representations and functions are well approximated by symbolic computation • The Connectionist Hypothesis is correct • Thus, cognitive theory must supply a computational reduction of symbolic functions to PDP computation

  12. PassiveNet [Figure: a network whose input units encode the constituents of the passive sentence (Aux, V, by, Agent, Patient) and whose output units encode the logical form, connected by the weight matrix W]

  13. The ICS Isomorphism • Tensor product representations ↔ tensorial networks [Figure: the symbolic Passive → LF mapping of slide 6 aligned with the PassiveNet of slide 12 — input and output activation patterns related by the weight matrix W]

  14. Within-level compositionality • ƒ(s) = cons(ex1(ex0(ex1(s))), cons(ex1(ex1(ex1(s))), ex0(s))) • W = W_cons0 · [W_ex1 · W_ex0 · W_ex1] + W_cons1 · [W_cons0 · (W_ex1 · W_ex1 · W_ex1) + W_cons1 · (W_ex0)] • Between-level reduction: each symbolic primitive (cons, ex0, ex1) is realized by a weight matrix, and composing these matrices yields the single matrix W realizing ƒ

  15. Levels

  16. The ICS Architecture [Figure: the grammar G computes the function ƒ mapping the underlying form /dog+s/ to the surface form “dogs”; processing (learning) operates on the activation patterns A realizing these representations]

  17. Processing I: Activation • Computational neuroscience • Key sources • Hopfield 1982, 1984 • Cohen and Grossberg 1983 • Hinton and Sejnowski 1983, 1986 • Smolensky 1983, 1986 • Geman and Geman 1984 • Golden 1986, 1988

  18. Processing I: Activation • Processing — spreading activation — is optimization: Harmony maximization [Figure: a two-unit network, a1 and a2, with external inputs i1 (0.6) and i2 (0.5) and an inhibitory connection of weight –λ (–0.9)]

  19. The ICS Architecture [Figure: the grammar G maps /cat/ to [kæt]; the activation patterns A realize the syllable structure σ–k–æ–t]

  20. Processing II: Optimization • Cognitive psychology • Key sources: Hinton & Anderson 1981; Rumelhart, McClelland, & the PDP Group 1986 • Processing — spreading activation — is optimization: Harmony maximization [Figure: the two-unit network of slide 18]

  21. Processing II: Optimization • Processing — spreading activation — is optimization: Harmony maximization • Harmony maximization is satisfaction of parallel, violable well-formedness constraints [Figure: the two-unit network of slide 18] • a1 must be active (strength: 0.6) • a2 must be active (strength: 0.5) • a1 and a2 must not be simultaneously active (strength: λ = 0.9) • CONFLICT • Optimal compromise: a1 = 0.79, a2 = –0.21
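A minimal sketch of this two-unit example, assuming a quadratic Harmony function H(a) = i·a + ½ aᵀWa with a unit self-decay term on the diagonal in addition to the inhibitory weight –λ (the self-decay is an assumption; it reproduces the compromise on the slide). Spreading activation is modeled as gradient ascent on H:

```python
import numpy as np

# Assumed Harmony function: H(a) = i·a + 0.5 * aᵀ W a
i = np.array([0.6, 0.5])              # "a1 must be active", "a2 must be active"
lam = 0.9                             # "a1 and a2 must not be simultaneously active"
W = np.array([[-1.0, -lam],           # unit self-decay on the diagonal (assumption)
              [-lam, -1.0]])

def harmony(a):
    return i @ a + 0.5 * a @ W @ a

# Spreading activation as gradient ascent on Harmony: da/dt ∝ ∂H/∂a = i + W a
a = np.zeros(2)
for _ in range(2000):
    a += 0.05 * (i + W @ a)

print(np.round(a, 2))                  # [ 0.79 -0.21] — the optimal compromise
```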

  22. Processing II: Optimization • The search for an optimal state can employ randomness • Equations for units’ activation values have random terms • pr(a) ∝ e^{H(a)/T} • T (‘temperature’) ~ randomness → 0 during search • Boltzmann Machine (Hinton and Sejnowski 1983, 1986); Harmony Theory (Smolensky 1983, 1986)
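A minimal sketch of such a stochastic search in the spirit of the Boltzmann Machine / Harmony Theory: Metropolis sampling visits states with pr(a) ∝ e^{H(a)/T} while T is annealed toward 0. The network here (binary units, random symmetric weights) is purely illustrative, and Metropolis flips stand in for the actual unit-update rule:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8
W = rng.normal(size=(n, n)); W = (W + W.T) / 2; np.fill_diagonal(W, 0)
b = rng.normal(size=n)                     # biases (illustrative)

def harmony(a):                            # quadratic Harmony of a binary state
    return a @ b + 0.5 * a @ W @ a

a = rng.integers(0, 2, size=n).astype(float)
for T in np.geomspace(5.0, 0.01, 3000):    # temperature ~ randomness, annealed toward 0
    k = rng.integers(n)                    # propose flipping one unit
    proposal = a.copy(); proposal[k] = 1.0 - proposal[k]
    dH = harmony(proposal) - harmony(a)
    if dH > 0 or rng.random() < np.exp(dH / T):   # Metropolis acceptance rule
        a = proposal

print("final state:", a.astype(int), "Harmony:", round(harmony(a), 3))
```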

  23. The ICS Architecture [Figure: as on slide 19 — the grammar G maps /cat/ to [kæt], realized by activation patterns A]

  24. Two Fundamental Questions • Harmony maximization is satisfaction of parallel, violable constraints • 1. What are the activation patterns — data structures — mental representations — evaluated by these constraints? (the prior question) • 2. What are the constraints? (knowledge representation)

  25. Representation • Symbolic theory • Complex symbol structures • Generative linguistics (Chomsky & Halle ’68 …) • Particular linguistic representations • Markedness Theory (Jakobson, Trubetzkoy, ’30s …) • Good (well-formed) linguistic representations • Connectionism (PDP) • Distributed activation patterns • ICS • realization of (higher-level) complex symbolic structures in distributed patterns of activation over (lower-level) units (‘tensor product representations’ etc.) • will employ ‘local representations’ as well

  26. Representation [Figure: the binary tree [σ k [æ t]] and its filler/role decomposition — σ/rε, k/r0, æ/r01, t/r11]

  27. Tensor Product Representations • Representations: • Filler vectors: A, B, X, Y (i, j, k ∊ {A, B, X, Y}) • Role vectors: rε = 1, r0 = (1 1), r1 = (1 –1) • Fillers are bound to roles with the tensor product ⊗ [Figure: units ①–⑫ realizing a depth-0 root filler i and its depth-1 children j, k]
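A minimal sketch of binding and unbinding with these role vectors, applied to the tree [σ k [æ t]] of slide 26. The one-hot filler vectors and the convention that a position string's role is the Kronecker product of r0/r1 factors are illustrative assumptions:

```python
import numpy as np

# Filler vectors (one-hot here for readability; any linearly independent set works)
fillers = {'σ': np.array([1., 0., 0., 0.]),
           'k': np.array([0., 1., 0., 0.]),
           'æ': np.array([0., 0., 1., 0.]),
           't': np.array([0., 0., 0., 1.])}

# Role vectors from this slide, and their duals u (u_i · r_j = δ_ij) for unbinding
r = {'0': np.array([1.,  1.]), '1': np.array([1., -1.])}
u = {'0': np.array([.5,  .5]), '1': np.array([.5, -.5])}

def role(pos):
    """Recursive role vector for a position string, e.g. role('01') = r0 ⊗ r1."""
    vec = np.array([1.])                    # rε = 1 for the root position
    for digit in pos:
        vec = np.kron(vec, r[digit])
    return vec

def tpr(bindings):
    """TPR = Σ filler ⊗ role, stored per depth as a filler × role matrix."""
    rep = {}
    for pos, f in bindings.items():
        d = len(pos)
        rep[d] = rep.get(d, 0) + np.outer(fillers[f], role(pos))
    return rep

def unbind(rep, pos):
    """Recover the filler bound to a position by contracting with dual role vectors."""
    dual = np.array([1.])
    for digit in pos:
        dual = np.kron(dual, u[digit])
    return rep[len(pos)] @ dual

# The tree [σ k [æ t]] with the bindings of slide 26: σ/rε, k/r0, æ/r01, t/r11
S = tpr({'': 'σ', '0': 'k', '01': 'æ', '11': 't'})
print(unbind(S, '01'))   # [0. 0. 1. 0.] — the filler vector of æ
```

Because the role vectors (and their tensor products) are linearly independent, unbinding recovers each constituent exactly; with non-independent or noisy patterns it would only be approximate.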

  28. Tensor Product Representations [Figure: the filler-vector/role-vector diagram of slide 27, repeated]

  29. Tensor Product Representations [Figure: the filler-vector/role-vector diagram of slide 27, repeated]

  30. Local tree realizations • Representations: [Figure]
