What do we know about Language Acquisition?

Bold claims about language acquisition Kuhl et al. (2008)

Issues • What are the data (and methods)? • Language and speech perception (indirect measures) • Looking/Listening times • Changes in brain activity • Language and speech production • Recordings of child language production (and/or input) • How do the data relate to knowledge? • Processes underlying speech perception and production • Representations stored in the mental lexicon • How can development be characterized? • How to account for variation? • Individual variation • Cross-linguistic variation

Methods in early speech perception • Discrimination (often tested through habituation) • Two speech sounds (phonetic/phonological contrast?) • Natural vs. synthetic stimuli • Categorization • More informative; meaningful differences • Learning procedure often fails • Preference • Preference for new or old? (novelty vs. familiarity) • Often with learning or familiarization phase, but can be used to test knowledge infants bring to the lab • Used to test various types of knowledge (sounds, morphemes, words, syntax, etc.) • Artificial language learning • Word learning • Mismatch detection

Methods (cont.) • Do infant speech perception experiments capture processes of natural language acquisition? • Lab situation • Repeated and often isolated presentation of stimuli • Group results • Based on short-term learning • Groups based on age (not on ‘developmental age’) • Most studies focus on younger age groups • Gaps in age groups, contrasts, etc.: a patchy picture • Replication often difficult • Sometimes change from familiarity to novelty • Longitudinal perception studies are rare

Methods in early speech production • Recordings of child language production • Longitudinal databases (individual children) • Focus on development • Cross-sectional databases • Focus on norms of development • Experimental data • Lab situation • What do production data tell us about • Processes? • Representations? • Data hard to interpret • Data problem: lack of dense data

Processes • What are the underlying processes behind • Perception (experiments)? • Production? • From universal to language-specific speech perception. What triggers change? • Experience-based • loss of unnecessary contrasts • enhancement of necessary contrasts • How is experience defined? • Frequency of occurrence • Saliency • Attention • Biases • Depends on the theoretical framework • Perceptual assimilation model (Best) vs. Native language magnet model (Kuhl) • Similar questions for production (Levelt)

Representations • What are the units that are represented? • Sounds, contrasts, words, chunks of words? • How are they represented? • Exemplars? • Abstract or detailed? • Depends on the linguistic theory • Same for perception and production?

Development • In general: what triggers development • Error-based? • Experience (frequency)-based? • How is frequency defined? • Abstraction & Generalization? • What information in the signal do infants use? • Do infants generalize over features, words, syntactic structures, etc.? • How do they generalize? • Role of the environment • Feedback • Social interaction • …

Modeling • Modeling could help with • Filling in the gaps (in development) • Understanding the results: what happens between input and behavioral data? • How to explain mixed results (familiarity vs. novelty, for instance) • …

Modeling Language Acquisition • Lorentz workshop “Modelling Meets Infant Studies in Language Acquisition”, 9-13 September 2013 • Rens Bod, Institute for Logic, Language and Computation, University of Amsterdam

What is at stake in modeling language acquisition? • Sound category learning • Word (class) learning • Morphology learning • Syntactic learning • Semantic learning • ...

Requirements for models of language acquisition (Cf. M. Frank 2013) • Models should converge to “right result” (e.g. correct set of phonetic categories, word-object mappings, or interpretations for sentences) given appropriate sample of data from any language • But do we agree on categories/representations (see later)? • Models should fit child performance, reproducing different patterns of performance by children at different ages when given corresponding input data • Do we agree on the child performance data?

“All models are wrong, but some are useful” (Box & Draper, 1987) • But models can also be empirically adequate • Should the definitive answer on the ‘reality of models’ come from neuroscience? • We can take models as “mediators between data and theory”: they can help us to understand the results and fill the gaps

From models to computational models • “A computational model studies the behavior of a complex system by computer simulation” Mitchell (1995) • I take the word “computational” literally • An algorithm -- not just something conceptual

How to design a model of language acquisition? Beekhuizen, Bod & Zuidema (2013), Three Design Principles of Language: The Search for Parsimony in Redundancy, Language and Speech, 56(3): • Principle 1 (Probability): Knowledge of language is sensitive to distributions of previous language experiences. An adequate computational model of language should, whenever an expression is processed, treat it as a piece of evidence that affects the probability distributions over linguistic structures
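
Principle 1 can be illustrated with a small Python sketch. It is only an illustration under stated assumptions, not the model proposed in the cited paper: the class `ExperienceStore` and its methods are hypothetical names, and plain relative frequency is used as the simplest possible probability estimate.

```python
from collections import Counter


class ExperienceStore:
    """Hypothetical store of linguistic units (any hashable object)."""

    def __init__(self):
        self.counts = Counter()  # how often each unit has been processed
        self.total = 0           # total number of processed units

    def observe(self, unit):
        """Treat a processed expression as one more piece of evidence."""
        self.counts[unit] += 1
        self.total += 1

    def prob(self, unit):
        """Relative-frequency estimate (an assumption; other estimators are possible)."""
        return self.counts[unit] / self.total if self.total else 0.0


store = ExperienceStore()
for utterance in ["what time is it", "what time is it", "how late is it"]:
    store.observe(utterance)
```

After these three observations, `store.prob("what time is it")` is 2/3 and `store.prob("how late is it")` is 1/3; every further call to `observe` shifts the whole distribution.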

Three Design Principles of Language (2) • Principle 2 (Heterogeneity): An adequate model of human language must allow for a heterogeneous store of elementary units, ranging from single phonemes, single words and basic combinatory rules, to multiword constructions with various open slots and complete sentences. • Cf: “by and large”, “part and parcel” • “What time is it?” vs “How late is it?” • “How old are you”: <How old VBEPro>

Three Design Principles of Language (3) • Principle 3 (Redundancy, consequence of principles 1 and 2): An adequate model of human language processing and learning must be massively redundant. • Theoretical linguistics is often based on the assumption that language knowledge is categorical, homogeneous and parsimonious. • Here we emphasize that language use is probabilistic, heterogeneous and redundant.

Two Views on Language Acquisition: Rules vs Exemplars • Rule-based view: • Humans acquire grammar by abstraction of linguistic experience guided by a minimal set of universal rules and constraints (Nativism) • Exemplar-based view: • Humans acquire grammar by generalizing over previous utterances, by probabilistically recombining redundant chunks (Empiricism)

How can we use the principles of probability, heterogeneity and redundancy to develop a model of language learning? • Example: A class of models that follows from the principles above is Data-Oriented Parsing or DOP (Scha 1990, Bod 1992) • For syntax: DOP’s building blocks are “fragments” of arbitrary size and shape • Decompose corpus of previous structures into fragments • Recompose fragments incrementally into structures for new utterances, taking into account frequency and recency • Simplest instantiation of DOP based on tree structures • Fragments are “subtrees” of any size or shape (Bod 1992; Kaplan 1996; Goodman 2003; Zuidema 2010 a.o.)
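
The decomposition step can be sketched in Python as follows. This is a minimal illustration, not the implementation from the cited papers: trees are assumed to be nested tuples, the names `TREE`, `rooted_fragments`, `all_fragments` and `relative_frequencies` are hypothetical, and the weighting shown (count of a fragment divided by the count of all fragments with the same root label) is just one simple frequency-based estimator. Recency and the recomposition step are left out.

```python
from collections import Counter
from itertools import product

# Assumed encoding: a tree is (label, child, ...); a child is a tree or a word (str).
TREE = ("S", ("NP", "John"), ("VP", ("V", "likes"), ("NP", "Mary")))


def rooted_fragments(node):
    """All fragments rooted at `node`.

    Each nonterminal child is either cut off (kept as a bare label,
    i.e. an open slot) or expanded into one of its own fragments.
    Words are always kept with their parent.
    """
    label, *children = node
    options = []
    for child in children:
        if isinstance(child, str):
            options.append([child])
        else:
            options.append([(child[0],)] + rooted_fragments(child))
    return [(label, *combo) for combo in product(*options)]


def all_fragments(tree):
    """Fragments rooted at every nonterminal node of `tree`."""
    frags = rooted_fragments(tree)
    for child in tree[1:]:
        if not isinstance(child, str):
            frags.extend(all_fragments(child))
    return frags


def relative_frequencies(corpus):
    """Weight each fragment by its count relative to fragments with the same root."""
    counts = Counter(f for tree in corpus for f in all_fragments(tree))
    root_totals = Counter()
    for frag, n in counts.items():
        root_totals[frag[0]] += n
    return {frag: n / root_totals[frag[0]] for frag, n in counts.items()}


fragments = all_fragments(TREE)
weights = relative_frequencies([TREE])
```

For this single tree the sketch yields 17 fragments, ranging from one-word pieces such as `("NP", "John")` to the fully open rule `("S", ("NP",), ("VP",))` and the complete sentence structure, which shows how quickly the fragment store becomes heterogeneous and redundant.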

Do we agree on requirements for modeling? • Model patterns of performance: e.g. various stages in language learning • Produce right results: set of phonetic categories, word-object mappings, interpretations of sentences, linguistic phenomena (aux fronting, wh questions…) • Assume as little a priori knowledge as possible (cf. Clark and Lappin 2011) • …

Conclusion: do we agree on the most basic issues? • Representations -- related to linguistic theories • Data -- performance data/input data • Categories -- at several levels (phonemes, parts of speech, syntactic constituents, semantics…) • Experiments and models of them • Model requirements (see previous slides) • All are open for discussion in this workshop!

Thank You
