
Computational Linguistics Introduction


lenora

Presentation Transcript


  1. Computational Linguistics Introduction Context Free Grammars CLINT-LN CFG

  2. Chomsky Hierarchy

  3. Context Free Grammars (CFGs)
     Sets of rules expressing how symbols of the language fit together, e.g.
       S -> NP VP
       NP -> Det N
       Det -> the
       N -> dog

  4. What Does Context Free Mean?
     • LHS of rule is just one symbol.
     • Can have NP -> Det N
     • Cannot have X NP Y -> X Det N Y
     • Everyday examples of context sensitivity

  5. What Does Context Free Mean?
     • LHS of rule is just one symbol.
     • Can have NP -> Det N
     • Cannot have X NP Y -> X Det N Y
     • Everyday examples of context sensitivity:
         y e -> i e
         [b] # -> [p] #
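
The "LHS is just one symbol" test can be sketched in a few lines of Python. This is only an illustration: the rule representation (a pair of tuples) and the name `is_context_free` are assumptions made here, not part of the slides.

```python
# Each rule is a pair (lhs, rhs); both sides are tuples of symbols.
# This representation is an assumption made for the sketch.

def is_context_free(rules):
    """Return True if every rule has exactly one symbol on its left-hand side."""
    return all(len(lhs) == 1 for lhs, _rhs in rules)

# NP -> Det N : a single symbol on the left, so it is allowed.
cf_rules = [(("NP",), ("Det", "N"))]

# X NP Y -> X Det N Y : the rewrite of NP depends on its neighbours X and Y.
cs_rules = [(("X", "NP", "Y"), ("X", "Det", "N", "Y"))]

print(is_context_free(cf_rules))  # True
print(is_context_free(cs_rules))  # False
```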

  6. Grammar Symbols
     • Non Terminal Symbols
     • Terminal Symbols
     • Words
     • Preterminals

  7. Non Terminal Symbols
     • Symbols which have definitions
     • Symbols which appear on the LHS of rules
       S -> NP VP
       NP -> Det N
       Det -> the
       N -> dog

  8. Non Terminal Symbols
     • The same non-terminal can have several definitions
       S -> NP VP
       NP -> Det N
       NP -> N
       Det -> the
       N -> dog

  9. Terminal Symbols
     • Symbols which appear in the final string
     • Correspond to words
     • Are not defined by the grammar
       S -> NP VP
       NP -> Det N
       Det -> the
       N -> dog

  10. Parts of Speech (POS)
      • NT symbols which produce terminal symbols are sometimes called pre-terminals
        S -> NP VP
        NP -> Det N
        Det -> the
        N -> dog
      • Sometimes we are interested in the shape of sentences formed from pre-terminals
        Det N V
        Aux N V D N
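
The three kinds of symbol can be read off a rule list mechanically. The sketch below uses the toy grammar from the slides; the rules `VP -> V` and `V -> barks` are added here purely so that every symbol is defined, and the function name `classify` is hypothetical.

```python
# Toy grammar from the slides, as (lhs, rhs) pairs.
RULES = [
    ("S", ["NP", "VP"]),
    ("NP", ["Det", "N"]),
    ("Det", ["the"]),
    ("N", ["dog"]),
    ("VP", ["V"]),      # assumption: not on the slides, added to define VP
    ("V", ["barks"]),   # assumption: not on the slides
]

def classify(rules):
    """Split grammar symbols into non-terminals, terminals and pre-terminals."""
    # Non-terminals are the symbols that appear on the LHS of some rule.
    nonterminals = {lhs for lhs, _ in rules}
    # Terminals appear on a RHS but are never defined by the grammar.
    terminals = {sym for _, rhs in rules for sym in rhs} - nonterminals
    # Pre-terminals are non-terminals that rewrite directly to terminals.
    preterminals = {lhs for lhs, rhs in rules if all(s in terminals for s in rhs)}
    return nonterminals, terminals, preterminals

nonterminals, terminals, preterminals = classify(RULES)
print(sorted(nonterminals))  # ['Det', 'N', 'NP', 'S', 'V', 'VP']
print(sorted(terminals))     # ['barks', 'dog', 'the']
print(sorted(preterminals))  # ['Det', 'N', 'V']
```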

  11. CFG - formal definition
      A CFG is a tuple (N, Σ, R, S)
      • N is a set of non-terminal symbols
      • Σ is a set of terminal symbols disjoint from N
      • R is a set of rules, each of the form A -> α, where A is a non-terminal and α is a string of terminals and non-terminals
      • S is a designated start symbol

  12. CFG - Example
      grammar:
        S -> NP VP
        NP -> N
        VP -> V NP
      lexicon:
        V -> kicks
        N -> John
        N -> Bill
      N = {S, NP, VP, N, V}
      Σ = {kicks, John, Bill}
      R = (the grammar and lexicon rules above)
      S = "S"
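
As a rough illustration, the four components of this example can be written down directly and used to enumerate the language. The names `SIGMA`, `START` and `expand` are choices made for this sketch, and the exhaustive expansion only terminates because this particular grammar has no recursive rules.

```python
from itertools import product

N = {"S", "NP", "VP", "N", "V"}      # non-terminal symbols
SIGMA = {"kicks", "John", "Bill"}    # terminal symbols, disjoint from N
R = {                                # rules, grouped by left-hand side
    "S": [["NP", "VP"]],
    "NP": [["N"]],
    "VP": [["V", "NP"]],
    "V": [["kicks"]],
    "N": [["John"], ["Bill"]],
}
START = "S"                          # designated start symbol

def expand(symbol):
    """Return every terminal string (a tuple of words) derivable from symbol."""
    if symbol in SIGMA:
        return [(symbol,)]
    results = []
    for rhs in R[symbol]:
        # Expand each RHS symbol, then combine the alternatives left to right.
        parts = [expand(s) for s in rhs]
        for combo in product(*parts):
            results.append(tuple(word for part in combo for word in part))
    return results

sentences = sorted(" ".join(words) for words in expand(START))
print(sentences)
# ['Bill kicks Bill', 'Bill kicks John', 'John kicks Bill', 'John kicks John']
```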

  13. Class Exercise
      • Write grammars that generate the following languages, for m > 0
        (ab)^m
        a^n b^m
        a^n b^n
      • Which of these are Regular?
      • Which of these are Context Free?

  14. (ab)^m for m > 0
      S -> a b
      S -> a b S

  15. (ab)^m for m > 0
      S -> a b
      S -> a b S

      S -> a X
      X -> b Y
      Y -> a b
      Y -> S

  16. a^n b^m
      S -> A B
      A -> a
      A -> a A
      B -> b
      B -> b B

  17. a^n b^m
      S -> A B
      A -> a
      A -> a A
      B -> b
      B -> b B

      S -> a AB
      AB -> a AB
      AB -> B
      B -> b
      B -> b B
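
One way to explore the "regular or context free?" question is to try writing a recogniser for each language. The sketch below assumes n, m > 0 throughout. The first two languages are regular, so an ordinary regular expression suffices. a^n b^n is the standard textbook example of a language that is context free but not regular, so it is recognised here by following the rules `S -> a b` and `S -> a S b`; that grammar is not given on the slides and is supplied here as an assumption.

```python
import re

AB_M = re.compile(r"(ab)+")    # (ab)^m, m > 0
AN_BM = re.compile(r"a+b+")    # a^n b^m, n > 0 and m > 0

def is_an_bn(s):
    """Recognise a^n b^n (n > 0) by following S -> a b and S -> a S b."""
    if s == "ab":                                   # S -> a b
        return True
    # S -> a S b: strip one 'a' from the front and one 'b' from the back.
    return len(s) > 2 and s[0] == "a" and s[-1] == "b" and is_an_bn(s[1:-1])

print(bool(AB_M.fullmatch("ababab")))   # True
print(bool(AN_BM.fullmatch("aabbb")))   # True
print(is_an_bn("aabb"))                 # True
print(is_an_bn("aabbb"))                # False
```

The regular expressions need no memory of how many symbols have been seen, whereas the a^n b^n recogniser has to pair each `a` with a matching `b`.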

  18. Grammar Defines a Structure
      grammar:
        S -> NP VP
        NP -> N
        VP -> V NP
      lexicon:
        V -> kicks
        N -> John
        N -> Bill
      tree:
        [S [NP [N John]] [VP [V kicks] [NP [N Bill]]]]

  19. Different Grammar, Different Structure
      grammar:
        S -> NP NP
        NP -> N V
        NP -> N
      lexicon:
        V -> kicks
        N -> John
        N -> Bill
      tree:
        [S [NP [N John] [V kicks]] [NP [N Bill]]]
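
The two analyses of "John kicks Bill" can be compared by writing each tree as nested tuples. This is a minimal sketch: the tuple encoding and the helper names `leaves` and `bracket` are assumptions made for illustration.

```python
# A tree is (label, child, child, ...); a leaf is a plain string.
TREE_1 = ("S", ("NP", ("N", "John")),
               ("VP", ("V", "kicks"), ("NP", ("N", "Bill"))))
TREE_2 = ("S", ("NP", ("N", "John"), ("V", "kicks")),
               ("NP", ("N", "Bill")))

def leaves(tree):
    """Return the words at the leaves, left to right."""
    if isinstance(tree, str):
        return [tree]
    return [word for child in tree[1:] for word in leaves(child)]

def bracket(tree):
    """Render a tree in labelled bracket notation."""
    if isinstance(tree, str):
        return tree
    return "[" + tree[0] + " " + " ".join(bracket(c) for c in tree[1:]) + "]"

print(leaves(TREE_1) == leaves(TREE_2))  # True: same sentence
print(bracket(TREE_1))  # [S [NP [N John]] [VP [V kicks] [NP [N Bill]]]]
print(bracket(TREE_2))  # [S [NP [N John] [V kicks]] [NP [N Bill]]]
```

Both trees yield the same string, so the difference between the grammars lies only in the structure they assign.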

  20. Which Grammar is Best?
      • The structure assigned by the grammar should be appropriate.
      • The structure should
        • Be understandable
        • Allow us to make generalisations.
        • Reflect the underlying meaning of the sentence.

  21. Ambiguity
      • A grammar is ambiguous if it assigns two or more structures to the same sentence.
        NP -> NP CONJ NP
        NP -> N
        lexicon:
        CONJ -> and
        N -> John
        N -> Bill
      • The grammar should not generate too many possible structures for the same sentence.
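
To see how quickly the number of structures grows under this grammar, one can count the NP trees for longer and longer coordinations. The sketch below is a simple chart-style counter written for these two rules only; the function name `count_np_parses` is hypothetical.

```python
from functools import lru_cache

LEXICON = {"and": "CONJ", "John": "N", "Bill": "N"}

def count_np_parses(words):
    """Count the NP trees for words under NP -> NP CONJ NP and NP -> N."""
    tags = tuple(LEXICON[w] for w in words)

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of NP trees spanning tags[i:j].
        total = 0
        if j - i == 1 and tags[i] == "N":          # NP -> N
            total += 1
        for k in range(i + 1, j - 1):              # NP -> NP CONJ NP, CONJ at k
            if tags[k] == "CONJ":
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tags))

print(count_np_parses("John and Bill".split()))                        # 1
print(count_np_parses("John and Bill and John".split()))               # 2
print(count_np_parses("John and Bill and John and Bill".split()))      # 5
```

Three conjoined nouns already receive two structures, and four receive five, which is the kind of growth the last bullet warns against.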

  22. Criteria for Evaluating Grammars
      • Does it undergenerate?
      • Does it overgenerate?
      • Does it assign appropriate structures to sentences it generates?
      • Is it simple to understand? How many rules are there?
      • Does it contain just a few generalisations or is it full of special cases?
      • How ambiguous is it? How many structures does it assign for a given sentence?

  23. PATR-II
      • The PATR-II formalism can be viewed as a computer language for encoding linguistic information.
      • A PATR-II grammar consists of a set of rules and a lexicon.
      • The rules are CFG rules augmented with constraints.
      • The lexicon provides information about terminal symbols.

  24. Example PATR-II Grammar and Lexicon
      Grammar (grammar.grm)
        Rule s -> np vp
        Rule np -> n
        Rule vp -> v
      Lexicon (lexicon.lex)
        \w uther
        \c n
        \w sleeps
        \c v
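
The division of labour between rules and lexicon can be imitated in plain Python. This is not PATR-II itself and ignores constraints entirely. It only assumes, from the slide, that `\w` introduces a word and `\c` gives its category; the helper names `read_lexicon`, `derives` and `accepts` are invented for this sketch.

```python
LEXICON_TEXT = r"""
\w uther
\c n
\w sleeps
\c v
"""

# The three context-free rules from the slide, as (lhs, rhs) pairs.
RULES = [("s", ["np", "vp"]), ("np", ["n"]), ("vp", ["v"])]

def read_lexicon(text):
    """Map each word to its category, reading \\w and \\c fields in order."""
    lexicon, word = {}, None
    for line in text.split("\n"):
        fields = line.split()
        if len(fields) != 2:
            continue
        marker, value = fields
        if marker == r"\w":
            word = value
        elif marker == r"\c" and word is not None:
            lexicon[word] = value
    return lexicon

def derives(symbol, cats):
    """True if symbol rewrites to exactly the category sequence cats."""
    if [symbol] == cats:
        return True
    for lhs, rhs in RULES:
        if lhs != symbol:
            continue
        if len(rhs) == 1 and derives(rhs[0], cats):
            return True
        if len(rhs) == 2:
            # Try every way of splitting cats between the two RHS symbols.
            for k in range(1, len(cats)):
                if derives(rhs[0], cats[:k]) and derives(rhs[1], cats[k:]):
                    return True
    return False

def accepts(sentence, lexicon):
    """Look each word up in the lexicon, then check the categories against s."""
    words = sentence.split()
    if any(w not in lexicon for w in words):
        return False
    return derives("s", [lexicon[w] for w in words])

lexicon = read_lexicon(LEXICON_TEXT)
print(lexicon)                            # {'uther': 'n', 'sleeps': 'v'}
print(accepts("uther sleeps", lexicon))   # True
print(accepts("sleeps uther", lexicon))   # False
```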
