
Grammar Engineering: OT Marks for Parse Ranking, Generation

Grammar Engineering: OT Marks for Parse Ranking, Generation. Miriam Butt (University of Konstanz) and Martin Forst (NetBase Solutions). Colombo 2014. OT Marks. OT = Optimality Theory. Classic OT only knows constraints, i.e. dispreferences.


Presentation Transcript


  1. Grammar Engineering: OT Marks for Parse Ranking, Generation. Miriam Butt (University of Konstanz) and Martin Forst (NetBase Solutions). Colombo 2014

  2. OT Marks • OT = Optimality Theory • Classic OT only knows constraints, i.e. dispreferences. • OT as implemented in XLE uses both dispreference marks (the default) and preference marks (prefixed with +) • Classic OT assumes a simple hierarchy of constraints • OT as implemented in XLE uses a “structured hierarchy”

  3. OT Marks (cont’d) • OT marks can be introduced in lexicon entries and in rules • OT marks are projected to a separate projection, the o-structure • The o-structure (unlike the c- and the f-structure) is not really structured; just view it as a bag of OT marks • Notation: OTMarkName $ o::*

  4. OPTIMALITYORDER • Part of the grammar header • Can be modified for grammar customization • OPTIMALITYORDER is for parsing • GENOPTIMALITYORDER is for generation • OT marks can be organized into groups of equal rank, written in parentheses • Example: OPTIMALITYORDER DisprefMark1 +PrefMark1 DisprefMark2 (DisprefMark3 DisprefMark4)
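To make the structure of such a declaration concrete, here is a minimal Python sketch that splits the mark list from the slide's example into ranked tiers. It is not XLE code: the function name `parse_optimality_order` and the tuple representation are assumptions made for illustration, and the sketch treats the input as the mark list only (without the `OPTIMALITYORDER` keyword).

```python
import re

def parse_optimality_order(spec):
    """Split an OPTIMALITYORDER-style mark list into ranked tiers.

    Returns a list of tiers, leftmost (most important) first. Each tier is a
    list of (mark_name, is_preference) pairs; marks inside parentheses share
    one tier, i.e. they have equal rank. (Hypothetical helper, not XLE's API.)
    """
    tokens = re.findall(r"\(|\)|[^\s()]+", spec)
    tiers, group = [], None
    for tok in tokens:
        if tok == "(":
            if group is not None:
                raise ValueError("nested groups are not handled in this sketch")
            group = []
        elif tok == ")":
            if group is None:
                raise ValueError("unbalanced ')'")
            tiers.append(group)
            group = None
        else:
            # A leading '+' flags a preference mark; everything else is a
            # dispreference mark (the default).
            mark = (tok.lstrip("+"), tok.startswith("+"))
            if group is None:
                tiers.append([mark])
            else:
                group.append(mark)
    if group is not None:
        raise ValueError("unbalanced '('")
    return tiers

tiers = parse_optimality_order(
    "DisprefMark1 +PrefMark1 DisprefMark2 (DisprefMark3 DisprefMark4)"
)
for rank, tier in enumerate(tiers, start=1):
    print(rank, tier)
```

The example yields four tiers: three single-mark tiers followed by one tier holding DisprefMark3 and DisprefMark4 together.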

  5. Ranking Parses with OT Marks • Start on the left of OPTIMALITYORDER • Keep parses with fewest instances of DisprefMark1; consider all others suboptimal • Among remaining parses, keep those with most instances of PrefMark1; consider all others suboptimal • Among remaining parses, keep those with fewest instances of DisprefMark2; consider all others suboptimal • Etc.
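The filtering procedure on this slide can be sketched in a few lines of Python. This is an illustration of the described steps, not XLE's implementation: the parse names, the dictionary layout, and the function `rank_parses` are all hypothetical, and equal-rank groups are left out for brevity.

```python
from collections import Counter

def rank_parses(parses, order):
    """Return the names of the optimal parses.

    parses: dict mapping a parse name to the list of OT marks it carries
            (its "bag" of marks).
    order:  list of marks, most important first; '+' prefix = preference mark.
    """
    survivors = dict(parses)
    for entry in order:
        if not survivors:
            break
        prefer = entry.startswith("+")
        mark = entry.lstrip("+")
        counts = {name: Counter(marks)[mark] for name, marks in survivors.items()}
        # Preference marks: keep the parses with the most instances.
        # Dispreference marks: keep the parses with the fewest instances.
        best = max(counts.values()) if prefer else min(counts.values())
        survivors = {n: m for n, m in survivors.items() if counts[n] == best}
    return sorted(survivors)

# Hypothetical parses of one sentence, each with its bag of OT marks.
parses = {
    "A": ["DisprefMark1"],
    "B": ["PrefMark1"],
    "C": ["DisprefMark2"],
    "D": ["PrefMark1", "DisprefMark2"],
}
optimal = rank_parses(parses, ["DisprefMark1", "+PrefMark1", "DisprefMark2"])
print(optimal)  # ['B']
```

Parse A drops out on DisprefMark1, C drops out for lacking PrefMark1, and D loses to B on DisprefMark2.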

  6. Special Marks in OPTIMALITYORDER • Without special marks in OPTIMALITYORDER, all OT marks are used for ranking the parses after parsing proper has finished • Special marks can be introduced to make OT marks interact with the parsing process • NOGOOD • CSTRUCTURE • STOPPOINT

  7. NOGOOD OT Marks • If (part of) a lexicon entry or a rule projects an OT mark that is listed to the left of NOGOOD in OPTIMALITYORDER, that part of the grammar is deactivated. • Might be used for expensive constructions or for particular readings of ambiguous lexical items that are known to be of little or no importance in the application domain.
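As a rough illustration of the deactivation described above, the following Python sketch drops every grammar part that carries a mark listed to the left of NOGOOD. The entry names, the mark `RareReading`, and both helper functions are hypothetical; real XLE applies this inside the grammar, not over a Python list.

```python
def nogood_marks(order):
    """Return the set of marks listed to the left of NOGOOD (empty if absent)."""
    tokens = order.split()
    if "NOGOOD" not in tokens:
        return set()
    return {t.lstrip("+") for t in tokens[: tokens.index("NOGOOD")]}

def active_entries(entries, order):
    """Keep only the entries that project no NOGOOD mark.

    entries: list of (entry_name, set_of_OT_marks) pairs.
    """
    blocked = nogood_marks(order)
    return [name for name, marks in entries if blocked.isdisjoint(marks)]

# Hypothetical lexicon fragment: one reading of "bank" is irrelevant in the
# application domain and is therefore tagged with a NOGOOD mark.
entries = [
    ("bank-N-finance", set()),
    ("bank-N-river", {"RareReading"}),
    ("bank-V", {"Guessed"}),
]
active = active_entries(entries, "RareReading NOGOOD Guessed")
print(active)  # ['bank-N-finance', 'bank-V']
```

Moving `RareReading` to the right of NOGOOD would reactivate the reading without touching the lexicon itself, which is the customization benefit the slide points to.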

  8. CSTRUCTURE OT Marks • Intended for better performance • Resolving f-annotations is far more expensive computationally than determining possible c-structures • If we can discard certain c-structures early on, we do not even need to start resolving the associated f-annotations • Example: Guessed +MWE CSTRUCTURE
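The performance argument can be illustrated with a small Python sketch: candidate c-structures are compared on their CSTRUCTURE marks first, and the expensive step runs only for the best ones. Everything here is an assumption for illustration (the candidate labels, `ot_key`, and the stand-in `expensive_resolve`). The comparison uses a lexicographic key, which selects the same candidates as filtering mark by mark from left to right.

```python
def ot_key(marks, order):
    """Lexicographic key for a bag of marks; a lower key is better.

    Dispreference marks count positively, preference marks ('+') negatively.
    """
    return tuple(
        -marks.count(m[1:]) if m.startswith("+") else marks.count(m)
        for m in order
    )

def parse_with_cstructure_filter(cstructs, order, resolve):
    """Resolve f-annotations only for the best-ranked c-structures.

    cstructs: list of (label, list_of_marks) pairs.
    resolve:  stand-in for the expensive f-annotation resolution step.
    """
    if not cstructs:
        return []
    best = min(ot_key(marks, order) for _, marks in cstructs)
    return [resolve(label) for label, marks in cstructs
            if ot_key(marks, order) == best]

calls = []

def expensive_resolve(label):
    calls.append(label)  # record how often the costly step actually runs
    return f"f-structure for {label}"

# Hypothetical candidates; marks follow the slide's "Guessed +MWE CSTRUCTURE".
candidates = [("c1", ["Guessed"]), ("c2", ["MWE"]), ("c3", [])]
results = parse_with_cstructure_filter(
    candidates, ["Guessed", "+MWE"], expensive_resolve
)
print(results, calls)
```

Only c2 reaches the resolver here: c1 is discarded for using a guessed word, and c3 loses to c2 because it lacks the preferred multiword-expression analysis.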

  9. STOPPOINT OT Marks • Also intended for better performance • Only beneficial when used cautiously • (Parts of) lexical entries and rules marked with STOPPOINT OT marks are not used for the first parsing attempt • If the first attempt is unsuccessful, the parser activates those lexicon or rule parts and makes a second attempt • Example: Mark1 Mark2 STOPPOINT
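The two-attempt behaviour can be sketched as follows. This is a toy model under stated assumptions: the lexicon, the word-coverage "parser" `toy_parse`, and the function `parse_with_stoppoint` are hypothetical and only mimic the control flow described on the slide.

```python
def parse_with_stoppoint(sentence, entries, stoppoint_marks, parse):
    """Try the core grammar first; fall back to the full grammar on failure.

    Returns (result, attempt_number).
    """
    core = [e for e in entries if stoppoint_marks.isdisjoint(e["marks"])]
    result = parse(sentence, core)
    if result:
        return result, 1
    # First attempt failed: activate the STOPPOINT-marked parts and retry.
    return parse(sentence, entries), 2

def toy_parse(sentence, entries):
    """Toy stand-in for a parser: succeeds if every word has an entry."""
    known = {e["word"] for e in entries}
    words = sentence.split()
    return words if all(w in known for w in words) else None

# Hypothetical lexicon; the last entry is only used in the second attempt.
lexicon = [
    {"word": "dogs", "marks": set()},
    {"word": "bark", "marks": set()},
    {"word": "barks", "marks": {"Mark1"}},
]
first = parse_with_stoppoint("dogs bark", lexicon, {"Mark1", "Mark2"}, toy_parse)
second = parse_with_stoppoint("dogs barks", lexicon, {"Mark1", "Mark2"}, toy_parse)
print(first, second)
```

The sketch also hints at why caution is advised: every input that needs a STOPPOINT-marked part pays for a failed first attempt before the second one starts.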

  10. Examples of Potential OT Marks • Prefer OBL interpretations of PPs over ADJUNCT interpretations. Example: The zookeeper waited for the gorilla. • Prefer ditransitive subcategorization frames over transitive ones. Example: The girl gave her brother money.

  11. Generation • XLE can generate strings from well-formed f-structures. • GENOPTIMALITYORDER can differ from OPTIMALITYORDER, both in the OT marks used and in their ranking • Transducers can also differ; typically, the generation tokenizer is more restrictive than the parsing tokenizer

  12. Generation • For our purposes, we will parse the sentences from our exercises and regenerate them. • Go to the “Commands” menu of your f-structure window (bottom left) and select “Generate from this FS”
