This short presentation surveys Owen Rambow's research interests in syntax representation and computational linguistics, centered on Tree Adjoining Grammar (TAG) and its relation to discourse, prosody, and semantics. Key areas include sentence planning and realization for generation, dependency parsing, and the use of machine learning in dialog systems. It covers his 1994 thesis on combining fixed and free word order in German syntax, ongoing work on multimedia and speech output, and the transformation of TAG trees into finite-state machines for dependency parsing. It also notes joint work on realization and parsing, and a project mapping PropBank annotations to more semantic annotation schemes such as VerbNet, LCS, and Prague.
Owen Rambow 6 Minutes
Interests
• Theory
  • Syntax: its representation and computation (TAG), and its relation to discourse, prosody, semantics …
• Technologies/Resources
  • Generation: sentence planning & realization
  • Dependency parsing
  • Dependency corpora
• Applications
  • Generation applications (reports)
  • Machine translation
  • Dialog systems
  • Summarization
Thesis (1994)
• Formal representation for German syntax
• Issue: combination of fixed and free word order
• Solution: use descriptions of trees instead of the trees themselves (which TAG uses)
• Current syntax interests: Tagalog, Arabic
Generation: Sentence Planning
• Work with Lyn Walker
• Issue: choosing syntactic constructions to achieve communicative goals in dialog systems
• Idea: use machine learning on preference-ranked options
• Ongoing: multimedia/speech output (with Noémie)
• Open issue: individual preferences
• Summarization?
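The idea of learning from preference-ranked options can be sketched as a pairwise ranker. This is only an illustration under stated assumptions: it uses a simple perceptron, which is not necessarily the learner used in the work described, and the feature names (`sentences`, `conjunction`, `relative_clause`) and training pairs are hypothetical.

```python
from typing import Dict, List, Tuple

Features = Dict[str, float]  # hypothetical features of one sentence-plan option


def score(weights: Dict[str, float], feats: Features) -> float:
    """Linear score of one candidate sentence plan."""
    return sum(weights.get(name, 0.0) * value for name, value in feats.items())


def train_pairwise(pairs: List[Tuple[Features, Features]],
                   epochs: int = 10) -> Dict[str, float]:
    """Perceptron over (preferred, dispreferred) pairs of options."""
    weights: Dict[str, float] = {}
    for _ in range(epochs):
        for better, worse in pairs:
            # Update only when the preferred option is not ranked strictly higher.
            if score(weights, better) <= score(weights, worse):
                for name, value in better.items():
                    weights[name] = weights.get(name, 0.0) + value
                for name, value in worse.items():
                    weights[name] = weights.get(name, 0.0) - value
    return weights


def best(weights: Dict[str, float], candidates: List[Features]) -> int:
    """Index of the highest-scoring candidate."""
    return max(range(len(candidates)), key=lambda i: score(weights, candidates[i]))


# Hypothetical preference data: raters preferred aggregated, shorter plans.
PAIRS = [
    ({"conjunction": 1.0, "sentences": 1.0}, {"conjunction": 0.0, "sentences": 3.0}),
    ({"relative_clause": 1.0, "sentences": 2.0}, {"sentences": 3.0}),
]
WEIGHTS = train_pairwise(PAIRS)
```

Calling `best(WEIGHTS, candidates)` on a new list of feature dictionaries returns the index of the option the learned weights prefer. Per-user weights would be one way to approach the open issue of individual preferences, though that is speculation beyond the slide.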
Generation: Realization
• Joint work with Srinivas Bangalore, John Chen
• Use of declarative TAG grammar (hand-written or extracted)
• Approach: stochastic choice on trees using tree-based & linear language models
• Open issues: factoring of syntactic/lexical choice
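The linear half of that stochastic choice can be sketched as picking the candidate word order that a language model scores highest. The sketch below assumes an add-one-smoothed bigram model and a toy corpus, both chosen for brevity; the tree-based model and the TAG grammar are left out, and all names are hypothetical.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple

Model = Tuple[Counter, Counter, int]  # (context counts, bigram counts, vocabulary size)


def train_bigram(corpus: Sequence[Sequence[str]]) -> Model:
    """Collect bigram counts with sentence-boundary markers."""
    contexts: Counter = Counter()
    bigrams: Counter = Counter()
    vocab = {"</s>"}
    for sentence in corpus:
        tokens = ["<s>", *sentence, "</s>"]
        vocab.update(sentence)
        contexts.update(tokens[:-1])
        bigrams.update(zip(tokens[:-1], tokens[1:]))
    return contexts, bigrams, len(vocab)


def log_prob(sentence: Sequence[str], model: Model) -> float:
    """Add-one-smoothed bigram log-probability of one word sequence."""
    contexts, bigrams, vocab_size = model
    tokens = ["<s>", *sentence, "</s>"]
    return sum(
        math.log((bigrams[(a, b)] + 1) / (contexts[a] + vocab_size))
        for a, b in zip(tokens[:-1], tokens[1:])
    )


def pick_realization(candidates: List[List[str]], model: Model) -> List[str]:
    """Choose the candidate linearization the model finds most likely."""
    return max(candidates, key=lambda s: log_prob(s, model))


# Toy training corpus and candidate orders (hypothetical data).
MODEL = train_bigram([
    ["the", "flight", "leaves", "today"],
    ["the", "flight", "arrives", "today"],
])
CANDIDATES = [
    ["today", "leaves", "the", "flight"],
    ["the", "flight", "leaves", "today"],
]
```

With this toy data, `pick_realization(CANDIDATES, MODEL)` selects the second order, because every one of its bigrams was seen in training.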
Dependency Parsing
• Joint work with Alexis Nasr, Srinivas Bangalore, John Chen
• Idea: transform TAG trees into FSMs/FSTs
• Parse yields dependency tree
• Parse FSMs CKY-style
• Or, compose the FSTs into a single large FST
• Open issues: supertagging, probabilistic model, domain adaptation
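A toy version of CKY-style parsing with per-word automata might look like the following. It is a sketch under strong assumptions, not the authors' parser: the automata are written by hand rather than derived from TAG trees, each word has exactly one automaton (so no supertagging), there are no probabilities, and the names `Fsm`, `HEAD`, and `LEXICON` are invented here. Each automaton reads the categories of its word's left dependents, then its own anchor, then its right dependents.

```python
from typing import Dict, List, Optional, Set, Tuple

HEAD = "HEAD"  # label consumed when an automaton reads its own anchor word
Arcs = List[Tuple[int, int]]  # (head index, dependent index)


class Fsm:
    """Hand-written automaton for one word; state 0 is the start state."""

    def __init__(self, category: str,
                 transitions: Dict[Tuple[int, str], int], finals: Set[int]):
        self.category = category
        self.transitions = transitions
        self.finals = finals


def parse(words: List[str], lexicon: Dict[str, Fsm]) -> Optional[Arcs]:
    n = len(words)
    # chart[(i, j, h)] holds the arcs of one analysis of words[i:j] headed by h.
    chart: Dict[Tuple[int, int, int], Arcs] = {}

    def run(h: int, i: int, j: int) -> Optional[Arcs]:
        fsm = lexicon[words[h]]
        agenda = [(i, 0, [])]  # (position, automaton state, arcs so far)
        while agenda:
            pos, state, arcs = agenda.pop()
            if pos == j:
                if state in fsm.finals:
                    return arcs
                continue
            if pos == h:  # read the anchor itself
                nxt = fsm.transitions.get((state, HEAD))
                if nxt is not None:
                    agenda.append((pos + 1, nxt, arcs))
                continue
            limit = h if pos < h else j  # a dependent span may not cross the head
            for end in range(pos + 1, limit + 1):
                for d in range(pos, end):
                    sub = chart.get((pos, end, d))
                    if sub is None:
                        continue
                    nxt = fsm.transitions.get((state, lexicon[words[d]].category))
                    if nxt is not None:
                        agenda.append((end, nxt, arcs + sub + [(h, d)]))
        return None

    # CKY order: shorter spans first, so sub-spans are already in the chart.
    for length in range(1, n + 1):
        for i in range(n - length + 1):
            for h in range(i, i + length):
                arcs = run(h, i, i + length)
                if arcs is not None:
                    chart[(i, i + length, h)] = arcs
    for h in range(n):
        if (0, n, h) in chart:
            return sorted(chart[(0, n, h)])
    return None


# Hypothetical three-word lexicon.
LEXICON = {
    "the": Fsm("D", {(0, HEAD): 1}, {1}),
    "dog": Fsm("N", {(0, "D"): 1, (1, HEAD): 2}, {2}),
    "barks": Fsm("V", {(0, "N"): 1, (1, HEAD): 2}, {2}),
}
```

For `parse(["the", "dog", "barks"], LEXICON)` the result is `[(1, 0), (2, 1)]`: "dog" governs "the", and "barks" governs "dog". The parser returns the first analysis it finds; a probabilistic model, listed above as an open issue, would be needed to rank competing analyses.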
Corpora
• PropBank (Penn project): annotation of PTB with predicate-argument structures
• Verb-specific arg roles, lexicon
• Project: mapping PropBank to other, more “semantic” forms of annotation (VerbNet, LCS, Prague)
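Because PropBank argument labels are verb-specific, a mapping to thematic-role labels has to be keyed on the roleset rather than on the label alone. The sketch below illustrates that shape with a plain lookup table. The two entries are written from memory for illustration and should be checked against the actual PropBank and VerbNet lexicons; `ROLE_MAP` and `map_roles` are hypothetical names, not part of any released mapping resource.

```python
from typing import Dict, List, Tuple

# Illustrative entries only (an assumption of this sketch): note that Arg2
# maps to a different thematic role depending on the verb's roleset.
ROLE_MAP: Dict[str, Dict[str, str]] = {
    "give.01": {"Arg0": "Agent", "Arg1": "Theme", "Arg2": "Recipient"},
    "break.01": {"Arg0": "Agent", "Arg1": "Patient", "Arg2": "Instrument"},
}


def map_roles(roleset: str,
              args: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Relabel (label, text) arguments; unknown labels are left unchanged."""
    table = ROLE_MAP.get(roleset, {})
    return [(table.get(label, label), text) for label, text in args]
```

For example, `map_roles("give.01", [("Arg0", "John"), ("Arg2", "Mary")])` relabels the arguments as `Agent` and `Recipient`, while an unknown roleset passes through with its original PropBank labels.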