The SALSA Project focuses on the semantic annotation of over 0.8 million words of syntactically annotated German newspaper text (TIGER Corpus). Employing frames and frame elements from the Berkeley FrameNet Database, it addresses cross-lingual divergencies and corpus-driven lexicon development. The SALSA initiative leverages both manual and automatic annotation tools, including Fred, Rosy, and Shalmaneser, to achieve consistent and comprehensive semantic analysis. Future work includes bootstrapping frame information and linking resources with ontologies for improved textual entailment applications.
SALSA: The Saarbrücken Lexical Semantics Annotation & Acquisition Project Aljoscha Burchardt, Katrin Erk, Anette Frank, Andrea Kowalski, Sebastian Pado, Manfred Pinkal
Semantic Annotation in SALSA • Manual semantic annotation of • 0.8 million words of syntactically annotated German newspaper text (TIGER Corpus, Releases 1, 2) • with frames and frame elements (Berkeley FrameNet Database), staying as close as possible to the Berkeley FrameNet database
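To make "frames and frame elements" concrete, here is a minimal sketch of how one annotated predicate instance could be represented. It is an illustration only and not the SALSA or TIGER file format: the class name `FrameInstance`, its fields, and the example sentence are assumptions made for this sketch, and the frame and role labels follow Berkeley FrameNet naming conventions.

```python
from dataclasses import dataclass, field


@dataclass
class FrameInstance:
    """One annotated predicate: a frame evoked by a target word, plus its roles.

    Hypothetical structure for illustration; not the SALSA release format.
    """
    frame: str                                     # frame label, e.g. "Commerce_buy"
    target: str                                    # the frame-evoking word in the sentence
    elements: dict = field(default_factory=dict)   # frame element name -> text span

    def roles(self):
        """Return the annotated frame element names in alphabetical order."""
        return sorted(self.elements)


# Invented German example sentence (not taken from the TIGER Corpus).
sentence = "Peter kauft ein Buch"
instance = FrameInstance(
    frame="Commerce_buy",
    target="kauft",
    elements={"Buyer": "Peter", "Goods": "ein Buch"},
)

print(instance.frame, instance.roles())
```

In a real corpus the spans would point to nodes of the syntactic tree rather than to raw strings; plain strings keep the sketch self-contained.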
SALSA: What's special? • SALSA is about German • Cross-lingual divergencies?
Cross-lingual Divergencies • Convincing cross-lingual portability results (ED) in general • Adaptation necessary because of • Inappropriate granularity of distinctions between FEs • Missing FEs • (Rare cases of) inappropriate granularity of frames
SALSA: What's special? • SALSA is about German • Cross-lingual divergencies? • Corpus-driven lexicon development through exhaustive full-text annotation • Difficult cases • Incompleteness of Berkeley FrameNet
Difficult cases • Metaphors • Support Verb Constructions • Idioms
SALSA corpus: Release I • Total size of 20,000 annotated instances • Consistent annotation through different verification steps • All occurrences/readings of > 400 German verbal predicates (different frequency bands) • Scheduled for Summer 2006
SALSA II: Automatic Annotation and Acquisition • Fred, Rosy, and Shalmaneser: A tool-chain for shallow semantic analysis Talk by Katrin and Sebastian
SALSA II: Automation • Fred, Rosy, and Shalmaneser: A tool-chain for shallow semantic analysis Talk by Katrin and Sebastian • The Detour System (through WordNet to FrameNet) Talk by Anette and Al
Fred & Rosy Fred, Detour & Rosy
SALSA II: Automation • Fred, Rosy, and Shalmaneser: A tool-chain for shallow semantic analysis Talk by Katrin and Sebastian • The Detour System (through WordNet to FrameNet) Talk by Anette and Al • Cross-lingual projection of frame-semantic information Katrin and Sebastian
SALSA II: Automation & Application • Fred, Rosy, and Shalmaneser: A tool-chain for shallow semantic analysis Talk by Katrin and Sebastian • The Detour System (through WordNet to FrameNet) Talk by Anette and Al • Cross-lingual projection of frame-semantic information Katrin and Sebastian • Textual Entailment (RTE) Anette and Al
h: Aki Kaurismäki directed a film. t: In 1983, Aki Kaurismäki directed his first full-time feature.
h: Aki Kaurismäki directed a film.
t: In 1983, Aki Kaurismäki directed his first full-time feature.
Links between h and t: grammatically related • WordNet related
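The idea behind the example pair can be sketched as a toy coverage check: every content word of the hypothesis h should be matched in the text t, either directly or through a lexical relation. This is a rough illustration under stated assumptions and not the system presented in the talk. The `RELATED` table is a hand-written stand-in for a WordNet lookup, the stopword list is ad hoc, and the function names are invented for this sketch. Grammatical relations are ignored entirely.

```python
import re

# Hypothetical stand-in for a WordNet relatedness lookup (hand-written).
RELATED = {"film": {"feature", "movie"}}

# Ad hoc list of function words to ignore.
STOPWORDS = {"a", "an", "the", "his", "in"}


def content_words(sentence):
    """Lowercase the sentence and return its non-stopword tokens in order."""
    return [w for w in re.findall(r"\w+", sentence.lower()) if w not in STOPWORDS]


def uncovered_words(hypothesis, text):
    """Return hypothesis content words with no identical or related word in text."""
    text_words = set(content_words(text))
    missing = []
    for word in content_words(hypothesis):
        if word in text_words:
            continue  # direct lexical match
        if RELATED.get(word, set()) & text_words:
            continue  # matched through the relatedness table
        missing.append(word)
    return missing


h = "Aki Kaurismäki directed a film."
t = "In 1983, Aki Kaurismäki directed his first full-time feature."

print(uncovered_words(h, t))
```

An empty result means every content word of h has a counterpart in t. Pure word overlap of this kind says nothing about who did what to whom, which is the gap frame-semantic information is meant to address.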
SALSA: Future Work • Bootstrapping frame information by data expansion techniques • Linking lexical semantic resources with upper-model ontologies • Analysis of non-compositional phenomena • A worked-out semantic lexicon • Application to textual entailment