qamra
Uploaded by

Advancements in Case Frame Construction: Weekly Report from Semantic Web Research Center

DESCRIPTION

This weekly report outlines ongoing projects and to-do lists at the Semantic Web Research Center as of February 17, 2010. It details recent works, including scope development for case frames and verb analysis in CoreNet. The report highlights progress in extracting usages from a large POS-tagged corpus and constructing models for automatic case frame extraction. Key issues such as the challenges of handling complex predicates and the time-consuming nature of manual construction are discussed. Future plans include organizing a database for usage and developing tools for efficient case frame construction.


Presentation Transcript


  1. Weekly Report • 2010. 2. 17 • Duhyeon Jin • Semantic Web Research Center

  2. Contents • Last issues • To-do list • Works

  3. Last Issues • Scope of words to be given case frames • All verbs in CoreNet that do not yet have case frames • Predicate nominals (서술성명사) • Running the experiment • Building a model for constructing case frames

  4. To-do list • Select word entries • Experiment • Extract usages • Do the construction • Select sample words • Assign an appropriate word sense • Assign an appropriate concept to each argument • Measure the time taken and build a model • Write the problem specification • Write instructions for case frame construction

  5. Works: Selecting word entries • Selected 2,308 words • Korean verbs: 3,200 word senses from 'Survey of Modern Korean Usage Frequency (2002), National Institute of the Korean Language' (현대 국어 사용 빈도 조사, 국립국어원) • Headwords: 2,014 • Entries in CoreNet: 1,021 • CoreNet verbs (no case frame): 1,593 • Predicate nominals: 675 • CoreNet adjectives (no case frame): 40

  6. Works: Extracting Usages • Extracting POS-tagged sentences • Source: 'Sejong POS-tagged corpus' (1,006,777 sentences, 969 MB) • Filtered by the selected words (2,308 words) • Implemented in Java, run on a local computer
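The extraction step on this slide could look roughly like the sketch below. It is not the author's program: the class name `UsageExtractor` and the "morpheme/TAG+morpheme/TAG" token layout are assumptions made for illustration, and the real Sejong corpus files may be laid out differently.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Sketch only: the "morpheme/TAG+morpheme/TAG" layout is an assumed
// simplification of a POS-tagged line, not the verified corpus format.
class UsageExtractor {

    /** Returns the distinct morphemes (tags stripped) of one tagged sentence. */
    static Set<String> morphemes(String taggedSentence) {
        Set<String> result = new LinkedHashSet<>();
        for (String token : taggedSentence.trim().split("\\s+")) {
            for (String unit : token.split("\\+")) {
                int slash = unit.lastIndexOf('/');
                if (slash > 0) {
                    result.add(unit.substring(0, slash));
                }
            }
        }
        return result;
    }

    /** Groups sentences under every selected word that occurs in them. */
    static Map<String, List<String>> extract(List<String> sentences,
                                             Set<String> selectedWords) {
        Map<String, List<String>> usages = new LinkedHashMap<>();
        for (String sentence : sentences) {
            for (String morpheme : morphemes(sentence)) {
                if (selectedWords.contains(morpheme)) {
                    usages.computeIfAbsent(morpheme, k -> new ArrayList<>())
                          .add(sentence);
                }
            }
        }
        return usages;
    }
}
```

Looking up each morpheme in the set of selected words, rather than scanning all 2,308 words for every sentence, keeps the work per sentence proportional to its length.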

  7. Works: Extracting Usages • Problems • Consider the cases of 'predicate nominal' + '되' and 'predicate nominal' + '시키' • Can we handle 500 usages per word? • Must filter out trivial usages or set a limit
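One possible way to approach the two problems on this slide is sketched below: recognise a predicate nominal that is directly followed by a light verb, and cap the number of usages kept per word. The class `PredicateNominalMatcher`, the tag names `NNG` and `XSV`, and the token layout are assumptions for illustration. Truncating to the first N usages is only the simplest stand-in for "set a limit", not a method taken from the slides.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch only: NNG (common noun) and XSV (verb-deriving suffix) and the
// "morpheme/TAG+..." layout are assumptions about the tagged input.
class PredicateNominalMatcher {

    static final List<String> LIGHT_VERBS = Arrays.asList("하", "되", "시키");

    /**
     * Returns the derived predicate (for example "통과+되") if the token holds
     * the given nominal directly followed by a light verb, otherwise null.
     */
    static String derivedPredicate(String token, String nominal) {
        String[] units = token.split("\\+");
        for (int i = 0; i + 1 < units.length; i++) {
            if (units[i].equals(nominal + "/NNG")) {
                for (String verb : LIGHT_VERBS) {
                    if (units[i + 1].equals(verb + "/XSV")) {
                        return nominal + "+" + verb;
                    }
                }
            }
        }
        return null;
    }

    /** Keeps at most maxUsages sentences, preserving corpus order. */
    static List<String> cap(List<String> usages, int maxUsages) {
        int end = Math.min(maxUsages, usages.size());
        return new ArrayList<>(usages.subList(0, end));
    }
}
```

Keying usages by the derived predicate rather than by the bare noun would keep '통과+되' and '통과+시키' apart, which matters because their case patterns differ.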

  8. Works: Doing the construction • Model 1 (experimented) • Manual construction • Using a text editor + spreadsheet + CoreNet Browser • Time: 25 usages in 30 min. • Extraordinarily time-consuming • Model 2 (assumption) • Manual construction with tools • Database + tool + CoreNet lib. • Time (assumed): 180 usages in 30 min. • Suppose 100 usages per word: • Model 3 • Automatic case frame extraction • Must survey the literature or get help from someone
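To put the two throughput figures side by side, the sketch below multiplies out the numbers reported on the slides: 2,308 selected words, the supposed 100 usages per word, and 25 or 180 usages per 30 minutes. The class and method names are hypothetical, and applying the rate uniformly to all words is an assumption. Under these figures, Model 1 comes to about 4,616 hours and Model 2 to about 641 hours.

```java
// Sketch only: multiplies figures reported on the slides; the value of
// 100 usages per word is the slide's own supposition.
class ConstructionTimeEstimate {

    /** Hours needed to process every usage at a given rate per 30 minutes. */
    static double hours(int words, int usagesPerWord, int usagesPerHalfHour) {
        long totalUsages = (long) words * usagesPerWord;
        return totalUsages / (double) usagesPerHalfHour * 0.5;
    }

    public static void main(String[] args) {
        int words = 2308;        // selected words (slide 5)
        int usagesPerWord = 100; // supposition from slide 8
        System.out.println("Model 1 hours: " + hours(words, usagesPerWord, 25));
        System.out.println("Model 2 hours: " + hours(words, usagesPerWord, 180));
    }
}
```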

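  9. Works: Instructions for case frame construction • Writing the instructions on the web • http://sysx2.kaist.ac.kr/wiki/index.php/격틀구축지침 • Issues: • Modifying clauses: 개혁을 강조한 사람 ("a person who emphasized reform") • Syntactic differences among '하다', '되다', and '시키다'; e.g., the case of '통과하다' ("to pass") • 기업(NOM)이 심사(ACC)를 통과하다 ("the company passes the screening") • 기업(NOM)이 심사(DAT)에 통과되다 ("the company is passed in the screening") • 기업(ACC)을 심사(DAT)에 통과시키다 ("[someone] gets the company through the screening")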
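The three '통과' examples can be written down as data to make the differing case patterns explicit. The sketch below is illustrative only: the class `CaseFrame`, its fields, and the `pattern()` helper are hypothetical and do not reflect CoreNet's actual schema or the wiki guideline.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch only: class and field names are hypothetical, not CoreNet's schema.
class CaseFrame {

    /** One argument slot: its case label and the example noun from the slide. */
    static final class Argument {
        final String caseLabel; // NOM, ACC or DAT, as labelled on the slide
        final String noun;

        Argument(String caseLabel, String noun) {
            this.caseLabel = caseLabel;
            this.noun = noun;
        }
    }

    final String predicate;
    final List<Argument> arguments;

    CaseFrame(String predicate, Argument... arguments) {
        this.predicate = predicate;
        this.arguments = Arrays.asList(arguments);
    }

    /** Case pattern such as "NOM-ACC", handy for comparing frames. */
    String pattern() {
        List<String> labels = new ArrayList<>();
        for (Argument argument : arguments) {
            labels.add(argument.caseLabel);
        }
        return String.join("-", labels);
    }

    /** The three example frames from the slide. */
    static List<CaseFrame> slideExamples() {
        return Arrays.asList(
            new CaseFrame("통과하다",
                new Argument("NOM", "기업"), new Argument("ACC", "심사")),
            new CaseFrame("통과되다",
                new Argument("NOM", "기업"), new Argument("DAT", "심사")),
            new CaseFrame("통과시키다",
                new Argument("ACC", "기업"), new Argument("DAT", "심사")));
    }
}
```

Comparing `pattern()` across the three frames gives NOM-ACC, NOM-DAT, and ACC-DAT, which is the syntactic difference the slide raises as an issue.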

  10. Plan • Finish organizing the database for usages • Build a construction tool using the database and the CoreNet library
