
Tree Kernel-based Semantic Relation Extraction using Unified Dynamic Relation Tree


Presentation Transcript


  1. Tree Kernel-based Semantic Relation Extraction using Unified Dynamic Relation Tree Reporter: Longhua Qian School of Computer Science and Technology Soochow University, Suzhou, China 2008.07.23 ALPIT2008, Dalian, China

  2. Outline • 1. Introduction • 2. Dynamic Relation Tree • 3. Unified Dynamic Relation Tree • 4. Experimental results • 5. Conclusion and Future Work

  3. 1. Introduction • Information extraction is an important research topic in NLP. • It attempts to find relevant information in the large amounts of text documents available in digital archives and on the WWW. • Information extraction tasks defined by NIST ACE • Entity Detection and Tracking (EDT) • Relation Detection and Characterization (RDC) • Event Detection and Characterization (EDC)

  4. RDC • Function • RDC detects and classifies semantic relationships (usually of predefined types) between pairs of entities. Relation extraction is very useful for a wide range of advanced NLP applications, such as question answering and text summarization. • E.g. • The sentence “Microsoft Corp. is based in Redmond, WA” conveys the relation “GPE-AFF.Based” between “Microsoft Corp” (ORG) and “Redmond” (GPE).

  5. Two approaches • Feature-based methods • have dominated the research in relation extraction over the past years. However, relevant research shows that it is difficult to extract new effective features and further improve the performance. • Kernel-based methods • compute the similarity of two objects (e.g. parse trees) directly. The key problem is how to represent and capture the structured information in complex structures, such as the syntactic information in the parse tree, for relation extraction.
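The convolution parse tree kernel the later slides build on (Collins and Duffy, 2001) measures similarity by counting the tree fragments two parses share. A minimal sketch, using nested tuples as a simplified tree encoding of our own (not the toolkit's actual format):

```python
# Collins-Duffy convolution tree kernel, toy version.
# A tree is (label, child1, child2, ...); a leaf is a bare word string.

def production(node):
    """The grammar production at a node: (label, tuple of child labels)."""
    return (node[0], tuple(c if isinstance(c, str) else c[0] for c in node[1:]))

def c_delta(n1, n2, lam):
    """Decayed count of common subtree fragments rooted at n1 and n2."""
    if production(n1) != production(n2):
        return 0.0
    if all(isinstance(c, str) for c in n1[1:]):   # pre-terminal node
        return lam
    score = lam
    for c1, c2 in zip(n1[1:], n2[1:]):
        score *= 1.0 + c_delta(c1, c2, lam)
    return score

def nodes(tree):
    """All internal nodes of a tree, root included."""
    out = [tree]
    for c in tree[1:]:
        if not isinstance(c, str):
            out.extend(nodes(c))
    return out

def tree_kernel(t1, t2, lam=0.4):
    """K(T1, T2) = sum of common-fragment counts over all node pairs."""
    return sum(c_delta(n1, n2, lam) for n1 in nodes(t1) for n2 in nodes(t2))

t = ("NP", ("DT", "the"), ("NN", "firm"))
print(tree_kernel(t, t, lam=1.0))  # -> 6.0
```

The decay factor λ (set to 0.4 in the experiments on slide 16) down-weights larger fragments so the kernel is not dominated by big common subtrees.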

  6. Kernel-based related work • Zelenko et al. (2003), Culotta and Sorensen (2004), Bunescu and Mooney (2005) described several kernels between shallow parse trees or dependency trees to extract semantic relations. • Zhang et al. (2006), Zhou et al. (2007) proposed composite kernels consisting of an linear kernel and a convolution parse tree kernel, and the latter can effectively capture structured syntactic information inherent in parse trees.

  7. Structured syntactic information • A tree span for relation instance • a part of a parse tree used to represent the structured syntactic information for relation extraction. • Two currently used tree spans • PT(Path-enclosed Tree): the sub-tree enclosed by the shortest path linking the two entities in the parse tree • CSPT(Context-Sensitive Path-enclosed Tree): Dynamically determined by further extending the necessary predicate-linked path information outside PT.
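The Path-enclosed Tree can be sketched as keeping only the parse material covering the tokens between (and including) the two entity mentions. The nested-tuple encoding and the `prune_to_span` helper below are our own illustration, not the paper's implementation:

```python
# Toy Path-enclosed Tree extraction: drop every subtree lying wholly
# outside the token span bounded by the two entity mentions.

def leaves(tree):
    """The words covered by a (sub)tree, left to right."""
    if isinstance(tree, str):
        return [tree]
    out = []
    for c in tree[1:]:
        out.extend(leaves(c))
    return out

def prune_to_span(tree, start, end, pos=0):
    """Keep only material overlapping token span [start, end)."""
    if isinstance(tree, str):
        return tree if start <= pos < end else None
    kept, cur = [], pos
    for child in tree[1:]:
        width = len(leaves(child))
        if cur < end and cur + width > start:      # child overlaps the span
            sub = prune_to_span(child, start, end, cur)
            if sub is not None:
                kept.append(sub)
        cur += width
    return (tree[0], *kept) if kept else None

# "Microsoft Corp. is based in Redmond ." -- the two entities cover
# tokens 0-1 and 5, so the PT spans tokens [0, 6).
parse = ("S",
         ("NP", ("NNP", "Microsoft"), ("NNP", "Corp.")),
         ("VP", ("VBZ", "is"),
                ("VP", ("VBN", "based"),
                       ("PP", ("IN", "in"), ("NP", ("NNP", "Redmond"))))),
         (".", "."))
pt = prune_to_span(parse, 0, 6)
print(leaves(pt))  # -> ['Microsoft', 'Corp.', 'is', 'based', 'in', 'Redmond']
```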

  8. Current problems • Noisy information • Both PT and CSPT may still contain noisy information; in other words, more noise should be pruned away from a tree span. • Useful information • CSPT captures only the part of the context-sensitive information that relates to the predicate-linked path. That is to say, more information outside PT/CSPT may be recovered so as to better discern the relationships.

  9. Our solution • Dynamic Relation Tree (DRT) • Based on PT, we apply a variety of linguistics-driven rules to dynamically prune out noisy information from a syntactic parse tree and include necessary contextual information. • Unified Dynamic Relation Tree (UDRT) • Instead of constructing composite kernels, various kinds of entity-related semantic information, including entity types/sub-types/mention levels etc., are unified into a Dynamic Relation Tree.

  10. 2. Dynamic Relation Tree • Generation of DRT • Starting from PT, we further apply three kinds of operations (i.e. Remove, Compress, and Expansion) sequentially to reshape PT, finally giving rise to a Dynamic Relation Tree. • Remove operation • DEL_ENT2_PRE: Removing all the constituents (except the headword) of the 2nd entity • DEL_PATH_ADVP/PP: Removing adverb or preposition phrases along the path

  11. DRT(cont’) • Compress operation • CMP_NP_CC_NP: Compressing noun phrase coordination conjunction • CMP_VP_CC_VP: Compressing verb phrase coordination conjunction • CMP_SINGLE_INOUT: Compressing single in-and-out nodes • Expansion operation • EXP_ENT2_POS: Expanding the possessive structure after the 2nd entity • EXP_ENT2_COREF: Expanding entity coreferential mention before the 2nd entity
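As a concrete illustration of one rule above, CMP_SINGLE_INOUT can be read as collapsing unary chains (nodes with a single way in and a single way out), which carry no branching information; this reading of the rule, and the toy encoding, are our assumptions:

```python
# Toy CMP_SINGLE_INOUT: collapse chains of single-child internal nodes
# down to their lowest node. Trees are nested tuples (label, children...).

def compress_unary(tree):
    if isinstance(tree, str):
        return tree
    # Skip over unary internal nodes (exactly one non-leaf child).
    while len(tree) == 2 and not isinstance(tree[1], str):
        tree = tree[1]
    return (tree[0], *(compress_unary(c) for c in tree[1:]))

t = ("NP", ("NP", ("NP", ("NNP", "Redmond"))))
print(compress_unary(t))  # -> ('NNP', 'Redmond')
```

Branching nodes are left untouched, so only redundant single in-and-out chains disappear from the tree span.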

  12. Some examples of DRT

  13. 3. Unified Dynamic Relation Tree • T1: DRT • T2: UDRT-Bottom • T3: UDRT-Entity • T4: UDRT-Top

  14. Four UDRT setups • T1: DRT: no entity-related information except the entity order (i.e. “E1” and “E2”) • T2: UDRT-Bottom: the DRT with entity-related information attached at the bottom of the two entity nodes • T3: UDRT-Entity: the DRT with entity-related information attached in the entity nodes • T4: UDRT-Top: the DRT with entity-related information attached at the top node of the tree
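The three attachment positions can be sketched on the toy nested-tuple trees; the `FEAT` node label and hyphen-joined feature encoding below are our own illustration, not the paper's exact scheme:

```python
# Unifying entity-related features (types, subtypes, mention levels, ...)
# into the tree at three different positions.

def udrt_bottom(entity, feats):
    """T2: a feature node attached beneath the entity node."""
    return (*entity, ("FEAT", "-".join(feats)))

def udrt_entity(entity, feats):
    """T3: features folded into the entity node's own label."""
    return (entity[0] + "-" + "-".join(feats), *entity[1:])

def udrt_top(root, feats):
    """T4: a feature node for both entities attached at the root."""
    return (*root, ("FEAT", "-".join(feats)))

e1 = ("E1", ("NNP", "Microsoft"))
print(udrt_entity(e1, ["ORG", "NAM"]))  # -> ('E1-ORG-NAM', ('NNP', 'Microsoft'))
```

Because the tree kernel matches node labels and productions exactly, each position changes which fragments two instances can share, which is why the three setups behave differently on slide 18.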

  15. 4. Experimental results • Corpus Statistics • The ACE RDC 2004 data contains 451 documents and 5702 relation instances. It defines 7 major entity types, 7 major relation types and 23 relation subtypes. • Evaluation is done on 347 (nwire/bnews) documents and 4307 relation instances using 5-fold cross-validation. • Corpus processing • Parsed using Charniak’s parser (Charniak, 2001) • Relation instances are generated by iterating over all pairs of entity mentions occurring in the same sentence.
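The candidate-generation step above can be sketched directly: every unordered pair of entity mentions in the same sentence yields one relation instance (the mention structure below is a simplified assumption):

```python
# Generate one candidate relation instance per pair of entity mentions
# co-occurring in a sentence.
from itertools import combinations

sentences = [
    {"text": "Microsoft Corp. is based in Redmond, WA",
     "mentions": [("Microsoft Corp.", "ORG"), ("Redmond", "GPE"), ("WA", "GPE")]},
]

instances = [
    (sent["text"], m1, m2)
    for sent in sentences
    for m1, m2 in combinations(sent["mentions"], 2)
]
print(len(instances))  # -> 3 (three pairs from three mentions)
```

Most such pairs hold no relation, which is why the classifier must also learn a "no relation" decision; the one vs. others setup on the next slide handles this naturally.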

  16. Classifier • Tools • SVMLight (Joachims 1998) • Tree Kernel Toolkits (Moschitti 2004) • The training parameters C (SVM) and λ (tree kernel) are set to 2.4 and 0.4 respectively. • One vs. others strategy • builds K basic binary classifiers so as to separate one class from all the others.
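The one vs. others strategy is classifier-agnostic: train one binary classifier per class and pick the class whose classifier scores highest. A generic sketch, with a trivial centroid scorer standing in for the SVM purely for illustration:

```python
# One vs. others (one-vs-rest) multi-class wrapper around any binary learner.

def train_one_vs_rest(examples, labels, fit_binary):
    """fit_binary(examples, +1/-1 labels) must return a scoring function."""
    classifiers = {}
    for cls in set(labels):
        binary = [1 if y == cls else -1 for y in labels]
        classifiers[cls] = fit_binary(examples, binary)
    return classifiers

def predict(classifiers, x):
    """Assign the class whose binary classifier scores x highest."""
    return max(classifiers, key=lambda cls: classifiers[cls](x))

def fit_centroid(xs, ys):
    """Stand-in binary learner: score by closeness to the positive centroid."""
    pos = [x for x, y in zip(xs, ys) if y == 1]
    center = sum(pos) / len(pos)
    return lambda x, c=center: -abs(x - c)

clf = train_one_vs_rest([1.0, 1.1, 5.0, 5.2], ["A", "A", "B", "B"], fit_centroid)
print(predict(clf, 4.8))  # prints "B"
```

In the paper's setup, `fit_binary` would be an SVMLight training run with the tree kernel plugged in; the wrapper logic is unchanged.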

  17. Contribution of various operation rules • Each operation rule is incrementally applied on top of the previously derived tree span. • The plus sign preceding a specific rule indicates that the rule is useful and is automatically added in the next round. • Otherwise, the rule is discarded.

  18. Comparison of different UDRT setups • Compared with DRT, the Unified Dynamic Relation Trees (UDRTs) with only entity type information significantly improve the F-measure by 10 units on average, due to increases in both precision and recall. • Among the three UDRTs, UDRT-Top achieves slightly better performance than the other two.

  19. Improvements of different tree setups over PT • The Dynamic Relation Tree (DRT) performs better than the CSPT/PT setups. • The Unified Dynamic Relation Tree with entity-related semantic features attached at the top node of the parse tree performs best.

  20. Comparison with best-reported systems • The comparison shows that our UDRT-Top performs best among tree setups using a single kernel, and even outperforms the two previously reported composite kernels.

  21. 5. Conclusion • Dynamic Relation Tree (DRT), which is generated by applying various linguistics-driven rules, can significantly improve the performance over currently used tree spans for relation extraction. • Integrating entity-related semantic information into DRT can further improve the performance, esp. when they are attached at the top node of the tree.

  22. Future Work • We will focus on semantic matching in computing the similarity between two parse trees, where the semantic similarity between content words (such as “hire” and “employ”) would be considered to achieve better generalization.

  23. References • Bunescu R. C. and Mooney R. J. 2005. A Shortest Path Dependency Kernel for Relation Extraction. EMNLP-2005 • Charniak E. 2001. Immediate-head Parsing for Language Models. ACL-2001 • Collins M. and Duffy N. 2001. Convolution Kernels for Natural Language. NIPS-2001 • Collins M. and Duffy N. 2002. New Ranking Algorithms for Parsing and Tagging: Kernels over Discrete Structures, and the Voted Perceptron. ACL-2002 • Culotta A. and Sorensen J. 2004. Dependency Tree Kernels for Relation Extraction. ACL-2004 • Joachims T. 1998. Text Categorization with Support Vector Machines: Learning with Many Relevant Features. ECML-1998 • Moschitti A. 2004. A Study on Convolution Kernels for Shallow Semantic Parsing. ACL-2004 • Zelenko D., Aone C. and Richardella A. 2003. Kernel Methods for Relation Extraction. Journal of Machine Learning Research, 3: 1083-1106 • Zhang M., Zhang J., Su J. and Zhou G.D. 2006. A Composite Kernel to Extract Relations between Entities with both Flat and Structured Features. COLING-ACL-2006 • Zhao S.B. and Grishman R. 2005. Extracting Relations with Integrated Information Using Kernel Methods. ACL-2005 • Zhou G.D., Su J., Zhang J. and Zhang M. 2005. Exploring Various Knowledge in Relation Extraction. ACL-2005

  24. End Thank You!
