
Resources for paraphrase detection




  1. Resources for paraphrase detection
     Caroline Hagège (Caroline.Hagege@xrce.xerox.com)
     Caroline Brun (Caroline.Brun@xrce.xerox.com)

  2. Resource types
     • Derivational morphology
     • Deep syntax
     • Domain-specific resources

  3. 1. Derivational morphology
     • Use of the CELEX database (distributed by the LDC)
       • http://www.kun.nl/celex/index.html
     • Automatic extraction of verbs and their corresponding deverbal nouns
       • Suffixes: +OR, +ER, +ION
     • Manual revision of the extracted pairs to assign a type to the relation (predicate) that holds between them
     • Predicates relating nouns and verbs from the same morphological family (~1600 pairs)
     • Predicate types:
       • S0: the noun paraphrases the action expressed by the verb.
         e.g. S0(acceleration,accelerate)
       • S1H: the noun corresponds to the first actant of the action expressed by the verb and has a human:+ feature.
         e.g. S1H(writer,write)

  4. 1. Derivational morphology (cont'd)
     • Predicate types (continued):
       • S1NH: the noun corresponds to the first actant of the action expressed by the verb and has a human:~ (non-human) feature.
         e.g. S1NH(abbreviation,abbreviate)
       • S2: the noun corresponds to the second actant of the action expressed by the verb.
         e.g. S2(affirmation,affirm)
     • Automatic extraction of nouns and their corresponding adjectives
       • Suffix: +AN
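The extraction step described above can be illustrated with a minimal Python sketch. It is not the authors' procedure: the word lists are toy stand-ins for CELEX, the suffix stripping is deliberately naive, "ation" is tried in addition to the three suffixes listed on the slide (an assumption, needed to relate "affirmation" to "affirm"), and the names `candidate_verbs`, `extract_pairs` and `TYPED` are hypothetical. The predicate types are copied from the slide examples, reflecting the manual revision step rather than anything computed.

```python
# Toy stand-ins for the verb and noun lists; the real pairs come from CELEX.
VERBS = {"accelerate", "write", "abbreviate", "affirm"}
NOUNS = ["acceleration", "writer", "abbreviation", "affirmation", "nation"]

# "ion", "er", "or" are the suffixes named on the slide; "ation" is an assumption.
SUFFIXES = ("ation", "ion", "er", "or")


def candidate_verbs(noun):
    """Yield spellings of a verb that could underlie `noun` (naive suffix stripping)."""
    for suffix in SUFFIXES:
        if noun.endswith(suffix):
            stem = noun[: -len(suffix)]
            yield stem        # e.g. "affirmation" -> "affirm"
            yield stem + "e"  # e.g. "writer" -> "writ" -> "write"


def extract_pairs(nouns, verbs):
    """Return (noun, verb) pairs whose candidate verb is attested in `verbs`."""
    pairs = []
    for noun in nouns:
        for verb in candidate_verbs(noun):
            if verb in verbs:
                pairs.append((noun, verb))
                break
    return pairs


# Predicate types assigned by hand, as in the manual revision step.
TYPED = {
    ("acceleration", "accelerate"): "S0",
    ("writer", "write"): "S1H",
    ("abbreviation", "abbreviate"): "S1NH",
    ("affirmation", "affirm"): "S2",
}

for noun, verb in extract_pairs(NOUNS, VERBS):
    print(f"{TYPED[(noun, verb)]}({noun},{verb})")
```

Note that "nation" ends in a matching suffix but yields no attested verb, so it is dropped; false positives that do survive this kind of filter are what the manual revision is for.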

  5. 2. Deep syntax
     • Use of the Comlex lexicon (Grishman et al. 1994) to extract the logical subjects and objects of infinitives
       • Example 1: “He ordered Peter to go”
         SUBJ-N(order,he), OBJ-N(order,Peter), SUBJ-N(go,Peter)
       • Example 2: “He promised Peter to go”
         SUBJ-N(promise,he), OBJ-N(promise,Peter), SUBJ-N(go,he)
     • Active-passive transformation
     • Use of verb class alternations (Levin 1993)
       • Example 3: “Acetone burns easily”
         SUBJ-N(burn,VARIABLE), OBJ-N(burn,acetone)
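Examples 1 and 2 differ only in which argument of the main verb is taken as the logical subject of the infinitive. A minimal Python sketch of that choice follows; the `CONTROL` table and the function `normalize_infinitive` are hypothetical, and the table merely stands in for the information the slides take from Comlex (its actual encoding is not shown in the source).

```python
# Hypothetical control lexicon: which argument of the main verb is the
# logical subject of the embedded infinitive. Stands in for Comlex data.
CONTROL = {"order": "object", "promise": "subject"}


def normalize_infinitive(main_verb, subject, obj, infinitive):
    """Return normalized dependencies for '<subject> <main_verb> <obj> to <infinitive>'."""
    deps = [("SUBJ-N", main_verb, subject), ("OBJ-N", main_verb, obj)]
    # Object-control verbs pass their object down; subject-control verbs their subject.
    controller = obj if CONTROL[main_verb] == "object" else subject
    deps.append(("SUBJ-N", infinitive, controller))
    return deps


for verb in ("order", "promise"):
    deps = normalize_infinitive(verb, "he", "Peter", "go")
    print(", ".join(f"{rel}({head},{dep})" for rel, head, dep in deps))
```

With the two table entries above, the printed dependencies match those given for Examples 1 and 2.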

  6. About 120 rules exploiting the derivational morphology and deep syntactic resources are necessary for the general normalization grammar.

  7. 3. Domain-specific resources
     • Hand-made resources, directly encoded as XIP rules
     • Creation of specific relations between lexical items (about 30 relations)
       • SYNONYMY relation, e.g. odor-smell
       • HASN relation, e.g. evaporate-volatility
       • TURNTO relation, e.g. evaporate-vapor
       • ISAJ relation, e.g. burn-burnable
     • Elaboration of XIP rules exploiting these relations and the normalized syntactic analysis (about 150 rules)
       • If ( SUBJ-N(#1[lem:have],#2) & OBJ-N(#1,#3) & HASN(#4,#3) )
         PROPERTY(#2,#4)
       • This rule gives equivalent representations to “X has volatility” and “X evaporates”, “X has flammability” and “X burns”, etc.
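The effect of the PROPERTY rule above can be traced with a small Python sketch. This is not XIP and not the authors' implementation: dependencies are modelled as plain tuples of lemmas (so two distinct occurrences of "have" in one sentence would be conflated, unlike the node variable #1 in the rule), the function name `apply_property_rule` is hypothetical, and the HASN pair burn-flammability is inferred from the slide's "X has flammability" example rather than listed explicitly.

```python
# HASN(verb, noun) pairs. evaporate-volatility is from the slide;
# burn-flammability is inferred from its "X has flammability" example.
HASN = {("evaporate", "volatility"), ("burn", "flammability")}


def apply_property_rule(deps, hasn):
    """Derive PROPERTY(x, verb) from SUBJ-N(have, x), OBJ-N(have, noun), HASN(verb, noun)."""
    derived = set()
    for rel1, head1, subj in deps:
        if rel1 != "SUBJ-N" or head1 != "have":
            continue
        for rel2, head2, obj in deps:
            if rel2 != "OBJ-N" or head2 != "have":
                continue
            for verb, noun in hasn:
                if noun == obj:
                    derived.add(("PROPERTY", subj, verb))
    return derived


# Normalized analysis of "Acetone has volatility".
deps = {("SUBJ-N", "have", "acetone"), ("OBJ-N", "have", "volatility")}
print(apply_property_rule(deps, HASN))
```

For this input the only derived fact is PROPERTY(acetone, evaporate), which is the representation the slide says is shared with "X evaporates".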
