
Spanish verbs and verb-noun collocations paraphrase pairs

CBA 2010: Corpus-Based Approaches to Paraphrasing and Nominalization, Barcelona, 1-2 December 2010. Spanish verbs and verb-noun collocations paraphrase pairs. María A. Barrios, auxiba@filol.ucm.es; Luz Rello, luzrello@gmail.com.

Presentation Transcript


  1. CBA 2010: Corpus-Based Approaches to Paraphrasing and Nominalization, Barcelona, 1-2 December 2010 • Spanish verbs and verb-noun collocations paraphrase pairs • María A. Barrios, auxiba@filol.ucm.es • Luz Rello, luzrello@gmail.com

  2. Outline • Objectives • Introduction: Meaning-Text Theory • BADELE.3000, a linguistic resource • Paraphrase Rule 18 from Meaning-Text Theory • Paraphrase pairs of verb and verb-noun collocations • Verb-noun collocations that have no verbal counterpart • Verbs that have no verb-noun collocation counterpart • False paraphrases • Conclusions

  3. 1. Objectives (I) • To present a linguistic resource, BADELE.3000, useful for NLP applications • To describe paraphrase pairs in which a verb can be paraphrased with a support verb + noun collocation, such as ducharse (to shower) and darse una ducha (to have a shower) • To describe cases of false paraphrase pairs: words with a morphological but not a semantic relationship, such as expedir (to issue) and hacer una expedición (to explore)

  4. 1. Objectives (II) • To describe cases in which a verb + noun collocation has no verbal counterpart, such as tener un problema (to have a problem), for which there is no verb *problemear • To describe cases in which a verb has no verb + noun collocation counterpart, such as sincerarse (to open up to someone), for which there is no collocation *hacer una sinceridad

  6. 2. Introduction (I): MTT • 1. A lexical function (LF) F associates a given lexical expression L (such as sound), which is the argument or keyword of F, with a set of lexical expressions, the value of F (such as loud, strong, heavy, deafening, etc.), that express a specific meaning associated with F (for instance, ‘intense’ = Magn). • sound = argument or keyword of the LF • loud, strong, heavy, deafening, etc. = values of the LF • ‘intense’ = specific meaning associated with Magn • Magn(sound) = loud, strong, heavy, deafening
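
A lexical function can be pictured as a lookup from a pair (LF name, keyword) to a list of values. The snippet below is only a minimal sketch of that idea in Python; the names `LEXICON` and `apply_lf` are hypothetical and do not describe how BADELE.3000 or any MTT tool is actually implemented.

```python
# Toy lexicon: (lexical function name, keyword) -> list of values.
# The single entry is the Magn(sound) example from the slide.
LEXICON = {
    ("Magn", "sound"): ["loud", "strong", "heavy", "deafening"],
}


def apply_lf(name, keyword, lexicon=LEXICON):
    """Return the values of lexical function `name` for `keyword`.

    An empty list means the (hypothetical) lexicon has no entry.
    """
    return list(lexicon.get((name, keyword), []))


print(apply_lf("Magn", "sound"))
```

Keying the table on both the LF name and the keyword reflects the point of the slide: the meaning ‘intense’ is constant, while its lexical expression depends on the keyword.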

  7. 2. Introduction (II): MTT • More than 100 different universal LFs • Oper1(ducha) = darse, Oper1(shower) = to have, Oper1(douche) = prendre, Oper1(doccia) = fare • Si (the name of the i-th participant in a situation): S1(school) = student; S2(school) = teacher • V0(reception) = to receive; S0(to receive) = reception

  8. 2. Introduction (III): MTT • 2. The semantic label is the equivalent of the genus in traditional definitions by genus and differentia. • Whale: ‘sea mammal that breathes air through a hole at the top of its head and is hunted for meat and for other purposes, as a source of other materials’ • Hierarchy of semantic labels • Living being > Animal > Vertebrate > Mammal > Sea Mammal
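
The label hierarchy above can be sketched as a child-to-parent map that is walked upwards from a word's own semantic label. This is an illustrative assumption about representation only; `PARENT`, `SEMANTIC_LABEL` and `label_chain` are hypothetical names.

```python
# Child label -> parent label, following the hierarchy on the slide.
PARENT = {
    "sea mammal": "mammal",
    "mammal": "vertebrate",
    "vertebrate": "animal",
    "animal": "living being",
}

# Word -> its most specific semantic label (the genus of its definition).
SEMANTIC_LABEL = {"whale": "sea mammal"}


def label_chain(word):
    """Return the labels of `word` from most specific to most general."""
    chain = []
    label = SEMANTIC_LABEL.get(word)
    while label is not None:
        chain.append(label)
        label = PARENT.get(label)  # None once the top label is reached
    return chain


print(label_chain("whale"))
```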

  9. 2. Introduction (IV): MTT • 3. Actants correspond to beings or things that participate in the process expressed by a predicate. The MTT approach considers that all kinds of predicative words have an argument structure, which means that not only verbs have actants but also adjectives, adverbs and predicative nouns. • The actantial structure reflects the syntactic expression of the actants: • River [WHICH STARTS AT THE X place, FLOWS THROUGH THE Z places AND FINISHES AT THE Y area]

  10. 2. Introduction (V): MTT • LFs have proved to be an especially helpful tool for lexicographic work and NLP systems, such as the French database Dicouèbe [1] (developed in Montreal by Polguère and Mel’čuk), the Spanish database DiCE [2] (developed in La Coruña by Alonso Ramos), the automatic translator ETAP3 [3] (developed in Moscow by Apresjan, Boguslavsky et al.) and multilingual generation and paraphrasing systems [4] (developed in Barcelona by Leo Wanner). • [1] http://olst.ling.umontreal.ca/dicouebe/ • [2] http://www.dicesp.com/ • [3] http://cl.iitp.ru/etap • [4] http://www.barcelonamedia.org/files/292.pdf

  12. 3. BADELE.3000, a linguistic resource (I) • BADELE.3000 (Barrios & Bernardos, 2007) is a database that contains the 3,300 most frequently used Spanish nouns and the 3,300 most frequently used Spanish verbs • 20,700 relations were formalized by means of LFs • BADELE.3000 is useful for natural language processing applications and ontologies (Barrios, Aguado de Cea and Ramos, 2009a; Barrios, Aguado de Cea and Ramos, 2009b; Barrios and Vilches, 2010) • 9,000 lexical relations were obtained automatically by means of semantic labels and LFs • 21,700 lexical relations were added manually (Bosque, 2004, 2006)

  13. 3. BADELE.3000, a linguistic resource (II) • Inheritance Principle: lexical units that share a semantic label can inherit the LF values of that label automatically • Fact0(‘means of transport’) = to work (‘to do what is supposed to be done’) • ‘means of transport’ = bus, ship, train, motorbike, plane • Fact0(bus) = to work, to run, to operate • Fact0(ship) = to work, to sail, to navigate • Fact0(train) = to work, to run, to operate • Fact0(motorbike) = to work, to run • Fact0(plane) = to work, to fly, to glide, to fly over
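
The Inheritance Principle can be illustrated with a short Python sketch: values stored once on the semantic label are merged with the values stored on each individual noun. The data layout and the names `LABEL_OF`, `LABEL_LFS`, `WORD_LFS` and `lf_values` are assumptions made for this illustration, not the actual schema of BADELE.3000.

```python
# Every noun below carries the semantic label 'means of transport'.
LABEL_OF = {
    word: "means of transport"
    for word in ["bus", "ship", "train", "motorbike", "plane"]
}

# LF values attached to the label itself (inherited by all its members).
LABEL_LFS = {("Fact0", "means of transport"): ["to work"]}

# LF values that are specific to individual nouns.
WORD_LFS = {
    ("Fact0", "bus"): ["to run", "to operate"],
    ("Fact0", "ship"): ["to sail", "to navigate"],
    ("Fact0", "train"): ["to run", "to operate"],
    ("Fact0", "motorbike"): ["to run"],
    ("Fact0", "plane"): ["to fly", "to glide", "to fly over"],
}


def lf_values(name, word):
    """Inherited values first, then the word's own values, without duplicates."""
    inherited = LABEL_LFS.get((name, LABEL_OF.get(word)), [])
    own = WORD_LFS.get((name, word), [])
    return inherited + [value for value in own if value not in inherited]


print(lf_values("Fact0", "ship"))
```

Storing "to work" once on the label rather than on each noun is what lets a newly added noun with the same label receive that value without manual encoding.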

  14. BADELE.3000 (I): 3,300 most frequently used Spanish nouns • Semantic label: ‘means of transport’ • El barco funciona = the ship works • the bus works, the bus runs, the bus operates • the ship sails, the ship navigates • the plane works, the plane flies, the plane glides, the plane flies over

  15. 3. BADELE.3000, a linguistic resource (III) CausFunc0 means ‘to cause something to exist’: • CausFunc0(ropa) = confeccionar > una camiseta/ pantalones, etc. • CausFunc0(clothes) = to make > to make a T-shirt/ trousers/ skirt • CausFunc0(obra artística) = componer > poema/ libro/ argumento • CausFunc0(artistic work) = to compose > a poem/ book/ plot/ etc. • CausFunc0(vivienda) = construir > una casa/ rascacielos/ apartamento, etc. • CausFunc0(accommodation) = to build > to build a house/ skyscraper/ apartment/ etc. • CausFunc0(energía) = producir > luz/ gas/ petróleo, etc. • CausFunc0(energy) = to produce > to produce light/ gas/ petrol

  17. 4. Paraphrase rule 18 (I) • Paraphrase rule 18 is called “Fissions à verbe support” (Mel’čuk et al., 1998) and can be stated as follows: “Given a verb (V0), such as to receive, paradigmatically related to a noun (S0), such as reception, if the noun appears in a collocation together with a support verb Oper, such as to give a reception, both verbal expressions (to receive, to give a reception) are interchangeable.” • to receive ⇔ Oper1(reception) = to give a reception • Support verbs are values of the LFs Oper, Func and Labor (Mel’čuk, 1996, 68), such as to deal a blow, to receive a blow from, blow comes from, blow falls upon

  19. 5. Paraphrase pairs (I) There are more than 700 nouns in BADELE.3000 whose meaning is equivalent to the meaning of a verb and which are related to that verb by the LFs V0 and S0. We listed these nouns as potential nouns in verb-noun collocations: • V0(blow) = to beat; S0(to beat) = blow • V0(resistance) = to resist; S0(to resist) = resistance • V0(order) = to order; S0(to order) = order

  20. 5. Paraphrase pairs (II) Then we found the support verb collocations of these nouns: • to deal a blow • to put up resistance • to give an order And then we attached these support verb-noun collocations to the equivalent verb: • to bang/ beat/ hit … to deal a blow • to resist … to put up resistance • to order … to give an order
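
The three steps on slides 19 and 20 (list the nouns linked to a verb by S0, look up their support verb, attach the collocation to the verb) can be sketched as follows. This is an illustration under stated assumptions: the dictionaries `S0`, `OPER1` and `ARTICLE` and the function `paraphrase_pairs` are hypothetical, and the real extraction from BADELE.3000 may differ.

```python
# Step 1: verbs paired with their nominal counterpart, S0(verb) = noun.
S0 = {"to beat": "blow", "to resist": "resistance", "to order": "order"}

# Step 2: support verb of each noun, Oper1(noun) = verb.
OPER1 = {"blow": "to deal", "resistance": "to put up", "order": "to give"}

# Article used inside the collocation (empty string when there is none).
ARTICLE = {"blow": "a", "resistance": "", "order": "an"}


def paraphrase_pairs(s0, oper1, article):
    """Step 3: attach each support verb + noun collocation to its verb."""
    pairs = []
    for verb, noun in s0.items():
        support = oper1.get(noun)
        if support is None:
            continue  # the noun has no recorded support verb collocation
        words = [support, article.get(noun, ""), noun]
        pairs.append((verb, " ".join(word for word in words if word)))
    return pairs


for verb, collocation in paraphrase_pairs(S0, OPER1, ARTICLE):
    print(verb, "/", collocation)
```

Pairs produced this way are only candidates: nothing in the lookup checks that the verb and the collocation are semantically or syntactically interchangeable.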

  21. 5. Paraphrase pairs (III) 777 paraphrase pairs of verb and verb-noun collocation were found in BADELE.3000: • to select / to make a selection • to reject / to show rejection • to assist / to give assistance • to support / to give support • to research / to do research • to control / to subject to control • to spread / to make propaganda • to define / to formulate a definition • to remember / to have a memory

  23. 6. Verb-noun collocations that have no verbal counterpart (I) A missing verbal counterpart was found to be frequent for nouns that denote illnesses and feelings: • *gripear (*to flu) > tener gripe (to have flu) • *diabetear (*to diabetes) > tener diabetes (to have diabetes) • *soledear (*to lonely) > sentir soledad (to feel loneliness) • *felicidadear (*to happy) > sentir felicidad (to feel happiness) And also for nouns that denote physical and non-physical facts:

  25. 7. Verbs that have no verb-noun collocation counterpart • Previous work (Barrios, 2010) pointed out that abstract nouns tend to form collocations with support verbs. There are some exceptions to this generalization (Table 2).

  26. Table 2

  28. 8. False paraphrases There are cases where the verb and the noun occurring in the collocation have different meanings: • 1. Tener frío (to be cold) ≠ enfriarse (to get cold) • 2. Tener cansancio (to be tired) ≠ cansarse (to get tired) • 3. Hacer una expedición (to go on an expedition) ≠ expedir (to issue) • 4. Tener sueño (to be sleepy) ≠ soñar (to dream)

  29. 8. False paraphrases • 5. comprar (to buy) • Hacer una compra (*to make a purchase) • Hacer la compra (to do the shopping) • Ir de compras (to go shopping) • 6. responder (to answer) = dar una respuesta (to give an answer) • Respondió que no lo sabía (He answered that he did not know) • Dio una respuesta *que no lo sabía (He gave an answer *that he did not know)

  30. 8. False paraphrases • 7. preguntar (to ask) = hacer una pregunta (to ask a question) • Le preguntó dónde vivía • (He asked him where he lived) • Le hizo una pregunta *dónde vivía • (He asked him a question *where he lived) • Le hizo una pregunta: ¿dónde vives? • (He asked him a question: where do you live?)

  32. 9. Conclusions • We have presented BADELE.3000, a linguistic resource useful for natural language processing applications and ontologies. • We have described paraphrase pairs composed of verbs which can be paraphrased with support verb + noun collocations; cases of false paraphrase pairs (words with a morphological but not a semantic relationship); cases in which a verb + noun collocation has no verbal counterpart; and cases in which a verb has no verb + noun collocation counterpart.

  33. 9. Conclusions • We have argued that rule 18 is quite useful for obtaining collocations automatically, but that paraphrase pairs of verbs and verb-noun collocations are sometimes not semantically interchangeable and are frequently not syntactically interchangeable.

  34. 10. References • Barrios, M. A. 2010. El dominio de las funciones léxicas en el marco de la Teoría Sentido-Texto. ELIES, 30. http://elies.rediris.es/elies30/index30.html • Barrios, Aguado de Cea and Ramos, 2009a. Semantic labels and genus: improving specialized domain definitions. In M.-C. L’Homme and S. Szulman (eds.), Proceedings of the 8th International Conference on Terminology and Artificial Intelligence. ISSN 1613-0073. http://ftp.informatik.rwth-aachen.de/Publications/CEUR-WS/Vol-578/ • Barrios, Aguado de Cea and Ramos, 2009b. Enriching a lexicographic tool with domain definitions: Problems and solutions. In G. Sierra, M. Pozzi and J. M. Torres Moreno (eds.), Proceedings of the 1st International Workshop on Definition Extraction. LPN. Borovets, 14-20.

  35. 10. References • Barrios, M. A. & Bernardos, S. 2007. “BaDELE.3000: An implementation of the lexical inheritance principle”. In Gerdes et al. (eds.), Proceedings of the Fourth International Conference on Meaning-Text Theory. Observatoire de linguistique Sens-Texte (OLST), Montreal, 97-106. • Barrios and Vilches, 2010. It is possible to enrich ontologies with a specialized domain linguistic resource? Workshop Establishing and using ontologies as a basis for terminological and knowledge engineering. TKE Conference. Dublin, 2010. • Mel’čuk, I. 1996. “Lexical functions: A tool for the description of lexical relations in a lexicon”. In Wanner, L. (ed.), Lexical functions in lexicography and natural language processing. Amsterdam/Philadelphia: John Benjamins, 37-102. • Mel’čuk et al., 1998. Dictionnaire explicatif et combinatoire du français contemporain. Recherches lexico-sémantiques II. Montreal: Les Presses de l’Université de Montréal.

  36. CBA 2010: Corpus-Based Approaches to Paraphrasing and Nominalization, Barcelona, 1-2 December 2010 • Thank you!
