
Odia Group’s Progress Report (up to 30th April 2013)

Presented by Panchanan Mohanty, University of Hyderabad.



  1. Odia Group’s Progress Report (up to 30th April 2013). Presented by Panchanan Mohanty, University of Hyderabad

  2. Status of Synset Linking:
     • Total Linked Synsets: 36302
         • Total Linked Noun Synsets: 27216
         • Total Linked Adverb Synsets: 452
         • Total Linked Adjective Synsets: 5827
         • Total Linked Verb Synsets: 2807
     • Total Unique Words found = 53754

  3. Status of Synset Validation:
     • Total Validated Synsets: 35284
         • Total Noun Synsets validated: 27216
         • Total Adverb Synsets validated: 377
         • Total Adjective Synsets validated: 5273
         • Total Verb Synsets validated: 2418
     • Total Unique Words found = 53916
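The linking and validation counts can be cross-checked with a few lines of Python. This is only an illustrative sketch: the numbers are copied from the two slides above, while the dictionary and function names are hypothetical and not part of any Odia WordNet tool.

```python
# Counts copied from the linking and validation slides (up to 30 April 2013).
# The variable and function names are illustrative assumptions.
linked = {"noun": 27216, "adverb": 452, "adjective": 5827, "verb": 2807}
validated = {"noun": 27216, "adverb": 377, "adjective": 5273, "verb": 2418}

# The per-category counts should add up to the reported totals.
total_linked = sum(linked.values())
total_validated = sum(validated.values())


def validated_share(category):
    """Percentage of linked synsets in a category that have been validated."""
    return round(100 * validated[category] / linked[category], 2)


shares = {category: validated_share(category) for category in linked}

print(total_linked, total_validated)
for category, share in shares.items():
    print(f"{category}: {share}% validated")
```

The per-category sums match the reported totals of 36302 linked and 35284 validated synsets. All linked noun synsets are validated, so the remaining validation work lies in the adverb, adjective, and verb categories.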

  4. Sense Marking Status as on 29th April 2013:
     • Total files used = 159
         • Sambad newspaper corpus: 147 files
         • Articles from Odisha Sahitya Academy: 8 files
         • Other articles: 4 files
     • Total words = 197935
     • Words found or sense-marked words = 101715
     • Sense-marked words percentage = 51.39%
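The sense-marking percentage follows directly from the two word counts. The sketch below, with hypothetical variable names, recomputes it and the file total from the figures on this slide.

```python
# Figures copied from the sense-marking slide; names are illustrative only.
files_by_source = {
    "Sambad newspaper corpus": 147,
    "Odisha Sahitya Academy articles": 8,
    "Other articles": 4,
}
total_words = 197935
sense_marked_words = 101715

total_files = sum(files_by_source.values())

# Share of corpus words that received a sense mark, as a percentage.
sense_marked_percentage = round(100 * sense_marked_words / total_words, 2)

# Words still without a sense mark.
unmarked_words = total_words - sense_marked_words

print(total_files, sense_marked_percentage, unmarked_words)
```

The recomputed values agree with the slide: 159 files and 51.39% of words sense-marked, which leaves 96220 words unmarked.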

  5. Sense-Marking Status… Added Synsets during Sense-Marking:
     • Total number of added words: 1361
     • Noun = 804, Verb = 40, Adjective = 394, Adverb = 123
     • Examples:
         • Noun: କଣ୍ଠଶିଳ୍ପୀ-4796, ଜାକଜମକ-9208
         • Verb: ଖୋଳି_ଦେବା-4699, ପାସ_କରିବା-4620
         • Adjective: ୫୯ଟି-8564, ୬୫ଜଣ-9488
         • Adverb: ସମ୍ପୂର୍ଣ୍ଣଭାବେ-6468, ଏକାଠି-28602

  6. In ID no. 4796, the Odia word କଣ୍ଠଶିଳ୍ପୀ (kaNThasiLpi:) is an appropriate synonym of ଗାୟକ (ga:yaka), the Odia equivalent of Hindi गायक (ga:yak). In ID no. 9208, the Odia word ଜାକଜମକ (ja:kajamaka) has likewise been found to be an appropriate synonym of ଧୂମଧାମ (dhu:mdha:m), the Odia equivalent of Hindi धूम-धाम (dhu:mdha:m). The new words are marked in red.

  7. In ID no. 4620, the Odia word ପାସ_କରିବା (pa:skariba:) has been found to be an appropriate synonym of ଉତ୍ତୀର୍ଣ୍ଣ_ହେବା (utti:rNNaheba:), the Odia equivalent of Hindi उत्तीर्ण होना (utti:rNNahona:). The new words are marked in red.

  8. In ID no. 8564, the Odia words ୫୯ଟି (aNasaThiTi) and ୫୯ଜଣ (aNasaThijaNa) are appropriate synonyms of ଅଣଷଠି (aNasaThi), the Odia equivalent of Hindi उनसठ (unsaTh). The word ୫୯ଟି (aNasaThiTi) is formed by adding ଟି (Ti) to the adjective ୫୯ (aNasaThi) and is used for things and animals, whereas the word ୫୯ଜଣ (aNasaThijaNa) is formed by adding ଜଣ (jaNa) to the adjective ୫୯ (aNasaThi) and is used for human beings. Likewise, the words ୬୫ଟି (pancasaThiTi) and ୬୫ଜଣ (pancasaThijaNa) have been added to ID no. 9488.

  9. In ID no. 6468, the Odia word ସମ୍ପୂର୍ଣ୍ଣଭାବେ (sampu:rNNabha:be) is an appropriate synonym of ସମ୍ପୂର୍ଣ୍ଣ (sampu:rNNa), the Odia equivalent of Hindi बिल्कुल (bilkul). In ID no. 28602, the Odia word ଏକାଠି (eka:Thi) has likewise been found to be an appropriate synonym of ଏକାସାଙ୍ଗରେ (eka: sa:ngare), the Odia equivalent of Hindi साथ-साथ (sa:thsa:th). The new words are marked in red.

  10. Deleted Synsets during Sense-Marking:
     • Total number of deleted words: 109
     • Examples:
         • ବରଫି, ହିମଯୁକ୍ତ-3650
         • ଆବାଦୀ, ବସ୍ତି, ପଡ଼ା-5483

  11. In ID no. 3650, the words ବରଫି and ହିମଯୁକ୍ତ have been deleted. The reason is that ବରଫି (barafi) is the name of ‘a type of sweet’ and ହିମଯୁକ୍ତ (himajukta) means ‘added with snow/ice’, so neither has the same meaning as the words ବରଫାବୃତ, ବରଫାଚ୍ଛନ୍ନ, ହିମାବୃତ, ହିମାଚ୍ଛନ୍ନ (barafa:bruta, barafa:chana, hima:bruta, hima:chana), which mean ‘covered with snow/ice’. In ID no. 5483, the word ଆବାଦୀ (a:ba:di:) means ‘cultivable’ in Odia, and the words ବସ୍ତି (basti) and ପଡ଼ା (paDa:) mean a ‘hamlet’. These words give the same meaning as the word जनसंख्या (janasaMkhya:) in Hindi, but they do not mean the same as the Odia words ଜନସଂଖ୍ୟା (janasaMkhya:) and ଲୋକସଂଖ୍ୟା (lokasaMkhya:). So these words have been deleted.

  12. Antonym Tool Report: • Antonyms using the Antonym Creation Tool developed by Thapar University have been checked and the related issues have been discussed.

  13. Example 1:
     • ID-1471, CAT-NOUN
     • CONCEPT:: एक पौधे का बीज जिससे तेल निकलता है
     • EXAMPLE:: "वह प्रतिदिन नहाने के बाद तिल का तेल लगाता है"
     • SYNSET-HINDI:: तिल,पूतधान्य,साराल
     • CONCEPT:: ଏକଛୋଟ ଗଛର ମଞ୍ଜି ଯେଉଁଥିରୁ ତେଲ ବାହାରେ
     • EXAMPLE:: "ସେ ସବୁଦିନେ ଗାଧୋଇସାରି ରାଶି ତେଲ ଲଗାଏ"
     • SYNSET-ORIYA:: ରାଶି, ଖସା, ତିଳ
     Antonym ID:
     • ID-2629, CAT-NOUN
     • CONCEPT:: वह स्थान जहाँ दवाएँ मिलती या बिकती हों
     • EXAMPLE:: "निषिद्ध दवा बेंचने के कारण यह औषधालय बंद हो गया"
     • SYNSET-HINDI:: औषधालय,दवाख़ाना,दवाखाना,औषध दुकान,दवाघर
     • CONCEPT:: ଯେଉଁ ସ୍ଥାନରେ ଔଷଧ ମିଳେ ବା ବିକ୍ରି ହୁଏ
     • EXAMPLE:: "ନିଷିଦ୍ଧ ଔଷଧ ବିକ୍ରିକରିବା କାରଣରୁ ସେହି ଔଷଧାଳୟ ବନ୍ଦ ହୋଇଗଲା"
     • SYNSET-ORIYA:: ଔଷଧାଳୟ, ଔଷଧଦୋକାନ, ଓଷଦ ଦୋକାନ

  14. Discussion: In this example, the Hindi words til (तिल), pu:tdha:ny (पूतधान्य), and sa:ra:l (साराल), which mean ‘gingelly oil plant and seed’, and aushadha:lay (औषधालय), dava:xa:na: (दवाख़ाना), dava:kha:na: (दवाखाना), aushadhduka:n (औषध दुकान), and dava:ghar (दवाघर), which mean ‘a medicine store’, are listed as antonyms. Their Odia equivalents, ra:si (ରାଶି), tiLa (ତିଳ), and khasa: (ଖସା) ‘gingelly oil plant and seed’ and ausadha:Laya (ଔଷଧାଳୟ), ausadhadoka:na (ଔଷଧଦୋକାନ), and osadadoka:na (ଓଷଦ ଦୋକାନ) ‘medicine store’, cannot be accepted as antonyms in Odia.

  15. Example 2:
     • ID-40, CAT-NOUN
     • CONCEPT:: मादा गीदड़
     • EXAMPLE:: "जंगल में एक गीदड़ी अपने बच्चे को दूध पिला रही थी"
     • SYNSET-HINDI:: गीदड़ी,सियारिन,सियारनी,जंबुकी,जम्बुकी,शृगाली,शृगालिका,लोपापिका,लोमाशिका, लोमसिक, सृगाली, सृगालिका, शिजवा, वामी
     • CONCEPT:: ମାଈ ବିଲୁଆ
     • EXAMPLE:: "ଜଙ୍ଗଲରେ ଗୋଟିଏ ମାଈ ବିଲୁଆ ତା ଛୁଆକୁ କ୍ଷୀର ପିଆଉଥିଲା"
     • SYNSET-ORIYA:: ମାଈ_ବିଲୁଆ , ମାଈ_ଶିଆଳ, ମାଈ_ଶୃଗାଳ
     • Antonym of ID no. 40 is 24
     • ID-24, CAT-NOUN
     • CONCEPT:: मादा शेर
     • EXAMPLE:: "शेरनी शेर से अधिक खूँखार होती है / गुरु भक्त शिवाजी समर्थ गुरु रामदास का पेट दर्द ठीक करने के लिए शेरनी का दूध लाए"
     • SYNSET-HINDI:: शेरनी,मादा बाघ,मादा व्याघ्र,बाघिन,व्याघ्री
     • CONCEPT:: ମାଈ ବାଘ
     • EXAMPLE:: "ବାଘୁଣୀ ବାଘଠାରୁ ଅଧିକ ହିଂସ୍ର/ଗୁରୁଭକ୍ତ ଶିବାଜୀ ସମର୍ଥ ଗୁରୁରାମଦାସଙ୍କ ପେଟ ଯନ୍ତ୍ରଣା ଭଲ କରିବାପାଇଁ ବାଘୁଣୀର କ୍ଷୀର ଆଣିଥିଲେ"
     • SYNSET-ORIYA:: ବାଘୁଣୀ, ମାଈ ବାଘ

  16. Discussion: • In this example, the Hindi word gi:dDi: (गीदड़ी) is stated as the antonym of sherni: (शेरनी). The Odia equivalents of the former are ma:i: bilua: (ମାଈ_ବିଲୁଆ), ma:i: sia:La (ମାଈ_ଶିଆଳ), and ma:i: sruga:La (ମାଈ_ଶୃଗାଳ), which mean ‘female jackal’. These are not usually acceptable as antonyms of ba:ghuNi: (ବାଘୁଣୀ) and ma:i: ba:gha (ମାଈ_ବାଘ), ‘female tiger or tigress’.

  17. Example 3: Hindi synset and category wrong
     • ID-2234, CAT-NOUN
     • CONCEPT:: एक वृक्ष जिसके मीठे फूलों से शराब और अन्य खाद्य वस्तुएँ बनती हैं
     • EXAMPLE:: "महुए की लकड़ी मानव के लिए बहुत उपयोगी होती हैं"
     • SYNSET-HINDI:: महुआ,मधु,मधुक,महूक,महूख,मधुष्ठील,मधुवृक्ष,मधुशाक,महाद्रुम
     • ID-2234, CAT-NOUN
     • CONCEPT:: ଯେଉଁ ବୃକ୍ଷର ମିଠା ଫୁଲରୁ ମଦ ଏବଂ ଅନ୍ୟ ଖାଦ୍ୟ ବସ୍ତୁ ତିଆରି ହୁଏ
     • EXAMPLE:: "ମହୁଲ କାଠ ମଣିଷପାଇଁ ବହୁତ ଉପଯୋଗୀ ହୋଇଥାଏ"
     • SYNSET-ORIYA:: ମହୁଲ, ମହୁଆ
     Antonym ID:
     • ID-8378, CAT-ADJECTIVE
     • CONCEPT:: सामना होने पर संकोचवश होनेवाला
     • EXAMPLE:: "वह जब भी मिलता है,मुँह-देखी प्रशंसा करना शुरु कर देता है"
     • SYNSET-HINDI:: मुँह-देखा
     • ID-8378, CAT-ADJECTIVE
     • CONCEPT:: ମୁହାଁମୁହିଁ ହୋଇଗଲେ ମୋହବତିଆ ବ୍ୟବହାର
     • EXAMPLE:: "ଯେତେବେଳେ ବି ଦେଖାହୁଏ, ସେ ଉପରଠାଉରିଆ ପ୍ରଶଂସା କରିବା ଆରମ୍ଭ କରିଦିଅନ୍ତି"
     • SYNSET-ORIYA:: ଉପରଠାଉରିଆ, ଉପୁରିଆ
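One automatic check suggested by this example is to flag antonym links whose two synsets belong to different grammatical categories. The sketch below is an assumption-laden illustration, not the logic of the Antonym Creation Tool: the `AntonymLink` class and `cross_category_links` function are hypothetical, and the three links are the ID pairs quoted in Examples 1 to 3.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AntonymLink:
    """Hypothetical record of one antonym link between two synset IDs."""
    source_id: int
    source_cat: str
    target_id: int
    target_cat: str


# ID pairs and categories as quoted in Examples 1 to 3.
links = [
    AntonymLink(1471, "NOUN", 2629, "NOUN"),
    AntonymLink(40, "NOUN", 24, "NOUN"),
    AntonymLink(2234, "NOUN", 8378, "ADJECTIVE"),
]


def cross_category_links(candidates):
    """Return the links whose source and target categories differ."""
    return [
        link for link in candidates
        if link.source_cat.upper() != link.target_cat.upper()
    ]


flagged = cross_category_links(links)
for link in flagged:
    print(f"ID {link.source_id} ({link.source_cat}) -> "
          f"ID {link.target_id} ({link.target_cat})")
```

Only the link from ID 2234 to ID 8378 is flagged. Examples 1 and 2 pass this check even though their antonym pairs are rejected in the discussions, so a category check of this kind can only supplement manual review, not replace it.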

  18. Example 3…

  19. In this example, the Hindi word प्रयोजनहीनतः (prayojanhinataH) is stated as the antonym of प्रयोजनतः (prayojanataH), and both words are placed under the sub-category ‘adverb of quality’. But their usage and meaning show that they should come under the sub-category ‘adverb of reason’. Moreover, the sub-category ‘adverb of reason’ is not listed in the drop-down menu. Therefore, the sub-categories of adverbs should be revisited.

  20. Category changes from Action to State:

  21. Category changes from Action to Place:

  22. ଓଡ଼ିଆ ଶବ୍ଦକୁଟୁମ୍ବ (ODIA WORDNET). Odia Wordnet website link: http://indradhanush.unigoa.ac.in/odiawordnet/

  23. Home Page

  24. Future Plan: • An Online Dictionary • A Thesaurus
