Software Applications for Processing Romanian Texts: Overview and Demonstration
This presentation demonstrates and compares software applications for processing Romanian texts: the Romanian Morphological Dictionary (DMR), developed by ITC S.A. and RoLingva; LEXICON, for updating the attributes of lexical entries; SIASTRO-AM, for phrase analysis; and ETR, for term extraction from specialised texts. Together they cover paradigm generation, accentuation and syllabification, morphological analysis, phrase parsing with annotated-text output, and multi-user collection of lexical data through a friendly interface.
Presentation Transcript
Software Applications for Processing Romanian Texts: Demonstration and Comparison • Sanda Cherata • Babeş-Bolyai University, Faculty of Letters
Software Applications • The Romanian Morphological Dictionary (DMR) – Software ITC SA – RoLingva www.rolingva.ro • LEXICON – for updating attributes in lexical entries • SIASTRO-AM – phrase analysis of noun, adjective, adverb, verb and prepositional phrases • ETR – term extractor for Romanian specialised texts
DMR • Paradigm of a given lemma, shown in classic form or as stem + termination • Accents • Syllabification • Morphological analysis of a given word
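The "stem + termination" view of a paradigm can be illustrated with a minimal Python sketch. It assumes the stem is simply the longest prefix shared by all forms, which is a simplification chosen for illustration and not necessarily how DMR segments words; the function name `split_paradigm` is hypothetical, and the sample forms are a partial paradigm of the noun "casă" (house).

```python
import os.path

def split_paradigm(forms):
    """Split inflected forms into a shared stem and per-form terminations.

    Assumption: the stem is the longest common prefix of all forms.
    DMR's own segmentation rules may differ.
    """
    # commonprefix compares strings character by character
    stem = os.path.commonprefix(forms)
    terminations = [form[len(stem):] for form in forms]
    return stem, terminations

# A few forms of the Romanian noun "casă" (illustrative, not the full paradigm)
FORMS = ["casă", "casa", "case", "casele"]
stem, terminations = split_paradigm(FORMS)
for form, ending in zip(FORMS, terminations):
    print(f"{form} = {stem} + {ending}")
```

A real morphological dictionary would also have to handle stem alternations, which a plain common-prefix split cannot capture.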
LEXICON • Specifying attributes for lexico-morphological classes • Designed to collect data from multiple users • Friendly interface
SIASTRO-AM • Lexico-morphological analysis • Parsing of noun, adjective, adverb, verb and prepositional phrases • Uses a lexicon based on DMR, enriched with new lexical and syntactic attributes added with the LEXICON application • Outputs an annotated text
SIASTRO-AM: Tags for text elements
• {F sentence F}: "{F" starts a sentence, "F}" ends it
• {C word C}: "{C" starts a word, "C}" ends it
• {N unknown word N}: "{N" starts an unknown word, "N}" ends it
• {D number D}: "{D" starts a number, "D}" ends it
• {S punctuation sign S}: "{S" starts a punctuation sign, "S}" ends it
• {L - L}: "{L" starts a hyphen, "L}" ends it
• {I sequence I}: "{I" starts an ignored sequence, "I}" ends it
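To show how these element tags might be consumed, here is a minimal Python sketch that splits an annotated text into sentences and typed elements. It assumes each element is written as "{X content X}" with whitespace around the content and that only sentences contain other elements; the sample string and the function name `split_elements` are made up for illustration and are not actual SIASTRO-AM output or API.

```python
import re

# Tag letters and their meanings, as listed on the slide
ELEMENT_KINDS = {
    "C": "word",
    "N": "unknown word",
    "D": "number",
    "S": "punctuation sign",
    "L": "hyphen",
    "I": "ignored sequence",
}

# Assumption: "{F ... F}" wraps a sentence, "{X content X}" wraps one element
SENTENCE_RE = re.compile(r"\{F\s+(.*?)\s+F\}", re.DOTALL)
ELEMENT_RE = re.compile(r"\{([CNDSLI])\s+(.*?)\s+\1\}", re.DOTALL)

def split_elements(annotated):
    """Return one list of (kind, content) pairs per tagged sentence."""
    sentences = []
    for sentence in SENTENCE_RE.finditer(annotated):
        elements = [
            (ELEMENT_KINDS[m.group(1)], m.group(2))
            for m in ELEMENT_RE.finditer(sentence.group(1))
        ]
        sentences.append(elements)
    return sentences

# Hypothetical, simplified sample (word tags carry no analysis here)
SAMPLE = "{F {C casa C} {N xyzq N} {D 42 D} {S . S} F}"
for kind, content in split_elements(SAMPLE)[0]:
    print(f"{kind}: {content}")
```

Regular expressions are enough here because the tag set is small and the closing tag repeats the opening letter; deeper nesting would call for a proper parser.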
SIASTRO-AM: Tags for words
• General form: {C word (part of speech + grammatical category + grammatical category + ..., part of speech + grammatical category + grammatical category + ...) syllabification + accent position:, (...), ... (...) syllabification + accent position: + lemma +: ... C}
• A comma inside the parentheses separates parts of speech; a comma between parenthesised groups separates homographs
• Example: {C date (vrb+p_fp+, sbt+fdpn+fisn+fipn+fvpa+, adj+fdpn+fisn+fipn+fvpa+ ) da-te+2:+da+:+dată+:+dat+: C}
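The word-tag layout can be made concrete with a small Python sketch that parses the "date" example from the slide. It handles only the simple case of a single parenthesised group followed by one "syllabification+accent:+lemma+:" tail; homographs with several groups are not covered. The function name `parse_word_tag`, the field names, and the reading of the number as an integer accent position are assumptions made for illustration.

```python
import re

# Assumption: one "(...)" group of readings, then a whitespace-free tail
WORD_TAG_RE = re.compile(r"\{C\s+(\S+)\s+\((.*?)\)\s*(\S+)\s+C\}")

def parse_word_tag(tag):
    """Parse a single-group {C ... C} word tag into a dictionary."""
    match = WORD_TAG_RE.fullmatch(tag.strip())
    if match is None:
        raise ValueError("not a single-group word tag")
    word, readings_part, tail = match.groups()

    # Commas separate parts of speech; "+" separates grammatical categories
    readings = []
    for reading in readings_part.split(","):
        fields = [f for f in reading.strip().split("+") if f]
        if fields:
            readings.append({"pos": fields[0], "categories": fields[1:]})

    # Tail looks like "da-te+2:+da+:+dată+:+dat+:"
    head, *rest = tail.split(":")
    syllabified, _, accent = head.rpartition("+")
    lemmas = [piece.strip("+") for piece in rest if piece.strip("+")]

    return {
        "word": word,
        "readings": readings,
        "syllables": syllabified.split("-"),
        "accent": int(accent),
        "lemmas": lemmas,
    }

# The example tag shown on the slide
EXAMPLE = ("{C date (vrb+p_fp+, sbt+fdpn+fisn+fipn+fvpa+, "
           "adj+fdpn+fisn+fipn+fvpa+ ) da-te+2:+da+:+dată+:+dat+: C}")
parsed = parse_word_tag(EXAMPLE)
print(parsed["word"], [r["pos"] for r in parsed["readings"]], parsed["lemmas"])
```

Keeping the readings and the lemmas as separate lists avoids guessing how they pair up, since the slide does not state that mapping explicitly.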
ETR – Future Developments • Syntactical analysis • Enriching the terminological form by adding new terminological features