
Some LIRICS topics Peter Wittenburg, Marc Kemps-Snijders MPI for Psycholinguistics 1 hour ?

Presentation Transcript


  1. Some LIRICS topics. Peter Wittenburg, Marc Kemps-Snijders, MPI for Psycholinguistics. 1 hour ?

  2. LMF Topics
     • is this LIRICS?
       • what is LMF compliance? Some ideas about header and related standards (UNICODE)
       • docking mechanism of components
       • relation mechanism
       • operation mechanism
     • this is LIRICS for MPI and Sheffield primarily
       • DCR Syntax API
       • LMF API
       • lexicon and component registries incl. metadata

  3. DCR API 1
     • tools such as GATE, LEXUS, ANNEX, … have to make use of the ISO DCR (and probably other ontologies/concept registries)
     • so we need an API (that can also be re-used)
     • the API finally has to be delivered as a web service including all aspects
       • UDDI layer to search/browse for services
       • WSDL layer to describe the interface (methods, …)
       • SOAP layer for message exchange
     • SYNTAX ISO DCR was not set up as a service, but as a management tool for editing boards
     • therefore a split into a final API (the ideal) and a first-phase API
     • in the first phase: no web service; URL and details of services are known
     • everything that follows is the result of smooth interaction with the LORIA folks

  4. DCR API 2
     • function List loadProfiles ()
       • give me all profiles in DCR
     • function List loadDataCategories (aProfile, aWorkingLanguage, aObjectLanguage)
       • give me all datcats for a certain profile; result is perhaps a structure
     • function List searchDataCategories (aQueryString, aProfile, aWorkingLanguage, aObjectLanguage)
       • search for a datcat by specifying some pattern – mostly a name
     • function DataCategory loadDataCategoryReduced (URID, aWorkingLanguage, aObjectLanguage)
       • give me some info for a certain datcat (ID, definition, conceptual domain)
     • function DataCategory loadDataCategory (URID, aWorkingLanguage, aObjectLanguage)
       • give me all info for a certain datcat for specified languages
     • function List loadAllTopBroaderGenericConcepts ()
       • give me all top conceptual domains
     • function List giveLinks (aDataCategory)
       • give me additional information such as constraints (to be worked out!)
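A minimal sketch of how the read-only calls on this slide could behave, assuming a toy in-memory registry instead of the real ISO DCR. The class names, the snake_case method names, the URIDs and the sample profiles are all invented for illustration; the object-language parameter is left out to keep the example short.

```python
from dataclasses import dataclass, field


@dataclass
class DataCategory:
    """Hypothetical, reduced view of a data category (not the ISO DCR model)."""
    urid: str
    name: str
    profile: str
    definitions: dict = field(default_factory=dict)  # working language -> definition


class InMemoryDcr:
    """Toy stand-in for a DCR; a real client would query the registry."""

    def __init__(self, categories):
        self._by_urid = {c.urid: c for c in categories}

    def load_profiles(self):
        # loadProfiles(): all profiles known to the registry
        return sorted({c.profile for c in self._by_urid.values()})

    def load_data_categories(self, profile):
        # loadDataCategories(aProfile, ...): all datcats of one profile
        return [c for c in self._by_urid.values() if c.profile == profile]

    def search_data_categories(self, query, profile=None):
        # searchDataCategories(aQueryString, aProfile, ...): match on the name
        q = query.lower()
        return [c for c in self._by_urid.values()
                if q in c.name.lower() and (profile is None or c.profile == profile)]

    def load_data_category_reduced(self, urid, working_language):
        # loadDataCategoryReduced(URID, ...): ID and definition only
        c = self._by_urid[urid]
        return {"urid": c.urid, "definition": c.definitions.get(working_language)}


# Invented sample data, only for the demonstration.
dcr = InMemoryDcr([
    DataCategory("dc-1", "partOfSpeech", "Morphosyntax", {"en": "word class"}),
    DataCategory("dc-2", "grammaticalGender", "Morphosyntax", {"en": "gender"}),
    DataCategory("dc-3", "definition", "Lexicography", {"en": "meaning statement"}),
])

print(dcr.load_profiles())
print([c.name for c in dcr.search_data_categories("gender")])
```

Keeping the calls behind one small class means the first-phase version (known URL, no web service) and a later web-service version could share the same method names.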

  5. DCR API 3
     • function DataCategory loadBroaderGenericConceptDataCategory (aDataCategory/URID)
       • give me a broader concept for a name or URID
     • function List loadDataCategoriesUsingBroaderGenericConcept (aDataCategory)
       • give me all datcats for a broader concept
     • function Workspace openWorkspace (aUserName, aUserLogin)
       • open a private workspace for a user
     • function DataCategory AddDataCategoryToWorkspace (aWorkspace, aDataCategory, aStatus)
       • add a datcat to a workspace
     • log into the system
     • synchronize with a given cache
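The broader-concept lookups and the workspace calls can be sketched as follows. This is only an illustration under stated assumptions: the concept hierarchy is a made-up dictionary, authentication (aUserLogin) is skipped, and all Python names are hypothetical.

```python
# child URID -> URID of its broader generic concept (invented sample data)
BROADER = {"noun": "partOfSpeech", "verb": "partOfSpeech", "partOfSpeech": None}


def load_broader_generic_concept(urid):
    # loadBroaderGenericConceptDataCategory(URID): one step up the hierarchy
    return BROADER.get(urid)


def load_data_categories_using_broader_generic_concept(broader):
    # loadDataCategoriesUsingBroaderGenericConcept(aDataCategory)
    return sorted(u for u, b in BROADER.items() if b == broader)


class Workspace:
    """Private workspace of one user: URID -> status."""

    def __init__(self, user_name):
        self.user_name = user_name
        self.entries = {}


def open_workspace(workspaces, user_name):
    # openWorkspace(aUserName, aUserLogin): login check omitted in this sketch
    if user_name not in workspaces:
        workspaces[user_name] = Workspace(user_name)
    return workspaces[user_name]


def add_data_category_to_workspace(workspace, urid, status):
    # AddDataCategoryToWorkspace(aWorkspace, aDataCategory, aStatus)
    workspace.entries[urid] = status
    return urid


workspaces = {}
ws = open_workspace(workspaces, "anna")
add_data_category_to_workspace(ws, "noun", "draft")
print(ws.entries)
print(load_data_categories_using_broader_generic_concept("partOfSpeech"))
```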

  6. LMF Registries General
     • an LMF API only makes sense if you have a service
     • a service only makes sense if you can serve something
     • what can LMF services serve:
       • lexical schemas
       • extensions, i.e. ready-made components created by someone
       • LMF compliant lexica created by someone
       • other lexica related information
     • so we need registration services
     • MPI will start doing so since we need it now
       • will set things up similar to IMDI
       • all is open (registries and portal code)
       • everyone can easily set up his/her own portal
       • will and have to synch about various things
       • perhaps people will like it

  7. LMF Registries
     • registries must give the following services
       • register and store a lexical schema
       • register and store a lexical component schema
       • register (and store) a lexicon (storage can be everywhere)
       • delete an entry
       • modify an entry
       • let the user browse or search for lexica based on metadata
       • let the user browse or search for schemas based on metadata
     • as a start for metadata we suggest the stuff that came out of the discussions in ISLE/MILE (see IMDI)
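The registry services listed on this slide can be sketched as a small in-memory class. Everything here is an assumption made for illustration: the class and method names, the ID format, the example URL and the metadata keys (which do not follow the IMDI or ISLE/MILE sets).

```python
import itertools


class Registry:
    """Toy registry for schemas, component schemas and lexica."""

    def __init__(self):
        self._entries = {}
        self._ids = itertools.count(1)

    def register(self, kind, metadata, location=None, content=None):
        # A lexicon may be stored elsewhere, so only its location is kept;
        # schemas can be stored in the registry itself via `content`.
        entry_id = f"{kind}-{next(self._ids)}"
        self._entries[entry_id] = {
            "kind": kind,
            "metadata": dict(metadata),
            "location": location,
            "content": content,
        }
        return entry_id

    def modify(self, entry_id, **metadata):
        self._entries[entry_id]["metadata"].update(metadata)

    def delete(self, entry_id):
        del self._entries[entry_id]

    def search(self, kind, **criteria):
        # Metadata-based search: every given key must match exactly.
        return sorted(
            eid for eid, e in self._entries.items()
            if e["kind"] == kind
            and all(e["metadata"].get(k) == v for k, v in criteria.items())
        )


registry = Registry()
schema_id = registry.register("schema", {"title": "simple core schema"},
                              content="<schema/>")
lexicon_id = registry.register("lexicon", {"language": "Dutch"},
                               location="https://example.org/lexica/nl")
print(registry.search("lexicon", language="Dutch"))
```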

  8. LMF API 1
     • if we have found a lexicon what then …
     • services based on web-services
     • UDDI level – why not the ISLE/MILE stuff
     • function LexicalDatabase createLexicalDatabase (Name)
       • create a lexicon in a workspace
     • function LexicalDatabase loadLexicalDatabase (URID)
       • give me a certain lexicon (into workspace or local?)
     • function LexicalDatabase loadLexicalDatabaseDetail (URID, aStructure)
       • give me a certain lexicon part (filtering into workspace or local?)
     • function void storeLexicalDatabase (LexicalDatabase)
       • store/upload a lexical database
     • function LexicalEntry createLexicalEntry (URID, LexicalDatabase)
       • create/add a lexical entry
     • function LexicalEntry loadLexicalEntry (URID)
       • load a lexical entry
     • function void storeLexicalEntry (URID, LexicalEntry)
       • update a lexical entry
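A rough sketch of the create/load/store cycle, assuming an in-memory store and copy semantics (loading returns a private copy, storing uploads a copy). The slide leaves "into workspace or local?" open, so this choice is only one possible reading; class names, URID formats and the sample entry are hypothetical.

```python
import copy
import itertools
from dataclasses import dataclass, field


@dataclass
class LexicalEntry:
    urid: str
    data: dict = field(default_factory=dict)


@dataclass
class LexicalDatabase:
    urid: str
    name: str
    entries: dict = field(default_factory=dict)  # entry URID -> LexicalEntry


class LexiconStore:
    """Toy server side: keeps stored copies of lexical databases."""

    def __init__(self):
        self._databases = {}
        self._ids = itertools.count(1)

    def create_lexical_database(self, name):
        # createLexicalDatabase(Name): lives in the caller's workspace until stored
        return LexicalDatabase(f"lexdb:{next(self._ids)}", name)

    def store_lexical_database(self, database):
        # storeLexicalDatabase(LexicalDatabase): upload a snapshot
        self._databases[database.urid] = copy.deepcopy(database)

    def load_lexical_database(self, urid):
        # loadLexicalDatabase(URID): hand out a private copy
        return copy.deepcopy(self._databases[urid])

    def load_lexical_entry(self, urid):
        # loadLexicalEntry(URID): entries are looked up across stored databases
        for database in self._databases.values():
            if urid in database.entries:
                return copy.deepcopy(database.entries[urid])
        raise KeyError(urid)


def create_lexical_entry(urid, database):
    # createLexicalEntry(URID, LexicalDatabase)
    entry = LexicalEntry(urid)
    database.entries[urid] = entry
    return entry


store = LexiconStore()
db = store.create_lexical_database("demo")
entry = create_lexical_entry("entry:bank", db)
entry.data["lemma"] = "bank"
store.store_lexical_database(db)
entry.data["lemma"] = "changed locally"  # not visible in the store until stored again
print(store.load_lexical_entry("entry:bank").data)
```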

  9. LMF API 2
     • function List searchLexicalEntries (aQuery)
       • search for lexical entries matching the string unstructured
     • function List searchLinguisticInformationUnits (aQuery, aStructure)
       • search for lexical entries matching the string on specific attributes
       • returning substructures (filtering)
     • In addition
     • function LexicalDatabaseSchema loadSchema (URID)
       • give me a schema for a specific lexicon
     • function void store (LexicalDatabaseSchema)
       • store and register a schema
     • function GlobalInfo loadGlobalInfo (URID)
       • give me the metadata/globalInfo for a lexicon
     • function void storeGlobalInfo (GlobalInfo)
       • store and register a lexicon with metadata
     • function List searchLexicalDatabase (aQuery)
       • search for lexica based on metadata
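The difference between the unstructured search and the attribute-restricted, filtering search can be sketched like this. The shape of the structure argument (a dict with "match" and "return" lists), the flat entry layout and the sample entries are assumptions, not part of the proposed API.

```python
# Invented sample entries in a flat attribute/value layout.
ENTRIES = [
    {"lemma": "bank", "pos": "noun", "gloss": "something broad to sit on"},
    {"lemma": "schmal", "pos": "adjective", "gloss": "contrary to broad"},
]


def search_lexical_entries(query):
    # searchLexicalEntries(aQuery): match the string anywhere in the entry
    q = query.lower()
    return [e for e in ENTRIES if any(q in str(v).lower() for v in e.values())]


def search_linguistic_information_units(query, structure):
    # searchLinguisticInformationUnits(aQuery, aStructure):
    # match only on structure["match"], return only structure["return"]
    q = query.lower()
    hits = []
    for e in ENTRIES:
        if any(q in str(e.get(a, "")).lower() for a in structure["match"]):
            hits.append({a: e[a] for a in structure["return"] if a in e})
    return hits


print(len(search_lexical_entries("broad")))
print(search_linguistic_information_units(
    "bank", {"match": ["lemma"], "return": ["lemma", "pos"]}))
```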

  10. LMF API 3
     • what about relations
       • where to store relations – need registry mechanism
       • if there is one integrated domain of lexica, relations can be registered under this common root
     • Gil: took UML – UML has everything in it
       • so also relations are in UML – so why bother
     • Peter: where to register is the question
     • these were just first ideas!!
     • Monica/Thierry have built a tool for the constraints

  11. What else: Relations
     • need various types of relations between attributes and units in value strings
     • each relation can be associated with features, i.e. relations can be seen as components in their own right
     • actually, component association is a relation of a special type
     [Figure: example entries linked by relations. "bank": "breite Sitzgelegenheit" (something broad to sit on); "sitzgelegenheit": "etwas um zu sitzen" (something to sit on); "schmal": "Gegenteil zu breit" (contrary to broad)]

  12. Relation Mechanism
     [Figure: relation U (type = refine) between component X and component Y, with from/to cardinalities 1..1 and 1..N; relation V (type = any) between component K and component L, with from/to cardinalities 1..N and 1..N]
     • need a generalized relation mechanism (look in Parole lexica etc)
     • prefer very simple graphics instead of UML hiding the essentials
     • relation components are almost normal components, i.e. they can have components and datcats
     • however they don’t have a parent
     • do we need the distinction between “to” and “from” in general relations??
     • component reference is a special type of relation
       • here we need to distinguish “to” and “from”
     • added a few additional things in the paper (direction)
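One way to picture relations as "almost normal components" without a parent is the data-structure sketch below. The class layout, the field names and the assignment of the cardinalities to the "from" and "to" sides are assumptions; the flattened figure does not show which side carries which value.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Component:
    name: str
    parent: Optional["Component"] = None
    datcats: dict = field(default_factory=dict)


@dataclass
class Relation:
    """Like a component (it can carry datcats), but it has no parent."""
    name: str
    kind: str                      # e.g. "refine" or "any"
    source: Component              # the "from" side
    target: Component              # the "to" side
    from_cardinality: str = "1..N"
    to_cardinality: str = "1..N"
    datcats: dict = field(default_factory=dict)


def parse_cardinality(text):
    # "1..N" -> (1, None); "1..1" -> (1, 1); None means unbounded
    low, high = text.split("..")
    return int(low), (None if high == "N" else int(high))


def allows(cardinality, count):
    low, high = parse_cardinality(cardinality)
    return count >= low and (high is None or count <= high)


x, y = Component("X"), Component("Y")
u = Relation("U", "refine", source=x, target=y,
             from_cardinality="1..1", to_cardinality="1..N")
print(allows(u.from_cardinality, 2), allows(u.to_cardinality, 5))
```

For an undirected relation the source/target split could be replaced by an unordered pair, which is exactly the open question on the slide.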

  13. What else: conditions (operations)
     • just one example from DOBES
     [Figure: entry structure depending on the value of lexemtype. If lexemtype = “stem | idiom | lexical word”: head, sense nr, outer-body-L, meaning, etc. If lexemtype = “auxil | inflect affix”: sense nr, meaning, effect, categorial effect, etc.]
     • probably better examples around
     • if value(X) then modify constraints(Y)
     • etc
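The rule "if value(X) then modify constraints(Y)" can be sketched with the lexemtype example. The assignment of component names to the two branches follows one reading of the flattened figure and may not match the original DOBES schema; the function names are hypothetical.

```python
# (values of lexemtype, components allowed for those values)
# Branch contents are an assumption reconstructed from the slide text.
RULES = [
    ({"stem", "idiom", "lexical word"},
     ["head", "sense nr", "outer-body-L", "meaning"]),
    ({"auxil", "inflect affix"},
     ["sense nr", "meaning", "effect", "categorial effect"]),
]


def allowed_components(lexemtype):
    # The value of one datcat (lexemtype) selects the constraints on the rest.
    for values, components in RULES:
        if lexemtype in values:
            return components
    return []


def validate(entry):
    # Return the components that are not allowed for this entry's lexemtype.
    allowed = set(allowed_components(entry["lexemtype"])) | {"lexemtype"}
    return sorted(key for key in entry if key not in allowed)


print(validate({"lexemtype": "stem", "head": "x", "effect": "y"}))
```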

  14. Operation Mechanism
     • well – nothing special perhaps (operators as datcats – see Gil)
     • but need sequence of operations
     • but we need to be able to add complex operations (code)
     • then need an invocation and interfacing standard
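A small sketch of what a sequence of operations with a common invocation interface could look like. The convention chosen here (each operation is registered under a name and maps one value to one value) is an assumption for illustration, not a proposed standard.

```python
OPERATIONS = {}


def operation(name):
    """Register a function under a name so it can be invoked by that name."""
    def register(func):
        OPERATIONS[name] = func
        return func
    return register


@operation("strip")
def strip(value):
    return value.strip()


@operation("lowercase")
def lowercase(value):
    return value.lower()


def run_sequence(names, value):
    # Apply the named operations in order; each receives the previous result.
    for name in names:
        value = OPERATIONS[name](value)
    return value


print(run_sequence(["strip", "lowercase"], "  Bank "))
```

Complex operations (arbitrary code) fit the same scheme as long as they respect the one-value-in, one-value-out interface.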

  15. LEXUS etc
     • we need to go ahead since we have to deliver usable infrastructures
     • so we hope for critical comments and fast convergence :-)
     • relation mechanism is next on our action list
     • LMF API relevant for us since we have to combine LEXUS and LAMUS
     • LAMUS = Language Archive Management and Upload System
       • is ready and working for simple objects (annotated media, …)
       • but need to handle complex objects such as lexica
       • metadata is done – people can integrate and search for lexica
       • registries for schemas and sub-schemas come next as well

  16. LEXUS state
     • ISO DCR integrated – Shoebox MDF as well, GOLD to come
     • private and protected workspace is ready
     • Shoebox/CHAT filters ready, XML grabbing to come
     • first cross lexicon search is ready
     • working on private DCR (stripped but compliant Syntax)
     • working on Concept Profiles (bottom up generated concept lists)
     • working on tools to link bottom up stuff with ISO, …
     • working on easy mapping framework
     • first interaction with corpora is ready
     • first merging is integrated
     • what else??

  17. Logging onto the application
     Users must authenticate before logging onto the application.
     Workshop ‘Lexical Databases and digital tools’, Nijmegen, April 2004

  18. User workspace
     Each user has his/her own personal workspace where private lexica are stored.

  19. Lexicon creation
     New lexica may be created…

  20. Lexicon import
     New lexica may be imported from a lexical resource…

  21. Lexicon structure
     The LMF core model can be identified in this simple structure. Components and data categories can be identified using different icons. All may be dynamically created or modified.

  22. Lexicon structure
     Representation of a more complex structure. By selecting a node in the tree, the content of a component or data category is shown and may be modified.

  23. Data category selection
     Data categories can easily be selected from data category registries.

  24. Lexical entry overview
     Overview of lexical entries. By selecting a lexical entry the details will be revealed.

  25. Lexical entry details
     Details of a lexical entry. Entry structure modifications are bound to the schema definition, e.g. cardinality.

  26. Lexical entry details
     Attribute values can be easily modified. Various value types are supported (text, video, audio, image or file).

  27. Lexical entry details
     Example of uploading a video file.

  28. Lexical entry details
     Viewing multimedia content.

  29. Alternative entry view
     Alternative views are provided which may be customized in look and feel.

  30. Synchronization of lexica
     Lexica may be copied to and modified in the personal workspace.
     [Figure labels: Personal Workspace, Main Lexicon]

  31. Synchronization of lexica
     Lexica may be synchronized with the main lexicon.
     [Figure labels: Personal Workspace, Main Lexicon]

  32. Synchronization of lexica
     When synchronizing lexica the user is notified of structural changes and is in total control of the synchronization process.
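The idea of notifying the user of structural changes and letting the user decide can be sketched as below. This is not how LEXUS implements synchronization; the representation of a schema as a list of component paths, the function names and the accept callback are all assumptions.

```python
def structural_changes(main_schema, workspace_schema):
    # Compare two schemas given as collections of component paths.
    main, work = set(main_schema), set(workspace_schema)
    return {"added": sorted(work - main), "removed": sorted(main - work)}


def synchronize(main_schema, workspace_schema, accept):
    # `accept(kind, path)` stands in for the user's decision on each change.
    changes = structural_changes(main_schema, workspace_schema)
    result = set(main_schema)
    for path in changes["added"]:
        if accept("added", path):
            result.add(path)
    for path in changes["removed"]:
        if accept("removed", path):
            result.discard(path)
    return sorted(result)


main = ["entry/form", "entry/sense"]
work = ["entry/form", "entry/sense", "entry/sense/video"]
print(structural_changes(main, work))
print(synchronize(main, work, lambda kind, path: kind == "added"))
```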

  33. Future directions
     • Support for various types of relations
     • Import of data from other sources
     • Support for other Data Category Registries, e.g. GOLD
     • Integration with MPI archive
     • Integration with exploitation tools (ELAN, ANNEX)
     • Miscellaneous user requests
