
Enhancing Language Technology with GATE and the Semantic Web

This document provides an overview of GATE (General Architecture for Text Engineering) and its role in advancing the integration of language technology and the Semantic Web. It discusses the necessity for a ubiquitous, permeable, and companionable web, emphasizing machine-processable data and the concept of critical mass in semantics. Key features of GATE include its macro-level architecture, development environment, and tools for information extraction, evaluation, and visualisation. This resource is intended for language engineers and researchers aiming to leverage GATE in their projects.


Presentation Transcript


  1. GATE and the Semantic Web
     • Hamish Cunningham, Kalina Bontcheva, Wim Peters, Marin Dimitrov¹, Atanas Kiryakov¹
     • Department of Computer Science, University of Sheffield; ¹OntoText Lab, Sirma AI Ltd.
     • Brief intro to GATE (a General Architecture for Text Engineering)
     • Hand waving about LT and the Semantic Web
     • Demo

  2. A Ubiquitous Permeable Web
     • The next generation of the web must be:
       – ubiquitous: semantics for every device, every organisation, every individual;
       – permeable: allow contextual data to penetrate and persist;
       – companionable: able to engage with us via multiple natural modalities.
     • Roles for Language Technology:
       – discovery of semantics (ubiquity);
       – mediating between context and personal semantic memories (permeability);
       – conversing with people and the semantic web (companionableness).

  3. Critical Mass for the Semantic Web
     • The SW: machine-processable, repurposable data to complement hypertext
     • But: semantics = 0.0000000...% of the Web
     • How to achieve critical mass? Huge-scale automatic annotation. Requirements:
     • Huge scale:
       – freely available to all EU citizens
       – distributed (over a Grid)
       – repurposable (delivered as Web Services)
     • Portability and robustness via:
       – simple and therefore shallow HLT methods
       – +ve and –ve learning
       – analogs of IPSEs for computer-literate users
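
One way to picture a "simple and therefore shallow" method for automatic annotation is exact-match list lookup that emits stand-off annotations (start offset, end offset, type). The sketch below is only an illustration under that assumption: the slides do not say which methods are meant, and the `ShallowAnnotator` class, the `Annotation` type, the `annotate` method, and the one-entry word list are hypothetical names invented here, not GATE's gazetteer implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Illustrative only: a hypothetical list-lookup annotator, not part of GATE. */
final class ShallowAnnotator {

    /** Stand-off annotation: character offsets into the text plus a type label. */
    static final class Annotation {
        final int start;   // inclusive start offset
        final int end;     // exclusive end offset
        final String type; // label taken from the word list

        Annotation(int start, int end, String type) {
            this.start = start;
            this.end = end;
            this.type = type;
        }

        @Override
        public String toString() {
            return type + "[" + start + "," + end + ")";
        }
    }

    /** Finds every exact occurrence of each list entry (phrase -> type) in the text. */
    static List<Annotation> annotate(String text, Map<String, String> entries) {
        List<Annotation> found = new ArrayList<>();
        for (Map.Entry<String, String> entry : entries.entrySet()) {
            String phrase = entry.getKey();
            if (phrase.isEmpty()) {
                continue; // an empty phrase would match at every offset
            }
            int from = text.indexOf(phrase);
            while (from >= 0) {
                found.add(new Annotation(from, from + phrase.length(), entry.getValue()));
                from = text.indexOf(phrase, from + 1);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // Assumed one-entry word list, purely for demonstration.
        Map<String, String> entries = Map.of("Sheffield", "Location");
        System.out.println(annotate("GATE was developed in Sheffield.", entries));
    }
}
```

Because the annotations only point into the text by offset and never modify it, the same document could in principle be annotated independently by many such components, which suits the distributed, repurposable setting the slide asks for.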

  4. GATE is:
     • An architecture: a macro-level organisational picture for LE software systems.
     • A framework: for programmers, GATE is an object-oriented class library that implements the architecture.
     • A development environment: for language engineers, computational linguists et al., GATE is a graphical development environment bundled with a set of tools for doing e.g. Information Extraction.
     • Some free components... and wrappers for other people's components
     • Tools for: evaluation; visualisation/edit; persistence; IR; IE; dialogue; ontologies; etc.
     • Free software (LGPL). Download at http://gate.ac.uk/download/

  5. Architectural principles
     • Non-prescriptive, theory neutral (strength and weakness)
     • Re-use, interoperation, not reimplementation (e.g. v1 used LT-NSL for SGML input; v2 talks to other XML-based systems, APIs and standards)
     • (Almost) everything is a component, and component sets are user-extendable
     • Component-based development
     • An OO way of chunking software: Java Beans
     • GATE components: CREOLE = modified Java Beans (Collection of REusable Objects for Language Engineering)
     • The minimal component = 10 lines of Java, 10 lines of XML, 1 URL.
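
The component idea on this slide can be illustrated with a small, self-contained Java sketch. It deliberately does not use GATE's class library: the `Component` interface, the `Registry` class, and the `urn:example:` identifiers are hypothetical names made up here, chosen only to show small units of code registered under a URL-like key in a component set that users can extend.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Illustrative only: none of these types belong to GATE's real API. */
final class ComponentSketch {

    /** Hypothetical minimal component: takes text in, returns text out. */
    interface Component {
        String process(String text);
    }

    /** Hypothetical registry mapping a URL-like location to a component. */
    static final class Registry {
        private final Map<String, Component> components = new LinkedHashMap<>();

        /** Users extend the component set by registering under a new location. */
        void register(String location, Component component) {
            components.put(location, component);
        }

        Set<String> locations() {
            return components.keySet();
        }

        /** Runs every registered component in registration order. */
        String runAll(String text) {
            String result = text;
            for (Component component : components.values()) {
                result = component.process(result);
            }
            return result;
        }
    }

    public static void main(String[] args) {
        Registry registry = new Registry();
        // Each component is a few lines of Java plus an identifier (assumed names).
        registry.register("urn:example:trim", String::trim);
        registry.register("urn:example:lowercase", String::toLowerCase);
        System.out.println(registry.runAll("  GATE Components  "));
    }
}
```

In this sketch the registry knows nothing about what a component does, only where it was registered and how to call it, which is the sense in which such a design stays non-prescriptive about the processing inside each component.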

  6. Displaying Multilingual Data
     • All the visualisation and editing tools for ML LRs use enhanced Java facilities.

  7. GATE demo
     • Components and the main UI; the resources tree
     • Document formats, databases
     • IE, IR, annotation, evaluation, WordNet
     • Ontologies, OntoGazetteer, Protégé, DAML export
