
ATLAS Demystified: A Practical Introduction

ATLAS Demystified: A Practical Introduction. Christophe Laprun, Jonathan Fiscus, John Garofolo, Sylvain Pajot. National Institute of Standards and Technology. May 31, 2002. Annotation Frameworks and Tools, LREC 2002. Overview: ATLAS = Architecture and Tools for Linguistic Analysis Systems.


Presentation Transcript


  1. ATLAS Demystified: A Practical Introduction. Christophe Laprun, Jonathan Fiscus, John Garofolo, Sylvain Pajot. National Institute of Standards and Technology. May 31, 2002. Annotation Frameworks and Tools, LREC 2002

  2. Overview
     • ATLAS = Architecture and Tools for Linguistic Analysis Systems
     • Goal: to make the Linguistic Annotation Universe == the ATLAS-describable Universe
     • We need to examine more annotation tasks
     • The ATLAS-describable Universe is generated by the ATLAS ontology: “An annotation is the fundamental act of associating some content to a region in a signal”
     [Diagram labels: Universe; Linguistic Annotation Universe; ATLAS-describable Universe]

  3. Brief History
     • Started with Bird and Liberman’s Annotation Graphs (AGs)
     • ATLAS working group formed to explore AG concept
       • LDC, MITRE and NIST
       • Introduced at LREC 2000
     • Since LREC 2000:
       • LDC pursued Annotation Graph implementation
         • To satisfy immediate annotation needs for speech and text
         • Developed AGTK
         • Optimized for annotation of linear signals
       • NIST pursued generalized ATLAS model
         • 2001 - Multidimensional signals
         • 2002 - Type support, explicit support for hierarchies

  4. Motivation for Generalization
     • A long-term solution was needed
     • Linguistic research is rapidly moving beyond linear signals
     • Multi-modal: complex signals with varying dimensionality
       • NIST Meeting Room data includes speech, video, and whiteboard interaction
       • Automatic Content Extraction (ACE) program includes extraction from speech, text and image data
       • Gesture annotation ideally involves 3-dimensional space over time
       • …

  5. Additional Needs Addressed During Generalization
     • Type definition support
       • Define the content, structure and relationships between annotations
       • Dual use: provides corpus design definition to framework and users
     • Hierarchical dependencies abound
       • Sentences are composed of words which are composed of phones, co-reference, parse trees, etc.
       • AGs do not explicitly express dependencies
     • Ubiquitous annotation validation
       • Happens at every stage of data manipulation: creation, modification and filtering
       • Syntax checking is only the first step

  6. What We Have Accomplished
     • The core ATLAS annotation ontology
     • Type definition infrastructure
     • Developer framework

  7. The Core ATLAS Ontology (Simple Speech Use Case)
     Task: Annotate sentences, which are composed of words
     [Diagram labels: Annotation; Sentence Annot.; Word Annot.; Children; Content ("She", "had"); Region; Interval Region; Anchor; Offset Anchor; Signal (audio)]
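The speech use case on slide 7 can be sketched in a few lines of Python. This is only an illustration of the ontology's vocabulary (signal, anchor, region, content, annotation, children), not the jATLAS API: every class, field and method name below is hypothetical, and the offsets are made-up values in seconds. Letting the sentence derive its span and text from its word children is one plausible reading of the hierarchy, adopted here as an assumption.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Signal:
    name: str  # e.g. "audio"


@dataclass(frozen=True)
class OffsetAnchor:
    signal: Signal
    offset: float  # position in the signal (seconds, by assumption)


@dataclass(frozen=True)
class IntervalRegion:
    start: OffsetAnchor
    end: OffsetAnchor


@dataclass
class Annotation:
    """Associates some content with a region in a signal (hypothetical model)."""
    type_name: str
    region: Optional[IntervalRegion] = None
    content: Optional[str] = None
    children: List["Annotation"] = field(default_factory=list)

    def span(self) -> Tuple[float, float]:
        # Use the annotation's own region if it has one, else cover its children.
        if self.region is not None:
            return (self.region.start.offset, self.region.end.offset)
        if not self.children:
            raise ValueError(f"{self.type_name} has neither region nor children")
        spans = [child.span() for child in self.children]
        return (min(s for s, _ in spans), max(e for _, e in spans))

    def text(self) -> str:
        # Use the annotation's own content if it has one, else join its children.
        if self.content is not None:
            return self.content
        return " ".join(child.text() for child in self.children)


audio = Signal("audio")


def word(text: str, start: float, end: float) -> Annotation:
    region = IntervalRegion(OffsetAnchor(audio, start), OffsetAnchor(audio, end))
    return Annotation("word", region=region, content=text)


sentence = Annotation("sentence", children=[word("She", 0.0, 0.25), word("had", 0.25, 0.5)])
print(sentence.text())  # She had
print(sentence.span())  # (0.0, 0.5)
```

The point of the sketch is that only the leaf annotations touch the signal directly; the sentence is defined entirely through its children.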

  8. The Core ATLAS Ontology (Simple Gesture Use Case)
     [Diagram labels: Forearm Annotation; Gesture Region; Interval (Frame Anchor, Frame Anchor); 3DSegment (XYZ Anchor, XYZ Anchor)]
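As a rough illustration of the gesture use case on slide 8, the sketch below combines a temporal interval (two frame anchors) with a spatial segment (two XYZ anchors) into one region. The class names echo the slide's labels, but the fields, the `length` helper and all coordinate and frame values are assumptions made for this example; it is not the jATLAS implementation.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class FrameAnchor:
    frame: int  # video frame index


@dataclass(frozen=True)
class XYZAnchor:
    x: float
    y: float
    z: float


@dataclass(frozen=True)
class Interval:
    start: FrameAnchor
    end: FrameAnchor

    def frames(self) -> int:
        return self.end.frame - self.start.frame


@dataclass(frozen=True)
class Segment3D:
    a: XYZAnchor
    b: XYZAnchor

    def length(self) -> float:
        # Euclidean distance between the two spatial anchors.
        return math.sqrt(
            (self.b.x - self.a.x) ** 2
            + (self.b.y - self.a.y) ** 2
            + (self.b.z - self.a.z) ** 2
        )


@dataclass(frozen=True)
class GestureRegion:
    interval: Interval   # when the gesture happens
    segment: Segment3D   # where the forearm is in space


@dataclass(frozen=True)
class Annotation:
    type_name: str
    region: GestureRegion


forearm = Annotation(
    "forearm",
    GestureRegion(
        Interval(FrameAnchor(120), FrameAnchor(150)),
        Segment3D(XYZAnchor(0.0, 0.0, 0.0), XYZAnchor(3.0, 4.0, 0.0)),
    ),
)
print(forearm.region.interval.frames())  # 30
print(forearm.region.segment.length())   # 5.0
```

The same annotation concept carries over from the speech case; only the anchor and region types change with the dimensionality of the signal.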

  9. Type Definition Infrastructure
     • Meta Annotation Infrastructure for ATLAS (MAIA)
     • Provides mechanism for the definition and use of annotations at the semantic level
       • Specifies content, structure and relationships between annotations
       • Sufficiently expressive for validation
     • Users declare their types via XML
       • No coding required
     • Framework generates and uses type constructs from the definition dynamically
       • Validation occurs automatically

  10. Type Definition Excerpt

      <AnnotationType name='sentence'>
        <AllowedChildren containedType='word'>
          <DefinesRegionAs/>
          <DefinesContentAs/>
        </AllowedChildren>
      </AnnotationType>
      <AnnotationType name='word'>
        <RegionType ref='interval'/>
        <ContentType ref='wordContent'/>
      </AnnotationType>
      <ContentType name='wordContent'>
        <ParameterType ref='string' role='text'/>
      </ContentType>

      [Diagram labels: Sentence Annot.; Children; Word Annot.; WordContent; Interval Region]
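To make the idea of validating against declared types more concrete, here is a minimal sketch that reads the excerpt from slide 10 with Python's standard `xml.etree.ElementTree` and checks parent/child structure. It is not MAIA: the wrapping `<Types>` root element, the function names and the tuple representation of an annotation tree are all assumptions introduced so the example can run on its own, and only the `AllowedChildren` rule is checked.

```python
import xml.etree.ElementTree as ET

# The slide's excerpt, wrapped in a hypothetical <Types> root so it parses as one document.
TYPE_DEFS = """
<Types>
  <AnnotationType name='sentence'>
    <AllowedChildren containedType='word'>
      <DefinesRegionAs/>
      <DefinesContentAs/>
    </AllowedChildren>
  </AnnotationType>
  <AnnotationType name='word'>
    <RegionType ref='interval'/>
    <ContentType ref='wordContent'/>
  </AnnotationType>
  <ContentType name='wordContent'>
    <ParameterType ref='string' role='text'/>
  </ContentType>
</Types>
"""


def allowed_children(xml_text):
    """Map each annotation type name to the set of child types it may contain."""
    root = ET.fromstring(xml_text.strip())
    table = {}
    for ann in root.findall("AnnotationType"):
        table[ann.get("name")] = {
            child.get("containedType") for child in ann.findall("AllowedChildren")
        }
    return table


def validate(tree, table):
    """Check a (type_name, [children]) tree; return a list of error messages."""
    type_name, children = tree
    errors = []
    if type_name not in table:
        errors.append(f"unknown annotation type: {type_name}")
    for child in children:
        if child[0] not in table.get(type_name, set()):
            errors.append(f"{type_name} may not contain {child[0]}")
        errors.extend(validate(child, table))
    return errors


table = allowed_children(TYPE_DEFS)
good = ("sentence", [("word", []), ("word", [])])
bad = ("word", [("sentence", [])])
print(validate(good, table))  # []
print(validate(bad, table))   # ['word may not contain sentence']
```

Because the rules come from the XML rather than from code, changing the corpus design would mean editing the type definition only, which is the "no coding required" point made on slide 9.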

  11. Developer Framework: jATLAS, a Java implementation
      • Core suite of objects:
        • Implements ATLAS’ generic annotation ontology
        • Defines an Application Programming Interface (API)
      • Low-level services:
        • Data import/export, management utilities
        • Defines a Service Provider Interface (SPI) to allow advanced framework extensions
          • Additional persistence forms
        • Automatic validation services via MAIA

  12. ATLAS Status
      • Stable ontology
      • Basic typing services via MAIA
      • Developer framework: jATLAS in Beta version
        • Has dramatically reduced development times for NIST prototype applications
      • Persistence format: ATLAS Interchange Format (AIF)
        • ACE format import
        • AG format import partially supported
      • Active development
      • Public domain source code, freely available

  13. ATLAS Future Work
      • MAIA extensions
        • Type inheritance
        • Increased structural validation
        • Content-based validation
      • Framework extensions
        • Currently developing a GUI component framework
      • Tool development
        • Annotation and evaluation tools at NIST
      • Collaboration with other sites
        • Contributed tools repository

  14. More information?
      • http://www.nist.gov/speech/atlas
      • We welcome feedback, comments and suggestions
