AnCoraPipe: A tool for multilevel annotation

Manu Bertran, Bàrbara Soriano, Oriol Borrega, Marta Recasens. Universitat de Barcelona. CBA 2008.


Presentation Transcript


  1. AnCoraPipe: A tool for multilevel annotation. Manu Bertran, Bàrbara Soriano, Oriol Borrega, Marta Recasens. Universitat de Barcelona. CBA 2008.

  2. Contents • Data format • Annotation interface • Installation • Description • Future improvements

  3. Data format • Data are stored as UTF-8 encoded XML. • Design principles: • Reduced inventory of node names. • Attributes are atomic. • Attributes describe only the node they are attached to. • There is no redundancy in the data. • Adding new annotation levels/values is fast and easy. • Annotation time has been reduced by 50%.
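As an illustration of these design principles, here is a minimal sketch in Python. The element names (`sentence`, `sn`, `n`, `v`), the attribute names (`wd`, `lem`, `pos`, `sense`), the values, and the helper `annotate` are all assumptions made for this example; they are not taken from the actual AnCora schema or from AnCoraPipe itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: node and attribute names are illustrative only.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<sentence>
  <sn>
    <n wd="niña" lem="niño" pos="noun"/>
  </sn>
  <v wd="canta" lem="cantar" pos="verb"/>
</sentence>
"""

# Parse from UTF-8 bytes, matching the encoding declared in the file.
root = ET.fromstring(SAMPLE.encode("utf-8"))

# Each attribute holds one atomic value and describes only its own node.
words = [(node.get("wd"), node.get("pos")) for node in root.iter() if node.get("wd")]


def annotate(node, level, value):
    """Add a new annotation level to a node by setting a single attribute."""
    node.set(level, value)


verb = root.find("v")
annotate(verb, "sense", "cantar.1")  # hypothetical sense label
```

Under these assumptions, a new annotation level touches only the node it describes and leaves the rest of the tree unchanged, which is one way to read the "fast and easy" principle.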

  4. Annotation interface • Installation requirements: • Java 1.5 or higher. • SWT Java graphic library (included in our package for Windows XP). • Otherwise, the graphic library can be obtained with the free Eclipse package.
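The Java requirement can be checked with a short script. The sketch below assumes the version string has already been obtained (for example from the output of `java -version`); the function name `meets_java_requirement` is hypothetical and is not part of AnCoraPipe.

```python
import re


def meets_java_requirement(version, minimum=5):
    """Return True if a Java version string denotes release `minimum` or later."""
    parts = [int(p) for p in re.findall(r"\d+", version)]
    if not parts:
        raise ValueError(f"unrecognised version string: {version!r}")
    # Legacy scheme: "1.5.0_22" means Java 5; newer scheme: "11.0.2" means Java 11.
    major = parts[1] if parts[0] == 1 and len(parts) > 1 else parts[0]
    return major >= minimum
```

The function handles both the legacy `1.x` numbering, under which Java 1.5 is release 5, and the later scheme in which the first number is the release itself.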

  5. Annotation interface • Description • The interface is organized as a series of screens, each showing the data specific to one annotation level. • To make the annotator’s work easier, the interface highlights all nodes that can be annotated and all sentences that have not yet been marked.
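The highlighting behaviour described above can be sketched as a query over the XML tree. This is only an assumption about how such a check might work, not the tool's actual implementation: the element and attribute names, the `sense` level, and the functions `annotatable_nodes` and `pending_sentences` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical corpus fragment: the second sentence has an unannotated verb.
SAMPLE = """<corpus>
  <sentence id="s1">
    <n wd="niña" sense="niña.1"/>
    <v wd="canta" sense="cantar.1"/>
  </sentence>
  <sentence id="s2">
    <n wd="perro" sense="perro.1"/>
    <v wd="ladra"/>
  </sentence>
</corpus>"""


def annotatable_nodes(sentence, tags):
    """Return the nodes of a sentence that can carry the annotation."""
    return [node for node in sentence.iter() if node.tag in tags]


def pending_sentences(root, level, tags):
    """Return ids of sentences with at least one annotatable node lacking `level`."""
    pending = []
    for sentence in root.iter("sentence"):
        nodes = annotatable_nodes(sentence, tags)
        if any(node.get(level) is None for node in nodes):
            pending.append(sentence.get("id"))
    return pending


root = ET.fromstring(SAMPLE)
todo = pending_sentences(root, level="sense", tags={"n", "v"})
```

Here `todo` lists the sentences an interface would flag as not yet marked for the chosen level, while `annotatable_nodes` corresponds to the nodes it would highlight.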

  6. Annotation interface • The system allows for the addition of external tools for specific annotation levels: • WordNet • Coreference

  7. Future improvements • Making the tool available over the Internet and adapting it to Linux and Mac environments. • Implementing corpus query methods from the interface. • Implementing statistical corpus description methods. • Adding tools to handle verbal and nominal lexicons. • Adding semiautomatic methods and machine learning functions for the partial annotation of corpora.

  8. AnCora • http://clic.ub.edu/ • http://clic.ub.edu/ancora/ • http://clic.ub.edu/mbertran/tbfeditor/
