The LANCHART Project combines a set of data formats with a search engine for transcribed speech recordings digitized as WAV files. Transcriptions are made in Transcriber (trs-files) and analytical coding is done in Praat (TextGrid files); both are converted automatically and imported into a MySQL database. The search engine is offered as a web service and can query across transcription and coding tiers. The database is updated every night, and conversions go through an XML-based 'superformat', so support for a new format can be added by writing a conversion program.
Data formats in the LANCHART Project
• Recording (digitalization): wav-file
• Transcription: Transcriber, wav-file & trs-file
• Analytical coding: Praat, wav-file & TextGrid
• Searching & counting: the LANCHART search engine, MySQL database
• Between the stages: automatic conversion, automatic import
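The Praat TextGrid
[Figure: a TextGrid with tiers for two participants. Diagram labels: participant, tier name, tier, tier-group, host. Tiers shown: ortografi (AMF) with the interval "hej med dig" (roughly "hi there"); events (AMF); ortografi (XJM) with the interval "hejsa" (roughly "hiya"); events (XJM).]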
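The tier labels on this slide appear to follow the pattern "tier name (participant)", with the tiers of one participant forming a tier group. A minimal sketch of splitting such labels and grouping them, assuming that pattern holds; the function names and the regular expression are illustrative and not taken from the LANCHART code:

```python
import re

# Assumed naming convention, read off the slide: "<tier name> (<participant>)".
TIER_NAME = re.compile(r"^\s*(?P<tier>.+?)\s*\((?P<participant>[^()]+)\)\s*$")


def split_tier_name(name):
    """Return (tier, participant); participant is None if there is no suffix."""
    match = TIER_NAME.match(name)
    if match is None:
        return name.strip(), None
    return match.group("tier"), match.group("participant").strip()


def group_by_participant(names):
    """Collect tier names into one group per participant (hypothetical helper)."""
    groups = {}
    for name in names:
        tier, participant = split_tier_name(name)
        groups.setdefault(participant, []).append(tier)
    return groups
```

Called on the four tier labels from the slide, `group_by_participant` would yield one group for AMF and one for XJM, each holding `ortografi` and `events`.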
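What a basic search engine does
[Figure: a search within a single tier. Tiers shown: ortografi (AMF) with the utterance "sådan noget man kan når det er ens farmors" (roughly "that kind of thing you can do when it is your grandmother's"); grammatik (AMF) with the codes G AS DS SB RH and R AN.]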
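The job for the LANCHART search engine
[Figure: a search across several tiers, marked "match" and "overlapping match". Tiers shown: ortografi (AMF) with the utterance "kunne du ligge og dø hvor ingen opdagede det" (roughly "you could lie there and die where nobody noticed"); grammatik (AMF) with the codes G AS DS SA RJ; ordstil (AMF) with the codes L FAO OB; and the common tier genre with the code Ggr.]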
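The "overlapping match" on this slide can be read as: find an interval on one tier whose time span overlaps intervals carrying the wanted codes on other tiers. A minimal in-memory sketch of that idea, assuming each annotation is a (tier, start, end, label) record; the `Interval` class, `find_matches`, and any time stamps used with them are hypothetical, and the real engine runs its queries in MySQL rather than in Python:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    tier: str      # e.g. "ortografi (AMF)"
    start: float   # seconds
    end: float     # seconds
    label: str     # word or analytical code


def overlaps(a, b):
    """True if the two intervals share some stretch of time."""
    return a.start < b.end and b.start < a.end


def find_matches(intervals, anchor, others):
    """Return intervals matching `anchor` that overlap a match for each of `others`.

    `anchor` and every entry of `others` are (tier, label) pairs.
    """
    anchor_tier, anchor_label = anchor
    hits = []
    for cand in intervals:
        if cand.tier != anchor_tier or cand.label != anchor_label:
            continue
        if all(
            any(o.tier == tier and o.label == label and overlaps(cand, o)
                for o in intervals)
            for tier, label in others
        ):
            hits.append(cand)
    return hits
```

A query such as "the word du, while the grammatik tier says DS and the genre tier says Ggr" would then be `find_matches(data, ("ortografi (AMF)", "du"), [("grammatik (AMF)", "DS"), ("genre", "Ggr")])`. The scan is quadratic in the number of intervals, which is fine for illustration but is one reason to leave the real work to a database.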
The LANCHART search engine
• A WebService
• JSP / Servlets + front-end JavaScript
• http://dgcssintranet/search.jsp
• A Database Engine, MySQL
• Highly normalized to eliminate redundancy
• Updated every night from Korpus
Support for multiple transcription & analysis formats
• Conversions are done using an XML-based 'superformat', so that new formats can be added by creating conversion programmes
Support for multiple transcription & analysis formats
• The 'superformat' is XML-based, allowing XSL Transformations for conversion
• Programmed in Java for portability
[Figure: conversions between the Superformat and CLAN/.Chat, Praat/.TextGrid, Transcriber/.trs, and other formats.]
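The point of a hub format is that each transcription format needs only one reader into the superformat and one writer out of it, instead of a separate converter for every pair of formats. A rough sketch of that round trip follows. It is written in Python with the standard `xml.etree.ElementTree` module rather than in Java with XSL Transformations as the slides describe, and the element and attribute names (`superformat`, `tier`, `interval`, `start`, `end`) are invented for illustration, since the slides do not show the actual schema:

```python
import xml.etree.ElementTree as ET


def to_superformat(tiers):
    """Serialize {tier name: [(start, end, text), ...]} to a hypothetical XML hub format."""
    root = ET.Element("superformat")
    for name, intervals in tiers.items():
        tier_el = ET.SubElement(root, "tier", name=name)
        for start, end, text in intervals:
            interval_el = ET.SubElement(
                tier_el, "interval", start=repr(start), end=repr(end)
            )
            interval_el.text = text
    return ET.tostring(root, encoding="unicode")


def from_superformat(xml_text):
    """Parse the hypothetical hub format back into {tier name: [(start, end, text), ...]}."""
    root = ET.fromstring(xml_text)
    return {
        tier_el.get("name"): [
            (float(iv.get("start")), float(iv.get("end")), iv.text or "")
            for iv in tier_el.findall("interval")
        ]
        for tier_el in root.findall("tier")
    }
```

Under this design, adding a new format would mean writing one function that produces this tier dictionary from the new format and one that writes it back out; the existing converters stay untouched.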