
D-square (D-kwadraat)

This presentation describes D-square, a project to create digital databases and tools for Dutch dialect dictionaries. It covers the historical background, the conversion procedures, the new encoding, and end user access to the data.




Presentation Transcript


  1. D-square (D-kwadraat): Digital Databases and Tools for Dutch Dialect Dictionaries. Jos Swanenberg, Folkert de Vriend & Roeland van Hout

  2. Topics • Historical background • Overview of project phases • Conversion procedures • New encoding for data • End user access to the data

  3. Macro structure WBD & WLD Volumes • Agricultural terminology • Other technical or craft terminologies • Common vocabulary

  4. Micro structure WBD & WLD Constituents: • Lexical meaning (title, description of the concept) • Lexical form (‘Dutchified’ entry) • Phonetic form • Sources • Geographical code (+ map)

  5. WBD & WLD Example of WLD, volume 1:

  6. History of automation • 1960-1980: Filing cards • 1985-1995: Word processor, Genoveva • 1995-2007: Databases + word processor • 2002: Online database WBD • 2003-2007: D-square

  7. WBD & WLD Filing cards:

  8. WBD & WLD Example of WLD, volume 1:

  9. Online database WBD www.ru.nl/dialect

  10. Example from database: “Meikever” (Eng: “maybug”)

  11. Example of WBD, volume 3

  12. Online database, query

  13. Online database, query result

  14. [Diagram: data flow from raw data to end products. Labels recovered from the slide: • Raw data: questionnaires Nijmegen and Leuven; questionnaires (chiefly) Meertens • Edited data, Vol. I+II and Vol. III: MacWrite, MS-Word, FileM Pro, filing cards • Edited data XML • Enriched data XML • End products: Online DB WBD (Polderland); Website WBD/WLD with tools for searching and cartography; Specialized print editions (dialect atlas or local dictionary); SGV on CD (Polderland)]

  15. Overview phases D-square • Conversion to a new format • End user access to data • Enrichment of data • Data management

  16. Phase 1: Conversion to a new format

  17. Reasoning behind new encoding • XML, not relational database • Tailored to WBD and WLD • Flexible enough to be used for other dialect dictionaries • Based on standard: LMF (ISO TC 37/SC 4)

  18. Example from WBD, meikever

  19. Example from database: “Meikever” (Eng: “maybug”)

  20. Example XML-encoding
      <LEXICON dialect="Brabants">
        <ENTRY>
          <META> … </META>
          <CONCEPT lang="dutch" ontol_id="492">Meikever</CONCEPT>
          <DATA>
            <VARIANT type="heteronym">Bakkertje
              <VARIANT type="lexical">bakkerke
                <VARIANT type="raw" import="diplomatic">bakkərkə
                  <LOCATION source1="N83">K 178</LOCATION>
                </VARIANT>
              </VARIANT>
            </VARIANT>
          </DATA>
        </ENTRY>
        …
      </LEXICON>
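
The nesting in the example on slide 20 (heteronym, then lexical form, then raw phonetic form, then location) can be read back into flat records with a few lines of code. The following is a minimal sketch, not the project's own tooling: it assumes straight quotes around attribute values, replaces the elided META content with an empty element, and uses a hypothetical helper name `flatten_lexicon`.

```python
import xml.etree.ElementTree as ET

# Sample adapted from slide 20. Assumptions: straight quotes, and an
# empty <META/> standing in for the content elided on the slide.
SAMPLE = """<LEXICON dialect="Brabants">
  <ENTRY>
    <META/>
    <CONCEPT lang="dutch" ontol_id="492">Meikever</CONCEPT>
    <DATA>
      <VARIANT type="heteronym">Bakkertje
        <VARIANT type="lexical">bakkerke
          <VARIANT type="raw" import="diplomatic">bakkərkə
            <LOCATION source1="N83">K 178</LOCATION>
          </VARIANT>
        </VARIANT>
      </VARIANT>
    </DATA>
  </ENTRY>
</LEXICON>"""


def own_text(element):
    """Return the text directly inside an element, before any child element."""
    return (element.text or "").strip()


def flatten_lexicon(xml_text):
    """Turn nested VARIANT elements into one flat record per LOCATION.

    Hypothetical helper: it relies only on the three nesting levels
    visible in the slide (heteronym > lexical > raw).
    """
    root = ET.fromstring(xml_text)
    rows = []
    for entry in root.iter("ENTRY"):
        concept = (entry.findtext("CONCEPT") or "").strip()
        for heteronym in entry.findall("DATA/VARIANT"):
            for lexical in heteronym.findall("VARIANT"):
                for raw in lexical.findall("VARIANT"):
                    for location in raw.findall("LOCATION"):
                        rows.append({
                            "dialect": root.get("dialect"),
                            "concept": concept,
                            "heteronym": own_text(heteronym),
                            "lexical": own_text(lexical),
                            "raw": own_text(raw),
                            "source": location.get("source1"),
                            "location": own_text(location),
                        })
    return rows


if __name__ == "__main__":
    for row in flatten_lexicon(SAMPLE):
        print(row)
```

A flat record per location is one possible starting point for the format conversions and cartographic tools listed later in the presentation. The record layout itself is an assumption.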

  21. Example from WALD

  22. Example from dictionary of the dialects of Zeeland

  23. Phase 2: End user access to data

  24. Small-scale survey • Tools: search engine, cartographic tool, format conversions • Enrichment: POS, morphemes (syllables) • Links to other resources: other dictionaries, questionnaires, FAND, MAND

  25. Difficulties to overcome • Search engine: getting from question to query (coaching needed), is SmartMatch (fuzzy matching) helpful in this regard?; speed of XML searching • Cartography: availability of base maps • Links to other resources: differences in interpretation
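
To illustrate what fuzzy matching could contribute to getting from question to query, here is a minimal sketch. The slides do not describe how SmartMatch works, so the sketch uses `difflib` from the Python standard library as a stand-in. The function name `suggest`, the tiny word list (taken from the "Meikever" example) and the cutoff value are all assumptions.

```python
import difflib

# Word forms taken from the "Meikever" example in the slides.
# A real index would hold every concept and lexical form in the database.
KNOWN_FORMS = ["Meikever", "Bakkertje", "bakkerke"]


def suggest(query, known_forms=KNOWN_FORMS, limit=3, cutoff=0.6):
    """Return known forms that resemble the user's query, best match first.

    Matching is case-insensitive. The cutoff of 0.6 is difflib's default
    and is an assumption here, not a tuned value.
    """
    by_lowercase = {form.lower(): form for form in known_forms}
    matches = difflib.get_close_matches(
        query.lower(), list(by_lowercase), n=limit, cutoff=cutoff
    )
    return [by_lowercase[match] for match in matches]


if __name__ == "__main__":
    # A user who misspells the standard Dutch headword still finds the entry.
    print(suggest("meikefer"))
    # A spelling that sits between two recorded forms returns both.
    print(suggest("bakkerkje"))
```

Similarity on spelling alone does not address the differences in interpretation mentioned for links to other resources. It only helps when the user's spelling is close to a recorded form.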

  26. Information about D-square www.ru.nl/dialect

  27. Questions?
