Zanichelli XML-based Dictionaries Editing System


  1. Zanichelli XML-based Dictionaries Editing System. Daniele Fusi

  2. 1 - System Requirements: multiple presentations, legacy content, operating environment

  3. One content, multiple presentations: the same data feeds CD-ROM / DVD, web sites or services, e-books and paper books

  4. Existing environment: requirements
     authors • accustomed to WYSIWYG editing in word processors • no technical training
     editors • content validation and standardization • text-based tools • simple content structure
     IT point of view • text as a database • query and interactivity • multiple media and forms
     designers • DTP pagination • flattened structure • import / export

  5. Existing content: conversion from word processor documents and 3rd party formats

  6. Digital format requirements • text-based storage, both machine- and user-readable • using standard technologies (portable & durable) • open to expansion and customization • easy to manipulate • easy to transform for import/export • focused on semantics: content rather than its presentation

  7. Content and semantics: dictionary lemma: στόμα ...μα

  8. Marking semantics in text: ‘fields’ • lemma • morphology • etymon • translation • sample • work • etc ...

  9. Semantic markup: applications
     • lemma: alphabetical lemmata list, normal or inverted
     • morphology: list of lemmata grouped by grammatical category
     • etymon: list of lemmata grouped by etymon (roots dictionary)
     • translation: rudimentary bidirectional dictionary
     • sample: look for quotation
     • work: list of quoted works and authors
     • etc ...
     • lemma, morphology, etymon, work, etc ... combined: complex searches
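
Two of the applications listed on this slide (the inverted lemmata list and the list grouped by grammatical category) can be sketched in a few lines once each lemma is available as a set of named fields. This is only an illustration: the three entries, the dict layout and the function names are assumptions, not part of the presented system.

```python
from collections import defaultdict

# Hypothetical entries: each lemma is reduced to a dict of field values.
entries = [
    {"lemma": "dizionario", "grammar": "s.m."},
    {"lemma": "lista", "grammar": "s.f."},
    {"lemma": "lemma", "grammar": "s.m."},
]

def inverted_list(entries):
    """Return lemmata sorted by their reversed spelling (an inverted list)."""
    return sorted((e["lemma"] for e in entries), key=lambda w: w[::-1])

def group_by_field(entries, field):
    """Map each value of `field` to the alphabetical list of lemmata carrying it."""
    groups = defaultdict(list)
    for e in entries:
        groups[e[field]].append(e["lemma"])
    return {value: sorted(lemmata) for value, lemmata in groups.items()}

print(inverted_list(entries))
print(group_by_field(entries, "grammar"))
```

The same grouping function would serve the etymon and work lists, since only the field name changes.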

  10. 2 – Solution overview: XML-based implementation

  11. Implementation: XML
      XML • Unicode text files • widely used standard • built for openness and transformation (XSLT) • representation of any kind of data, independently of its presentation • hierarchical model
      Dictionary • fits a hierarchical model well: letter, lemma, fields • typically stored as text for existing works
      Hierarchy: dictionary → letter → lemma → field

  12. Sample: lemma and fields
      • lemma = dizionário • date = 1965 • grammar = s.m. • translation 1 = complesso dei lemmi di un dizionario e sim. • separator • translation 2 = lista dei lemmi
      Rendered: dizionário [1965] s.m. complesso dei lemmi di un dizionario e sim. • lista dei lemmi
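
As a rough illustration of how the sample lemma on this slide might look once stored as XML, the sketch below builds it with Python's standard `xml.etree.ElementTree`. The element name `field`, the attribute `type` and the separator text "•" are assumptions made for the example; the slides do not show the actual schema.

```python
import xml.etree.ElementTree as ET

def build_lemma(fields):
    """Build a <lemma> element from an ordered list of (field name, text) pairs."""
    lemma = ET.Element("lemma")
    for name, text in fields:
        # Hypothetical markup: one <field type="..."> child per field.
        field = ET.SubElement(lemma, "field", {"type": name})
        field.text = text
    return lemma

fields = [
    ("lemma", "dizionário"),
    ("date", "1965"),
    ("grammar", "s.m."),
    ("translation", "complesso dei lemmi di un dizionario e sim."),
    ("separator", "•"),
    ("translation", "lista dei lemmi"),
]

lemma = build_lemma(fields)
xml_text = ET.tostring(lemma, encoding="unicode")
print(xml_text)
```

Keeping the fields as a flat, ordered sequence mirrors the shallow hierarchy described in the slides: the order of the children carries the structure, not additional nesting.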

  13. Hierarchical structure: dictionary → letter (letters...) → lemma (lemmata...) → fields: lemma, date, grammar, translation, separator, translation

  14. Minimalist structure: flat, yet extensible • smallest depth satisfies practical requirements • fields vary at will according to the dictionary language and type • variability of fields compensates for relatively flat hierarchy

  15. Structure and compromises
      Focus on semantics • fields define lemma parts: etymon, translation, grammar, samples, ... • formatting is automatically derived from semantic structure (lemma = bold, grammar = italic, author = smallcaps, ...)
      Practical devices • text escapes define specific formatting for portions of field values, whenever they are not considered semantically relevant
      Example: “I came by cab” is 1 field (sample) in the lemma: the hierarchy need not be deeper, yet it allows emphasis on “by”
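
The two ideas on this slide (formatting derived from the field type, plus an inline escape for non-semantic emphasis) can be sketched as follows. The style table follows the slide (lemma = bold, grammar = italic, author = smallcaps); the HTML output and the `_..._` escape syntax are assumptions chosen for the example, since the slides do not show the real escape notation.

```python
import html
import re

# Field type -> (opening tag, closing tag); the mapping follows the slide,
# the HTML tags themselves are an assumption of this sketch.
STYLE_TAGS = {
    "lemma": ("<b>", "</b>"),
    "grammar": ("<i>", "</i>"),
    "author": ('<span style="font-variant: small-caps">', "</span>"),
}

# Hypothetical text escape: _text_ marks emphasis inside a field value.
ESCAPE = re.compile(r"_(.+?)_")

def render_field(name, value):
    """Render one field: escape markup characters, expand escapes, apply the field style."""
    text = html.escape(value)
    text = ESCAPE.sub(r"<em>\1</em>", text)
    open_tag, close_tag = STYLE_TAGS.get(name, ("", ""))
    return f"{open_tag}{text}{close_tag}"

print(render_field("lemma", "cab"))
print(render_field("sample", "I came _by_ cab"))
```

The sample stays a single field; only its rendering changes, which is the compromise the slide describes.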

  16. Storage: data • XML files: one file per letter • each dictionary has its own alphabet and sorting scheme • lemmata: automatically inserted in the proper file and at the proper position according to their content • lemma ID overriding for special sorting
      XML files (letters), lemma → ID:
      • à côté (du) → acote
      • ABS → abiesse
      • 10 minutes → tenminutes
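
A minimal sketch of the automatic placement described on this slide, assuming a plain Latin alphabet: a sort ID is derived from the lemma text (or overridden by hand), its first letter selects the file, and the lemma is inserted at its sorted position. The derivation rule and the in-memory `letters` dict are assumptions; the three examples are the ones on the slide.

```python
import bisect
import re
import unicodedata

def default_sort_id(lemma):
    """Derive a sort ID: drop parenthesized parts, diacritics and non-letters."""
    text = re.sub(r"\([^)]*\)", "", lemma)
    decomposed = unicodedata.normalize("NFD", text).lower()
    return "".join(ch for ch in decomposed if "a" <= ch <= "z")

def insert_lemma(letters, lemma, sort_id=None):
    """Insert a lemma into the per-letter list chosen by its (possibly overridden) ID."""
    key = sort_id or default_sort_id(lemma)
    if not key:
        raise ValueError(f"no sort ID can be derived for {lemma!r}")
    entries = letters.setdefault(key[0], [])  # one list stands in for one letter file
    bisect.insort(entries, (key, lemma))
    return key

letters = {}
insert_lemma(letters, "à côté (du)")            # derived ID: acote
insert_lemma(letters, "ABS", "abiesse")         # overridden: sorted as it is pronounced
insert_lemma(letters, "10 minutes", "tenminutes")
print(letters)
```

Without the override, "10 minutes" would be filed under "m" by this rule, which is the kind of case the ID override exists for.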

  17. Storage: metadata • self-descriptive dictionary: additional XML files define: • fields list and types within each dictionary • alphabet and sort order for each dictionary, including diacritics sensitivity • other dictionary-specific support resources (e.g. frequently typed symbols, preview styles)
      Sample fields list: prelemma, etymon, abbreviation, phonetics, translation, variant, grammar, category (A, B, C...), section (1, 2, 3...), separator (– ∎ ∙ ∘ ⋆...), ...
      Sample alphabet (Croatian): a b c č ć d dž đ e f g h i j k l lj m n nj o p r s š t u v z ž
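
To show why the alphabet has to be part of the dictionary metadata, the sketch below sorts a few words by the Croatian alphabet given on this slide, where the digraphs dž, lj and nj count as single letters. The greedy two-character matching and the sample words are assumptions of the example; real data would also need a way to mark letter pairs that are not digraphs.

```python
# Croatian alphabet as listed in the dictionary metadata (digraphs are single letters).
CROATIAN = [
    "a", "b", "c", "č", "ć", "d", "dž", "đ", "e", "f", "g", "h", "i", "j", "k",
    "l", "lj", "m", "n", "nj", "o", "p", "r", "s", "š", "t", "u", "v", "z", "ž",
]
RANK = {letter: index for index, letter in enumerate(CROATIAN)}

def sort_key(word):
    """Translate a word into a list of alphabet ranks, matching digraphs first."""
    word = word.lower()
    key = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        if pair in RANK:            # digraph such as "lj" or "dž"
            key.append(RANK[pair])
            i += 2
        elif word[i] in RANK:       # single letter
            key.append(RANK[word[i]])
            i += 1
        else:                       # character outside the alphabet: ignored
            i += 1
    return key

words = ["ljeto", "luk", "čaj", "cesta", "džep", "dan", "đak"]
print(sorted(words, key=sort_key))
```

A plain code-point sort would place "ljeto" before "luk" and push "čaj" and "đak" after every unaccented word, which is why each dictionary carries its own alphabet definition.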

  18. 3 – Editing: authors

  19. Visual Editing • visual UI: • authors build lemmata visually by blocks, and are shielded from the underlying XML code • XML code integrity is guaranteed by the software • typographical preview is provided for authors accustomed to WYSIWYG
      Diagram: XML data (file = letter; letter, lemmata, fields) and XML metadata

  20. Editing software: editing by blocks • letter selector • lemmata list • visual editing: fields in lemma • typographical preview

  21. Editing in distributed scenarios: web-based visual editing

  22. Web: distributed scenario • dictionaries are stored centrally on a web server • an ASP.NET web site manages access and versioning for different authors and works • visual editing is implemented as a Silverlight RIA, running on the author's own computer, yet inside a web page: • desktop-class responsiveness for the application • true platform independence (Mac / PC, IE / Mozilla / Safari) • no need for software distribution and installation • centralized software maintenance

  23. Distributed editing • SQL database for managing access • XML • ASP.NET server application manages users and work versions • Silverlight application runs on the client computer for visual editing • users: author, specialized author, editor

  24. Visual editing in your web browser • letter selector • lemmata list • visual editing: fields in lemma • typographical preview

  25. 4 - Revision: editors

  26. Content revisions and transformations (editors) • merging different versions (multiple-author scenarios) • validation and standardization • DTP pagination for printing

  27. Automated revision and correction: test selection, test description, results

  28. 5 - Publication: editors

  29. One content, multiple outputs • web sites • mobile devices (Mobipocket) • print • CD / DVD

  30. Extending the model. Sample: RTL languages and root-based dictionaries

  31. Arabic-Italian dictionary • clashing RTL/LTR text flows • special alphabetical order: • several letters share the same rank • different sorting according to level • root-based dictionary: • letter • root • lemma • field • the structure of existing dictionaries must be kept unchanged even if a deeper hierarchy would be required
      Ranks: 1 = +0621 ... +0627; 2 = +0628; 3 = +062A; ...
      Roots are sorted according to a predefined scheme; lemmata within roots are arbitrarily sorted by authors

  32. Hierarchy depths
      • other dictionaries: letter → item = lemma = set of fields
      • Arabic (roots): letter → item = root = set of fields, some delimiting lemmata boundaries

  33. Deeper hierarchy illusion: special editor (Arabic-X editor) • XML structure unchanged: each file is a letter containing items, each item contains fields • items are roots, not lemmata • a special field defines lemmata boundaries within each root • user sees letters, roots, lemmata in root, fields in lemmata; XML structure remains letter-items-fields
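
The "deeper hierarchy illusion" can be sketched as a grouping pass over the flat field list of a root item: every occurrence of the boundary field opens a new lemma, and the editor shows the resulting groups as a nested level. The field names (`root`, `lemma`, `translation`) and the transliterated sample values are assumptions of this example; the slides do not name the actual boundary field.

```python
def split_lemmata(fields, boundary="lemma"):
    """Group a root's flat (name, value) fields into lemmata.

    Fields before the first boundary belong to the root itself; each
    boundary field starts a new lemma that owns the fields following it.
    """
    root_fields, lemmata = [], []
    for name, value in fields:
        if name == boundary:
            lemmata.append([(name, value)])
        elif lemmata:
            lemmata[-1].append((name, value))
        else:
            root_fields.append((name, value))
    return root_fields, lemmata

# Hypothetical root item, stored flat exactly like a lemma in other dictionaries.
item = [
    ("root", "ktb"),
    ("lemma", "kataba"),
    ("translation", "scrivere"),
    ("lemma", "kitāb"),
    ("translation", "libro"),
]

root_fields, lemmata = split_lemmata(item)
print(root_fields)
print(lemmata)
```

The stored XML keeps its letter-items-fields shape; only the editor view applies this grouping, so the other editorial tools need no change.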

  34. Specialized editor: Arabic • letter selector • roots in letter • lemmata in root • visual editing: fields in lemma • typographical preview, bidirectional flows

  35. Arabic editor trick: advantages • user experience is almost unchanged (there are 2 lists instead of 1 to choose from for editing: roots and lemmata) • XML structure unchanged: all the other editorial processes require no change, so the new dictionary fits into them easily • fields variability (already responsible for structure expandability) makes this trick possible • one model, several views

  36. Daniele Fusi http://www.fusisoft.it daniele.fusi@uniroma1.it
