
Making an Electronic Text

Making an Electronic Text. An exercise in preservation and applied technology. Charles Hindley’s Curiosities of Street Literature, published in 1871 with only 456 copies printed, is a collection of broadsides, ballads, and popular stories from Dickensian London.

georgina


Presentation Transcript


  1. Making an Electronic Text • An exercise in preservation and applied technology

  2. Charles Hindley’s Curiosities of Street Literature • Published in 1871 • Only 456 copies printed • A collection of broadsides, ballads, and popular stories from Dickensian London

  3. What we are doing • Using high-quality scanned images and OCR software, we have created text documents from the page scans • Using XML, we can then mark up the documents for display on the web • We are following a defined standard for electronic texts: the TEI, or Text Encoding Initiative

  4. Text Encoding Initiative • The standard is maintained by the TEI Consortium, whose original host institutions were the University of Oxford, Brown University, the University of Bergen, and the University of Virginia • The TEI guidelines were formulated to facilitate interchange between individuals and groups using different programs and computer systems over a broad range of applications

  5. To make the TEI-defined documents as accessible as possible, a cross-platform mark-up language was chosen • A mark-up language can be as simple as HTML (HyperText Mark-up Language) • As complex as LaTeX • As user-definable as XML (eXtensible Mark-up Language)

  6. XML: why it’s good for you • eXtensible Mark-up Language • Chosen by the TEI for its cross-platform, multi-application capabilities • The user defines the mark-up in XML • Documents can be tagged with custom elements and searched by those tags
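To illustrate tagging and searching by tag, here is a minimal sketch using Python's standard xml.etree.ElementTree module. The element names (collection, ballad, broadside) and the sample titles are invented for this example; they are not taken from the project's files or from the TEI guidelines.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: element names and titles are invented for
# illustration and do not come from the project or the TEI guidelines.
SAMPLE = """
<collection>
  <ballad id="b1">
    <title>The Example Ballad</title>
    <line>First line of verse</line>
    <line>Second line of verse</line>
  </ballad>
  <broadside id="s1">
    <title>An Example Broadside</title>
  </broadside>
</collection>
"""

def titles_by_tag(xml_text, tag):
    """Return the <title> text of every element with the given tag name."""
    root = ET.fromstring(xml_text)
    return [el.findtext("title") for el in root.iter(tag)]

# Search the document by a user-defined tag.
print(titles_by_tag(SAMPLE, "ballad"))
```

Because the tag names are defined by whoever encodes the text, the same search function works for any category the encoder chooses to mark.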

  7. The Images • Each scanned image is saved as a 40-megabyte uncompressed TIFF • Using OCR (optical character recognition) software, we are able to extract and preserve the text
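A rough calculation shows why uncompressed scans reach tens of megabytes. The slides do not state the scan area, resolution, or colour depth, so the values below (8 x 12 inches, 400 dpi, 24-bit colour) are assumptions chosen only to illustrate the arithmetic.

```python
def uncompressed_size_bytes(width_in, height_in, dpi, bytes_per_pixel):
    """Raw pixel-data size of a scan, ignoring the small TIFF header."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bytes_per_pixel

# Assumed values, not from the slides: an 8 x 12 inch scan area
# at 400 dpi in 24-bit colour (3 bytes per pixel).
size = uncompressed_size_bytes(8, 12, 400, 3)
print(f"{size / 1_000_000:.1f} MB")
```

Size grows with the square of the resolution, so doubling the dpi quadruples the file.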

  8. The Text • Once the image has been OCR’ed, a text document is created • These text documents can then be marked up in XML • Mark-up can be done in software or manually
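As a sketch of what marking up in software might look like, the following Python function wraps plain OCR output in a small XML fragment. The div, head, lg (line group), and l (verse line) elements are standard TEI elements, but the function name, the blank-line stanza rule, and the sample text are assumptions for illustration, not the project's actual process. A complete TEI document would also need a header and enclosing structure that are omitted here.

```python
import xml.etree.ElementTree as ET

def mark_up(ocr_text, title):
    """Wrap plain OCR output in a small TEI-style fragment.

    Assumption: an empty line separates stanzas, and every
    non-blank line within a stanza is one line of verse.
    """
    div = ET.Element("div", {"type": "ballad"})
    ET.SubElement(div, "head").text = title
    for stanza in ocr_text.strip().split("\n\n"):
        lg = ET.SubElement(div, "lg")          # one line group per stanza
        for line in stanza.splitlines():
            if line.strip():
                ET.SubElement(lg, "l").text = line.strip()
    return div

# Invented sample text standing in for OCR output.
sample = "First line\nSecond line\n\nThird line\nFourth line\n"
fragment = mark_up(sample, "Example Ballad")
print(ET.tostring(fragment, encoding="unicode"))
```

Automated mark-up of this kind handles regular structure well; irregular layouts and OCR errors are where manual mark-up and proofreading still come in.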
