In this presentation, David Lewis of Clowes Information addresses the often-overlooked printing industry as a vital, albeit Cinderella-like, player in the publishing realm. He explores the evolution of structured document creation from typesetting to XML databases, the challenges authors face in maintaining document structure, and the complexities of converting legacy data. Emphasizing the critical role printers play in bridging the gap between traditional methods and modern technology, he underscores the necessity of embracing structured editing and the printer's pivotal contribution to managing transitions in publishing.
XML in Publishing: The Printer’s Perspective • A Presentation by David Lewis of Clowes Information
Introduction • Printing is seen as a Cinderella Industry - but the book remains a main source of revenue • Printers are not Luddites – they do have reason to be cynics
The Early Days of Structured Document Creation • Typesetters have always wanted to code documents • Early SGML projects originated by typesetters • Typesetters were obsessed with keystrokes • contributed to tag minimization • Always normalise SGML before use • Sequence used to be Typeset then convert to SGML • It has now become Create SGML then typeset
The Loss of Structure • Authors originate in MS Word but do not add structure • The move to new media publishing given to IT dept • Cannot go from legacy data to structured data via HTML • DTP made everyone an expert – structure disappeared
Publishing Databases • The idea of typesetting from databases is not new • Supplied as fielded or separated lists. • Not easy to change this interface • New databases have emulated old interfaces • Relational model is not ideal for free text • often sets of Word files managed by RDBMS • Is it time to do it properly – use an XML database? • Hardest task of all is to get editorial staff to use structured editing software
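A minimal sketch of moving from the old "separated list" interface towards XML, assuming a tab-separated export whose first row holds the field names. The element names `records` and `entry` are hypothetical, and the field names are assumed to be valid XML names.

```python
import csv
import io
import xml.etree.ElementTree as ET


def records_to_xml(separated_text, delimiter="\t"):
    """Turn a delimiter-separated export (first row = field names) into XML."""
    reader = csv.DictReader(io.StringIO(separated_text), delimiter=delimiter)
    root = ET.Element("records")              # hypothetical wrapper element
    for row in reader:
        entry = ET.SubElement(root, "entry")  # hypothetical record element
        for field, value in row.items():
            # Each field becomes a child element named after its column.
            ET.SubElement(entry, field).text = value
    return root


sample = "title\tauthor\nXML in Publishing\tDavid Lewis\n"
root = records_to_xml(sample)
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

This only wraps the fields that already exist. It does not add structure to free text held inside a field, which is the harder problem the slide points at.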
DTD Creation • Can use DocBook – it tries to be all things to all men • There is often missing metadata for a book page • Books require many types of space, so use the entities • Even better - make an element group to keep items under positional control • Groups also facilitate finding the last in a list – a, b, c, d and e
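One way to picture why a group helps with "the last in a list": once the items sit inside a wrapper element, the final item is simply the last child, so the "and" can be generated rather than keyed. The element names `itemgroup` and `item` are assumptions for illustration, not taken from any particular DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: <itemgroup> wraps the run of <item> elements.
fragment = ET.fromstring(
    "<para>Options are <itemgroup>"
    "<item>a</item><item>b</item><item>c</item><item>d</item><item>e</item>"
    "</itemgroup>.</para>"
)


def render_group(group):
    """Join the items of one group as 'a, b, c, d and e'."""
    items = [item.text for item in group.findall("item")]
    if len(items) < 2:
        return "".join(items)
    # The last item is known because the group bounds the list.
    return ", ".join(items[:-1]) + " and " + items[-1]


group = fragment.find("itemgroup")
rendered = render_group(group)
print(rendered)
```

Without the wrapper, the code would have to infer where the list ends from whatever happens to follow the last item.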
DTD Creation • Multiple references to a float • we need one that calls in the picture • References to graphics • OK to rotate landscape tables/graphics? • are there rules for sizing to fit the page layout? • Is specific text needed for headline or footline • Ambiguity of quoted paras within paras • A group may resolve the ambiguity
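A sketch of the "multiple references to a float" problem, under the assumption that the first reference in document order is the one that calls in the picture. The `figref` element and its `target` and `place` attributes are hypothetical names chosen for this example.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<chapter>"
    "<para>See <figref target='fig1'/> and <figref target='fig2'/>.</para>"
    "<para>As <figref target='fig1'/> shows.</para>"
    "</chapter>"
)


def mark_placing_references(root):
    """Flag exactly one reference per figure as the one that places the float."""
    seen = set()
    for ref in root.iter("figref"):      # iter() walks in document order
        target = ref.get("target")
        # Assumption: the first reference places the picture, later ones do not.
        ref.set("place", "no" if target in seen else "yes")
        seen.add(target)
    return root


mark_placing_references(doc)
flags = [(r.get("target"), r.get("place")) for r in doc.iter("figref")]
print(flags)
```

A DTD could equally carry this flag as a keyed attribute, so that an editor rather than a script decides which reference places the float.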
Creating XML • Legacy conversion • Extra workload on editorial staff • Fallback is re-keying • Smart conversion from existing files avoids introducing new errors • Conversion from SGML • Data files generally not too difficult • DTD conversion more difficult • Do not attempt the conversion at a critical time
Creating XML • Authoring XML • Unfortunately most publishers and authors want to use MS Word • Word look-alike approach • Database entry look-alike approach • Multi window views of the same document • Spaces are important • Use text editor to know exactly what is in the XML
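To illustrate why spaces matter, the sketch below compares two files that look like the same paragraph in a structured editor but carry different character data. The element names `para` and `emph` are assumptions for the example.

```python
import xml.etree.ElementTree as ET

# The same paragraph, once indented for readability and once keyed tightly.
pretty = "<para>\n  <emph>Cinderella</emph>\n  <emph>did</emph> go\n</para>"
tight = "<para><emph>Cinderella</emph> <emph>did</emph> go</para>"


def text_content(xml_text):
    """Return every character of text in the document, whitespace included."""
    return "".join(ET.fromstring(xml_text).itertext())


# repr() makes the line breaks and indent spaces visible.
print(repr(text_content(pretty)))
print(repr(text_content(tight)))
```

The indented version carries line breaks and indent spaces as real text, which a typesetting system may or may not collapse. Viewing the file in a plain text editor shows the same thing without any code.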
Page Makeup • Makeup for a printed page is more demanding than for electronic delivery • There is a much higher expectation of quality on a printed page • Printed page has width and depth constraints • Floating objects share space with the text flow • Need to fill as much white space as possible to minimise the number of pages
Page Makeup • From XML it should be as automatic as possible • XSL does not yet have print page features (if ever) • XSL defines the rules – page makeup needs ways to solve the problems caused by the rules. • Designers should consider potential page break problems caused by the design • Publishers/Editors always seem to do corrections when they see pages
The Printer’s Contribution • There is a high turnover of Publisher’s staff, so Printers are often the only continuity • Trusted by the current production team • Can best manage the transition from legacy to structured data • Printer has the best knowledge of the data, therefore well placed to lead DTD design. • Printer can be best interface to “XML experts”
The Printer’s Contribution • Printers have set up new media operations, but often disguise the association by a change of name • Does the promised inclusion of XML within PDF bring the wheel full circle? • Spec for PDF 1.4 includes XML metadata, e.g. PPML for personalization
And Finally Remember . . . Cinderella DID go to the ball and DID marry the Prince