This document explores methods to enhance Stata's output capabilities by utilizing Open Document Format (ODF) XML. While Stata excels in statistics, graphics, and data management, its text output has limitations such as lack of pagination, formatting issues, and no Unicode support. By leveraging ODF, we can generate well-structured output files compatible with tools like OpenOffice. This guide includes practical examples and coding techniques to create effective XML content for better data presentation, ensuring compatibility and improved readability.
Improving the output capabilities of Stata with Open Document Format XML • Adam Jacobs • Dianthus Medical Limited
Stata’s 3-fold capabilities • Statistics • Graphics • Data management
Text output • A recent clinical study: • 92 pages of raw data listings • 124 pages of descriptive data tabulations • 3 pages of statistical analysis • All from a study in 12 healthy volunteers
Problems with Stata’s text output • No pagination • No formatting (or limited formatting with SMCL) • Variable labels not always shown • No Unicode support • No tables of contents • etc.
Open Document Format • An open standard, approved by ISO • XML based • For a variety of office-type documents • Used by the popular open-source office suite OpenOffice.org • Here, we are just interested in word-processing documents
.odt files • A .odt file is the native file format of OpenOffice.org Writer • A zip file • Contains various files, the most important of which is content.xml • content.xml is simply a plain-text file • Stata is good at writing plain-text files!
The Stata code • Creates the content.xml file by writing the data wrapped in the appropriate XML tags • content.xml is combined with the other package files and zipped into a .odt file • The .odt file can be opened directly with Writer
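The packaging step can be sketched as follows. The talk's implementation is a set of Stata ado files and Mata functions that are not shown, so this is a minimal Python illustration of the file layout only, not the author's code. The names `write_odt` and `build_content_xml` are hypothetical. The sketch assumes that the "other files" can be reduced to a `mimetype` entry and a `META-INF/manifest.xml`, and it follows the ODF packaging convention that `mimetype` comes first in the archive and is stored uncompressed. A real document would normally also carry styles and metadata.

```python
import zipfile
from xml.sax.saxutils import escape

MIMETYPE = "application/vnd.oasis.opendocument.text"

# Assumed minimal manifest: lists only the package root and content.xml.
MANIFEST = """<?xml version="1.0" encoding="UTF-8"?>
<manifest:manifest xmlns:manifest="urn:oasis:names:tc:opendocument:xmlns:manifest:1.0" manifest:version="1.2">
 <manifest:file-entry manifest:full-path="/" manifest:media-type="application/vnd.oasis.opendocument.text"/>
 <manifest:file-entry manifest:full-path="content.xml" manifest:media-type="text/xml"/>
</manifest:manifest>
"""

CONTENT_TEMPLATE = """<?xml version="1.0" encoding="UTF-8"?>
<office:document-content
 xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
 xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"
 office:version="1.2">
 <office:body>
  <office:text>
{paragraphs}
  </office:text>
 </office:body>
</office:document-content>
"""


def build_content_xml(lines):
    """Wrap each line of text in a <text:p> element, escaping &, < and >."""
    paragraphs = "\n".join(
        f"   <text:p>{escape(line)}</text:p>" for line in lines
    )
    return CONTENT_TEMPLATE.format(paragraphs=paragraphs)


def write_odt(target, lines):
    """Write a minimal .odt package to a file name or file-like object."""
    with zipfile.ZipFile(target, "w") as odt:
        # mimetype goes first and is stored without compression.
        odt.writestr("mimetype", MIMETYPE, compress_type=zipfile.ZIP_STORED)
        odt.writestr("META-INF/manifest.xml", MANIFEST,
                     compress_type=zipfile.ZIP_DEFLATED)
        odt.writestr("content.xml", build_content_xml(lines),
                     compress_type=zipfile.ZIP_DEFLATED)
```

A call such as `write_odt("listing.odt", ["Subject 001", "Age: 34 years"])` produces a single file with one paragraph per line of text. Because `content.xml` is plain text, the same structure can be produced from any tool that writes text files, which is the property the talk relies on in Stata.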
Basics of XML
<company name="Dianthus Medical Limited">
  <employee role="speaker">
    <firstname>Adam</firstname>
    <lastname>Jacobs</lastname>
  </employee>
  <employee role="delegate">
    <firstname>Flavia</firstname>
    <lastname>White</lastname>
  </employee>
</company>
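To show how the nesting of elements and attributes is read back by software, here is a small Python sketch that parses the company example with the standard-library `xml.etree.ElementTree` module. Attribute values are written with straight ASCII quotation marks, which XML requires. The function name `list_employees` is hypothetical.

```python
import xml.etree.ElementTree as ET

COMPANY_XML = """
<company name="Dianthus Medical Limited">
  <employee role="speaker">
    <firstname>Adam</firstname>
    <lastname>Jacobs</lastname>
  </employee>
  <employee role="delegate">
    <firstname>Flavia</firstname>
    <lastname>White</lastname>
  </employee>
</company>
"""


def list_employees(xml_text):
    """Return (role, firstname, lastname) for each <employee> child element."""
    root = ET.fromstring(xml_text.strip())
    return [
        (emp.get("role"), emp.findtext("firstname"), emp.findtext("lastname"))
        for emp in root.findall("employee")
    ]


for role, first, last in list_employees(COMPANY_XML):
    print(f"{role}: {first} {last}")
```

Attributes such as `role` describe an element, while child elements such as `firstname` hold its content. The ODF table markup uses the same two mechanisms.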
XML code for start of table
<table:table table:style-name="Table42">
  <table:table-column table:style-name="TabCol13"/>
  <table:table-column table:style-name="TabCol9"/>
  <table:table-column table:style-name="TabCol8"/>
  <table:table-column table:style-name="TabCol8"/>
XML code for table cells
<table:table-cell table:style-name="cell1211">
  <text:p text:style-name="Table_20_Contents">Mileage (mpg)</text:p>
</table:table-cell>
<table:table-cell table:style-name="cell1111">
  <text:p text:style-name="Table_20_Contents">N</text:p>
</table:table-cell>
<table:table-cell table:style-name="cell1111">
  <text:p text:style-name="Table_20_ContentsNumeric">52<text:s text:c="3"/></text:p>
</table:table-cell>
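The table markup is repetitive enough to generate from data. The following Python sketch builds a complete table from a list of rows; it is an illustration under stated assumptions rather than the author's Stata/Mata code. The function names `table_cell` and `table_xml` are hypothetical. The `<table:table-row>` wrapper is standard ODF but does not appear on the slides. The style names passed in (for example `Table42` or `cell1111`) are assumed to be defined elsewhere in the document's automatic styles.

```python
from xml.sax.saxutils import escape, quoteattr


def table_cell(text, cell_style, para_style="Table_20_Contents"):
    """Return one <table:table-cell> holding a single escaped paragraph."""
    return (
        f"<table:table-cell table:style-name={quoteattr(cell_style)}>"
        f"<text:p text:style-name={quoteattr(para_style)}>"
        f"{escape(text)}</text:p>"
        "</table:table-cell>"
    )


def table_xml(rows, table_style, column_styles, cell_style):
    """Return the XML for a table: column declarations, then one row per item."""
    parts = [f"<table:table table:style-name={quoteattr(table_style)}>"]
    parts += [
        f"<table:table-column table:style-name={quoteattr(style)}/>"
        for style in column_styles
    ]
    for row in rows:
        parts.append("<table:table-row>")
        parts += [table_cell(str(value), cell_style) for value in row]
        parts.append("</table:table-row>")
    parts.append("</table:table>")
    return "\n".join(parts)


print(table_xml(
    rows=[["Mileage (mpg)", "N", 52]],
    table_style="Table42",
    column_styles=["TabCol13", "TabCol9", "TabCol8"],
    cell_style="cell1111",
))
```

Escaping matters because variable labels and values can contain `&` or `<`, either of which would otherwise make `content.xml` invalid. The sketch applies one cell style throughout and omits the `<text:s text:c="3"/>` element from the slide, which in ODF stands for a run of three space characters.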
Was this a lot of work? • 123 kB of code • 21 ado files • 45 Mata functions • And not finished yet!