HAND OUTS DExT Project UK Data Archive September 2007

Presentation Transcript


  1. HAND OUTS: DExT Project, UK Data Archive, September 2007

  2. Exploring text online

  3. DDI record • Standard levels 1 and 2 of DDI 2 used • Some fixed vocabulary used for qualitative data types, data formats and data collection methods • File level 3 attributes not used

  4. UKDA DDI record – HTML

  5. UKDA DDI record - XML

     <sumDscr>
       <timePrd date="1870-00-00" event="start">1870</timePrd>
       <timePrd date="1973-00-00" event="end">1973</timePrd>
       <collDate date="1969-00-00" event="start">1969</collDate>
       <collDate date="1973-00-00" event="end">1973</collDate>
       <nation>Great Britain</nation>
       <geogUnit>Regions in England, Scotland and Wales</geogUnit>
       <anlyUnit>Individuals; Families/households</anlyUnit>
       <universe level="study">Location of units of observation:</universe>
       <universe level="study">National</universe>
       <universe level="study">Population keywords:</universe>
       <universe level="study">Families</universe>
       <universe level="study">Population:</universe>
       <universe level="study">Men and women born between 1870 and 1908</universe>
       <dataKind>Textual data; Numeric data</dataKind>
       <dataKind>in-depth interview transcripts</dataKind>
     </sumDscr>
     </stdyInfo>
     <method>
       <dataColl>
         <timeMeth>Cross-sectional (one-time) study</timeMeth>
         <sampProc>Quota sample derived from the occupational census of 1911, clustered and stratified by region and social class</sampProc>
         <deviat>449 (qualitative); 444 (quantitative)</deviat>
         <collMode>Face-to-face interview; Compilation or synthesis of existing material</collMode>
         <weight>No weighting used</weight>
         <cleanOps>A</cleanOps>
       </dataColl>
     </method>
     <dataAccs>
       <setAvail>
         <accsPlac>ESDS Qualidata, UK Data Archive</accsPlac>
         <collSize>Variables per Case: 191 variables per case &lt;br&gt; </collSize>
       </setAvail>
       <useStmt>
         <specPerm>2003A</specPerm>
         <restrctn>The depositor has specified that registration is required and standard conditions of use apply. The depositor may be informed about usage. See &lt;a href='/orderingdata/termsandConditions.asp'&gt;terms and conditions&lt;/a&gt; for further information.</restrctn>
         <contact>Help desk: qualidata@esds.ac.uk</contact>
       </useStmt>
     </dataAccs>
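As a rough illustration of how the study-level fields in such a record could be read programmatically, the sketch below parses a shortened copy of the `<sumDscr>` element from the slide with Python's standard `xml.etree.ElementTree`. The function name `summarise` and the keys of the returned dictionary are assumptions made for this example; they are not part of DDI or of any UKDA tool.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the <sumDscr> element shown on the slide.
SUM_DSCR = """
<sumDscr>
  <timePrd date="1870-00-00" event="start">1870</timePrd>
  <timePrd date="1973-00-00" event="end">1973</timePrd>
  <collDate date="1969-00-00" event="start">1969</collDate>
  <collDate date="1973-00-00" event="end">1973</collDate>
  <nation>Great Britain</nation>
  <dataKind>Textual data; Numeric data</dataKind>
  <dataKind>in-depth interview transcripts</dataKind>
</sumDscr>
"""


def summarise(xml_text):
    """Return a small dict of study-level fields (hypothetical helper)."""
    root = ET.fromstring(xml_text)

    def span(tag):
        # Pair up the elements flagged event="start" and event="end".
        events = {el.get("event"): el.text for el in root.findall(tag)}
        return events.get("start"), events.get("end")

    return {
        "time_period": span("timePrd"),
        "collection_dates": span("collDate"),
        "nation": root.findtext("nation"),
        "data_kinds": [el.text for el in root.findall("dataKind")],
    }


summary = summarise(SUM_DSCR)
print(summary)
```

The excerpt is wrapped in a single root element because the fragment on the slide starts and ends mid-record and would not parse on its own.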

  6. TEI file prepared for online delivery

  7. Basic TEI mark-up for our text files

  8. More TEI mark-up? • three basic groups of structural features • defining idiosyncrasies in transcription • links to analytic annotation and other data types (e.g. thematic codes, concepts, audio or video links, researcher annotations) • identifying information such as real names, company names, place names, occupations, temporal information • we have piloted an NLP system to semi-automate the mark-up of named entities

  9. Identifying elements • identify atomic elements of information in text • Person names • Company/Organisation names • Locations • Dates • Times • Percentages • Occupations • Monetary amounts • example: • Italy's business world was rocked by the announcement last Thursday that Mr. Verdi would leave his job as vice-president of Music Masters of Milan, Inc to become operations director of Arthur Anderson
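To make the idea of marking up atomic elements concrete, here is a minimal sketch that wraps the entities in the example sentence in TEI-style tags. It is not the piloted NLP system: the entities come from a hand-built lookup list (`ENTITIES`, hypothetical), and the choice of TEI elements (`persName`, `orgName`, `placeName`, `date`, and `roleName` for occupations) is an assumption for illustration rather than the project's actual schema.

```python
import re

# Hypothetical hand-built lookup list: (surface text, TEI element) pairs.
# A real system would detect these spans automatically.
ENTITIES = [
    ("Italy", "placeName"),
    ("last Thursday", "date"),
    ("Mr. Verdi", "persName"),
    ("vice-president", "roleName"),
    ("Music Masters of Milan, Inc", "orgName"),
    ("operations director", "roleName"),
    ("Arthur Anderson", "orgName"),
]

SENTENCE = (
    "Italy's business world was rocked by the announcement last Thursday "
    "that Mr. Verdi would leave his job as vice-president of Music Masters "
    "of Milan, Inc to become operations director of Arthur Anderson"
)


def mark_up(text, entities=ENTITIES):
    """Wrap each listed entity in its tag, in a single pass over the text."""
    # Longest strings first, so a longer name wins over a shorter one
    # that starts at the same position.
    ordered = sorted(entities, key=lambda pair: len(pair[0]), reverse=True)
    tags = dict(ordered)
    pattern = re.compile("|".join(re.escape(surface) for surface, _ in ordered))

    def wrap(match):
        surface = match.group(0)
        tag = tags[surface]
        return f"<{tag}>{surface}</{tag}>"

    return pattern.sub(wrap, text)


marked = mark_up(SENTENCE)
print(marked)
```

Doing the substitution in one pass with a single alternation pattern means text inside an already inserted tag is never matched a second time.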

  11. Progress on textual mark-up • text mining collaboration important • two bids in for keyword extraction systems to help conceptually index qualitative data • Hence another need for an annotation schema! • Some CAQDAS packages also employ NLP tools for autocoding

  12. CAQDAS Examples • Atlas-ti • HyperResearch • Max-QDA • NUD*IST 6 • NVivo 2 • QDA Miner • QUALRUS • Weft QDA

  13. QDA Miner interface

  14. Atlas-ti interface

  15. Atlas-ti output

  16. QDA Miner output

  17. QuDEx File

  18. QuDEx File

  19. QuDEx File

  20. QuDEx File
