Toward Digital Government: The Case of Government Statistics

Toward Digital Government: The Case of Government Statistics. Gary Marchionini University of North Carolina at Chapel Hill www.ils.unc.edu/govstat NSF Grants EIA 0131824 and EIA 0129978

Presentation Transcript


  1. Toward Digital Government: The Case of Government Statistics. Gary Marchionini, University of North Carolina at Chapel Hill. www.ils.unc.edu/govstat. NSF Grants EIA 0131824 and EIA 0129978. Principal Investigators: Gary Marchionini, Stephanie Haas, Ben Shneiderman, Catherine Plaisant, and Carol Hert

  2. Digital Government: Leveraging IT • Government information dissemination • Websites • Other publications (no mass emailings yet) • Transactions • Registrations • Census, regulatory filings • Taxes • Policy making • E-voting • E-rules • Our work focuses on statistical information and agencies, because many important decisions by policy makers and citizens depend on statistics

  3. Preliminary Work, 1996-2000 • Human needs • Interviews (agencies, public) • Transaction log analysis • Email content analysis • System development and testing • Novel interfaces • Information architecture • Usability studies

  4. Focus on Tables, 1998-2000 • Table browser • Java applet • DTD for tables (DC and DDI influence) • XML protocol • Mapping metadata elements to interface control mechanisms • Piping data from large databases to applet • User studies • Metadata to aid understanding

  5. Statistical Knowledge Network, 2003-2006 • Create SKN prototype with agency partners • Integration • Horizontal integration across federal agencies (BLS, EIA, NCHS, Census, SSA, NASS) • Vertical integration from local/state • Focus on non-specialists • Help crucial • Metadata drives help • User interfaces are the intermediaries to link people and data • Find what you need, understand what you find

  6. [Diagram: Data Flow] End users interact with data from an information/concept perspective, not an agency perspective. Elements shown: end users and end user communities; user interfaces; distributed public intermediary (variable/concept level, XML-based, incorporating ISO 11179 and DDI, providing Java-based statistical literacy tools to user interfaces); statistical ontology and domain ontologies; domain experts; agency data with integrated metadata; agency with multiple metadata repositories; agency backend data and metadata; firewall; membrane

  7. [Diagram: Statistical Knowledge Network Architecture] Elements shown: .gov; SKN Consortium agencies; SKN Registry; private work spaces, each with objects and actions • Objects: reports metadata, tables metadata, people metadata, glossary, annotations • Actions: contribute, find, display, annotate, understand, manipulate, collaborate • Ontology, rules & constraints

  8. Interface Prototypes: Find, Display, Understand; Leverage Metadata, Glossary, Ontology • Relation Browser • Multilayered help: treemaps, video help • Animated Glossary • Contextualizer • PairTrees • Spatial audio for maps • Missing Data

  9. Use Case Scenarios to Guide Design • Based on discussions with agency partners • 20 scenarios • 4 detailed, with in-depth resources located • Used to ground ongoing work

  10. Relation Browser++ displaying all EIA webpages

  11. RB++ with Cursor Over Residential Sector

  12. RB++ showing ‘hous’ typed in title field

  13. Multi-layered interfaces • 1 level: map + table + filters + scatterplot • 3 levels of growing complexity: map + table; map + table + filters; map + table + filters + scatterplot

  14. Animated Demonstration Features

  15. Script Guidelines • Base the script on a live demonstration (never on a written description) • Focus on tasks (not tours of widgets or conceptual overviews) • Act out the interaction (with minimum description), then describe results in context of task • Start with a tour of main screen components (orient and introduce vocabulary), 5-10 sec. max • Plan linear sequences made of very short autonomous chunks (15-60 sec.) • Map the chunks to existing online documentation • Show text title at beginning of each chunk • Carefully synchronize voice and visual (hard when alone) • Provide duration and file size for each individual chunk

  16. Interactive Glossary Development Tools • Provide foundation for content development • Separate content development from presentation development • Reduce overall development time • Maximize reuse of existing elements • Create multiple presentations from a single content development effort

  17. Animation Template

  18. Content Foundation Template (SIG) • Question: initial motivation • Answer: overview, definition • Process: explanation, equation • Example • Result: statistic, answer • Review: summary, interpretation

  19. Animation Template • Consistent display and interaction for all animations • Presents animation and explanatory text simultaneously • Navigate (forward and back) through animation segments • Complete review of text at any time

  20. Animation Template • Three pieces: text, animations, template • Text is tagged with content section tags in a separate text file • Animation consists of segments in individual animation files • Text and animation segments coordinated by placement in template
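The three-piece arrangement described on this slide (tagged text, separate animation segments, and a template that coordinates them) can be sketched roughly as follows. This is only an illustration under stated assumptions: the angle-bracket tag syntax, the function names, and the `.swf` file names are hypothetical and not taken from the project; the section names come from the Content Foundation Template slide.

```python
import re

# Section names follow the Content Foundation Template slide.
SECTIONS = ["question", "answer", "process", "example", "result", "review"]

def parse_tagged_text(raw):
    """Return {section: text} from text tagged like <question>...</question>.

    The tag syntax is an assumption made for this sketch.
    """
    found = {}
    for name in SECTIONS:
        match = re.search(rf"<{name}>(.*?)</{name}>", raw, flags=re.DOTALL)
        if match:
            found[name] = match.group(1).strip()
    return found

def build_presentation(raw_text, segment_files):
    """Pair each tagged text section with its animation segment, in template order."""
    text = parse_tagged_text(raw_text)
    return [
        {"section": name, "text": text[name], "animation": segment_files.get(name)}
        for name in SECTIONS
        if name in text
    ]

# Hypothetical content file and animation segment files.
raw = """
<question>What is a median?</question>
<answer>The middle value of an ordered list.</answer>
<example>For 1, 3, 9 the median is 3.</example>
"""
segments = {"question": "median_q.swf", "answer": "median_a.swf"}

slides = build_presentation(raw, segments)
for slide in slides:
    print(slide["section"], "|", slide["text"], "|", slide["animation"])
```

Because the text and the animation files are kept apart, the same tagged text could in principle feed more than one presentation, which is the reuse goal stated on the Interactive Glossary Development Tools slide.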

  21. SKN • Ontology • DTD / XML Schema • Interface Tools • Statistical Interactive Glossary (SIG) • Ontology applications: knowledge organization, content and terminology control, data integration, query support, automatic classification support, reasoning mechanism, others • Ontology (modeling): semantic level; classes, relationships, constraint rules • DTD/XML Schema (implementation): structural level; elements, attributes, datatypes

  22. • Domain knowledge: estimate benefit salary age poverty estimate aged unit poverty unit family household wage earning income distribution • Operational knowledge: <anlyUnit>aged unit</anlyUnit> <universe>married couples living together, with husband or wife aged 65 or older</universe> • Agencies shown: SSA, FIFARS, Census Bureau

  23. Project DTD • Investigate DDI and ISO 11179 • Leverage DDI and data cubes • Mark up a set of objects • Tables • Reports/press releases • Use markup to build value-added search (find what you need) and help (understand what you find) support into interfaces

  24. The Basic Structure • docDscr: description of the markup (what is being marked up, who marked it up, etc.) • entDscr_1: description of an entity within the marked-up document • entDscr_2: description of an entity within the marked-up document • stdygrpDscr: describes the “group” to which an entity or document belongs, such as a survey program • nCubeDscr: used when the entity is an aggregated table • varDscr_1: description of each variable within an entity, study group or document • varDscr_2: description of each variable within an entity, study group or document • fileDscr: describes physical file structures for nCubes

  25. One Example of How the DTD Helps • The DTD can help bring the “expert knowledge” to the less expert user and bring relevant information together by enabling searching via variables as well as subjects/keywords

  26. <var name="age" dcml="0" intrvl="discrete" aggrMeth="count" measUnit="aged units" scale="x1" origin="0" nature="interval" additivity="" temporal="no" geog="no" geoVocab="" catQnty="4">
       <labl source="producer" level="variable">age</labl>
       <universe level="variable" clusion="I">persons</universe>
       <catgryGrp ID="CG1_1" catgry="C1_1 C1_2 C1_3 C1_4">
         <labl source="producer" level="catgryGrp">Age</labl>
       </catgryGrp>
       <catgry ID="C1_1">
         <catValu ID="CV1_1">1</catValu>
         <labl source="producer" level="catgry">65-69</labl>
       </catgry>
       <catgry ID="C1_2">
         <catValu ID="CV1_2">2</catValu>
         <labl source="producer" level="catgry">70-74</labl>
       </catgry>
       <catgry ID="C1_3">
         <catValu ID="CV1_3">3</catValu>
         <labl source="producer" level="catgry">75-79</labl>
       </catgry>
       <catgry ID="C1_4">
         <catValu ID="CV1_4">4</catValu>
         <labl source="producer" level="catgry">80 or older</labl>
       </catgry>
     </var>
     Median income, by age, 2001
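To illustrate how variable-level markup like the `<var>` element on this slide might be read by an interface, here is a minimal Python sketch using the standard library's `xml.etree.ElementTree`. The embedded XML is a trimmed copy of the slide's example with spaces restored between `labl` and its attributes; the function name `category_labels` is hypothetical and not part of the project.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the slide's <var> example (only some attributes kept).
VAR_XML = """
<var name="age" measUnit="aged units" catQnty="4">
  <labl source="producer" level="variable">age</labl>
  <universe level="variable" clusion="I">persons</universe>
  <catgry ID="C1_1"><catValu ID="CV1_1">1</catValu>
    <labl source="producer" level="catgry">65-69</labl></catgry>
  <catgry ID="C1_2"><catValu ID="CV1_2">2</catValu>
    <labl source="producer" level="catgry">70-74</labl></catgry>
  <catgry ID="C1_3"><catValu ID="CV1_3">3</catValu>
    <labl source="producer" level="catgry">75-79</labl></catgry>
  <catgry ID="C1_4"><catValu ID="CV1_4">4</catValu>
    <labl source="producer" level="catgry">80 or older</labl></catgry>
</var>
"""

def category_labels(var_xml):
    """Return (variable name, declared category count, {category value: label})."""
    var = ET.fromstring(var_xml.strip())
    labels = {}
    # Only direct <catgry> children are categories; each holds a value and a label.
    for cat in var.findall("catgry"):
        labels[cat.findtext("catValu")] = cat.findtext("labl")
    return var.get("name"), int(var.get("catQnty")), labels

name, declared, labels = category_labels(VAR_XML)
print(name, declared, labels)
```

With a mapping like this, an interface could show "70-74" instead of the coded value 2, or check that the declared `catQnty` matches the number of categories actually marked up.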

  27. Discovering Metadata • Hybrid machine learning approach • Crawl website • Create term document matrices • Use k-means clustering with small K to fit on screen in RB++ • Revise • Use structure in the existing sites to train a classifier • For small n of concepts, classify site
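The term-document matrix and k-means steps listed on this slide can be sketched in plain Python as below. This is a toy illustration, not the project's pipeline: the four "documents" are invented, crawling is omitted, and seeding the centroids with the first k documents is an assumption made so the run is reproducible.

```python
from collections import Counter
import math

def term_document_matrix(docs):
    """Return (vocabulary, rows), where each row holds term counts for one document."""
    vocab = sorted({word for doc in docs for word in doc.lower().split()})
    rows = []
    for doc in docs:
        counts = Counter(doc.lower().split())
        rows.append([counts[word] for word in vocab])
    return vocab, rows

def normalize(vector):
    """Scale a vector to unit length so document length does not dominate."""
    norm = math.sqrt(sum(x * x for x in vector)) or 1.0
    return [x / norm for x in vector]

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(vectors, k, iterations=20):
    """Plain k-means; the first k vectors seed the centroids (an assumption)."""
    centroids = [list(v) for v in vectors[:k]]
    assignment = [0] * len(vectors)
    for _ in range(iterations):
        # Assign each document to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment

# Invented page texts standing in for crawled agency pages.
docs = [
    "coal energy prices",
    "household income poverty",
    "energy prices electricity",
    "income poverty family",
]
vocab, rows = term_document_matrix(docs)
labels = kmeans([normalize(r) for r in rows], k=2)
print(labels)
```

A small k keeps the number of clusters low enough to display as facets in a browser such as RB++, which is the constraint the slide mentions.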

  28. Combining Machine Learning and Dynamic Interfaces • What should these topics be, and how do we know if we’ve found the right names for them?

  29. Combining Machine Learning and Dynamic Interfaces • How do we assign thousands of documents to their respective topics?

  30. Initial, Unstructured Approach [figure: document icons]

  31. Initial, Unstructured Approach [figure: document icons]

  32. Initial, Unstructured Approach: Clustering Based on Word Distributions [figure: document icons] • This approach yielded intuitively coherent clusters, but the clusters fall at too fine a level of granularity, while also wasting large portions of the data.

  33. New Approach, Semi-Supervised

  34. New Approach, Semi-Supervised [figure: document icons]

  35. New Approach, Semi-Supervised [figure: document icons] • This approach capitalizes on the agencies’ efforts and expertise, and so far seems to yield superior results. However, the training data is very sparse, and the observed categories have high correlation in some cases. Our current work addresses these tuning issues.
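The slides do not name the classifier used in this approach. As a hedged illustration of the general idea (train on documents the agencies have already organized, then assign the remaining documents to topics), here is a small multinomial naive Bayes sketch in plain Python. The classifier choice, the function names, and the training texts are all assumptions made for the example.

```python
from collections import Counter, defaultdict
import math

def train(labelled_docs):
    """Build word and topic counts from (text, topic) pairs."""
    word_counts = defaultdict(Counter)
    topic_counts = Counter()
    vocab = set()
    for text, topic in labelled_docs:
        words = text.lower().split()
        word_counts[topic].update(words)
        topic_counts[topic] += 1
        vocab.update(words)
    return word_counts, topic_counts, vocab

def classify(text, model):
    """Return the topic with the highest log-probability under naive Bayes."""
    word_counts, topic_counts, vocab = model
    total_docs = sum(topic_counts.values())
    best_topic, best_score = None, -math.inf
    for topic in sorted(topic_counts):
        score = math.log(topic_counts[topic] / total_docs)
        total_words = sum(word_counts[topic].values())
        for word in text.lower().split():
            if word not in vocab:
                continue  # words never seen in training carry no evidence
            # Add-one (Laplace) smoothing avoids zero probabilities.
            score += math.log((word_counts[topic][word] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_topic, best_score = topic, score
    return best_topic

# Invented examples standing in for pages an agency site already files under a topic.
training = [
    ("coal electricity prices", "energy"),
    ("natural gas prices", "energy"),
    ("household income poverty", "income"),
    ("family income wages", "income"),
]
model = train(training)
print(classify("electricity prices by state", model))
print(classify("poverty among family households", model))
```

With so few training documents per topic, the smoothing term matters a great deal, which echoes the slide's remark about sparse training data.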

  36. Vertical Integration: Agriculture [diagram] • Collection agents: USDA / NASS, State Statistical Office, State Cooperative Agency (Dept. of Agriculture, etc.) • Farmers & Producers: supply data to agencies • Statistical Consumers: obtain data from agencies

  37. Multiple Research Threads for the SKN • Interfaces • Metadata and Ontology • Multi-leveled help • Automatic slicing and dicing • User needs and user testing • Cross agency cooperation • See www.ils.unc.edu/govstat
