
Explaining Statistical Models through Metadata

Andrew Westlake, Survey & Statistical Computing & Imperial College London, WWW.SASC.CO.UK


Presentation Transcript


  1. Explaining Statistical Models through Metadata
     Andrew Westlake
     Survey & Statistical Computing & Imperial College London
     WWW.SASC.CO.UK

  2. OPUS - objectives
     • Data Integration – the Holy Grail
     • Can we bring together information from multiple datasets
     • Can we combine what we already know with evidence from new datasets
     • In ways that are
     • Coherent
     • Formalised
     • Transparent
     • Central role of the Statistical Model
     $\text{Model} + \text{Knowledge}_i + \text{Evidence}_i \rightarrow \text{Model} + \text{Knowledge}_{i+1}$
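
The update on this slide (a model, the knowledge held so far, and evidence from a new dataset give the same model with updated knowledge) can be illustrated with a minimal sketch. It assumes a conjugate Beta-Binomial model, which is chosen here only for simplicity and is not taken from the presentation. The names `Knowledge` and `update` and all counts are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Knowledge:
    """Beta(alpha, beta) uncertainty about a proportion (assumed model form)."""
    alpha: float
    beta: float

    @property
    def mean(self) -> float:
        return self.alpha / (self.alpha + self.beta)


def update(knowledge: Knowledge, successes: int, failures: int) -> Knowledge:
    """Knowledge_i + Evidence_i -> Knowledge_{i+1} (conjugate Beta-Binomial update)."""
    return Knowledge(knowledge.alpha + successes, knowledge.beta + failures)


# Hypothetical evidence from two datasets, as (successes, failures) counts.
k0 = Knowledge(1.0, 1.0)      # flat prior: no knowledge yet
k1 = update(k0, 12, 88)       # evidence from a first dataset
k2 = update(k1, 30, 170)      # evidence from a second dataset
pooled = update(k0, 42, 258)  # both datasets used in a single step
```

Under these assumptions, updating dataset by dataset ends in the same knowledge state as using all the evidence at once, which is one simple sense in which such a combination is coherent. The sketch ignores differences in how each dataset was collected.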

  3. Examples of Multiple Data Sources
     • UK Crime Statistics
     • Household Surveys – British Crime Survey
     • Police Statistics – Reported Crime
     • Different selection mechanisms for who responds and what is reported
     • Transport for London
     • Household Surveys (LATS)
     • On-Board surveys (RODS, BODS)
     • Road-side counting and interviews
     • Automatic sensors – ticket gates, road loops
     • In-car tracking
     • Census
     • Different sources give different partial but overlapping views of the same underlying system

  4. Information from Multiple Datasets

  5. Meta-data about Statistical Models
     • Generic Issue
     • How to record information about the
     • Specification
     • Processing
     • Associated with Statistical Models
     • Essential for Opus
     • Relevant whenever users are distant from modellers
     • Official Statistics
     • Archives
     • Statistical Dissemination in general
     • Purpose is to support end users
     • Confidence in Statistical Results
     • Understanding of model form
     • Reuse or extension
     • Structure and Functionality

  6. Statistical Models
     • Interest in measures which summarise the underlying system
     • Can be derived - may even be stochastic
     • Explicit representation of variability
     • Errors in measurement processes, surrogates for intended measures, variability of respondents
     • Stochastic (distribution) components in Influence relationships
     • Relationships between measured (and unobserved) variables
     • Linear Regression
     • Generalised Linear Models
     • Conditional Independence, …
     • Mathematical forms, deterministic and stochastic
     • Parameters for Distributions, Relationships and Measures
     • Fit model using Data
     • Methods usually tied to model form
     • Yields estimates of parameters, with precision (uncertainty)
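
As a small illustration of "fit model using data, yielding estimates of parameters with precision", the sketch below fits a simple linear regression by ordinary least squares and reports a standard error for each estimate. The function name `fit_line` and the data are made up for illustration; the presentation does not prescribe this method.

```python
import math


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (estimate, standard error) pairs."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    # Residual variance uses n - 2 degrees of freedom (two fitted parameters).
    sigma2 = sum(r ** 2 for r in residuals) / (n - 2)
    se_slope = math.sqrt(sigma2 / sxx)
    se_intercept = math.sqrt(sigma2 * (1.0 / n + mean_x ** 2 / sxx))
    return {"intercept": (intercept, se_intercept), "slope": (slope, se_slope)}


# Made-up data for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
estimates = fit_line(x, y)
```

The fitting method here is tied to the model form: the closed-form solution applies only because the relationship is linear with a single explanatory variable.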

  7. Meta-data in Opus
     • Analyse Requirements
     • Generic representation of Statistical Models (not just Opus)
     • Variables, Parameters, Distributions, Relationships (mathematics)
     • No new wheels – assume DDI, etc. or equivalent
     • Design Structures
     • StatModel – Object design in UML
     • Descriptions of actual models in XML
     • Implement Functionality
     • Presentation Tools
     • Templates and Applets
     • Use R service for statistical displays
     • Demonstration web site
     • Technical details
     • Separate discussion
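
To give a feel for what a description of an actual model in XML might contain, here is a minimal sketch that writes and re-reads a description of a simple regression model. The element and attribute names are hypothetical and are not the actual StatModel schema; the mean is written as a plain string for brevity, whereas a real description would need a proper representation of the mathematics.

```python
import xml.etree.ElementTree as ET

# Element and attribute names are hypothetical, not the actual StatModel schema.
model = ET.Element("StatModel", name="SimpleRegression")

variables = ET.SubElement(model, "Variables")
ET.SubElement(variables, "Variable", name="x", role="explanatory")
ET.SubElement(variables, "Variable", name="y", role="response")

parameters = ET.SubElement(model, "Parameters")
for pname in ("a", "b", "sigma"):
    ET.SubElement(parameters, "Parameter", name=pname)

relationships = ET.SubElement(model, "Relationships")
rel = ET.SubElement(relationships, "Relationship", kind="stochastic", target="y")
ET.SubElement(rel, "Distribution", family="Normal", mean="a + b*x", sd="sigma")

xml_text = ET.tostring(model, encoding="unicode")

# Round trip: a presentation tool could parse the description and list its parts.
parsed = ET.fromstring(xml_text)
parameter_names = [p.get("name") for p in parsed.iter("Parameter")]
```

Keeping variables, parameters, distributions and relationships as separate elements is what would let a presentation tool render listings or diagrams from the same description.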

  8. StatModel Components
     • Multiple Statistical Models of any system
     • Focus on different sub-systems
     • Different levels of abstraction
     • Functional Form of Model Specification
     • Variables, Parameters
     • Derivations, constraints and stochastic relationships
     • Fitting Steps
     • Links to datasets – how are Data variables linked to Model ones
     • Methods used and outcomes
     • Knowledge States
     • Knowledge (uncertainty distributions) for Parameters
     • Each Fit produces a new State
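
The components listed on this slide could be sketched as a small object structure. This is an illustration under assumptions, not the StatModel UML design: the class and field names, the dataset name, the fitting method and the numbers are all hypothetical, and uncertainty is reduced to a (mean, standard deviation) pair per parameter.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class KnowledgeState:
    """Uncertainty about each parameter, here as (mean, standard deviation)."""
    parameters: Dict[str, Tuple[float, float]]


@dataclass
class FittingStep:
    dataset: str                    # which dataset supplied the evidence
    variable_links: Dict[str, str]  # data variable -> model variable
    method: str                     # fitting method used
    outcome: KnowledgeState         # knowledge state produced by this fit


@dataclass
class StatModel:
    name: str
    variables: List[str]
    parameters: List[str]
    states: List[KnowledgeState] = field(default_factory=list)
    steps: List[FittingStep] = field(default_factory=list)

    def record_fit(self, step: FittingStep) -> None:
        """Each fit appends a step and the new knowledge state it produced."""
        self.steps.append(step)
        self.states.append(step.outcome)


# Hypothetical model, dataset name, method and numbers.
model = StatModel("TripRate", variables=["trips"], parameters=["rate"])
model.states.append(KnowledgeState({"rate": (2.0, 1.0)}))  # initial knowledge
model.record_fit(FittingStep(
    dataset="household_survey",
    variable_links={"n_trips": "trips"},
    method="MCMC",
    outcome=KnowledgeState({"rate": (2.4, 0.3)}),
))
```

Storing every state rather than only the latest one keeps the history of how knowledge about each parameter changed as datasets were added.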

  9. Demonstration
     • Public Demonstration Site
     • http://155.198.92.106/
     • London 2
     • Stylesheet – listings, mathematics
     • Alternative Model Forms
     • Model Diagram, Process diagram
     • WP08
     • Model Sequence
     • WP11
     • Understanding complex WinBUGS
     • Show Doodle and Script in WinBUGS
     • Documentation in StatModel
     • Influence Diagram in StatModel

  10. Conclusions
     • Propose structure for storing information about Statistical Models
     • Seems to work well for us
     • Refinement and application outside Opus needed
     • For end users, so must address presentation
     • Some basic tools demonstrated
     • Specialised solutions for application domain usually needed
     • Meta-data capture is difficult issue
     • Integration into modelling applications
     • Encourage modellers to document and explain their choices
     • Scope for further development
     www.opus-project.org – www.sasc.co.uk

  11. Acknowledgements
     • Rajesh Krishnan, Imperial College London
     • Implementation of the web application
     • Miles Logie, Minnerva
     • Saikumar Chalisani, ETH Zürich
     • Contribution to initial ideas about StatModel
     • Software Used
     • hyperModel – XMLModeling.com, David Carlson
     • UML modelling for XML Schema
     • Formulator - www.hermitech.ic.zt.ua
     • MathML editor, integrates with XML Spy
     • XML Spy – www.altova.com
     • XML Editor and associated applications
     • JUNG - jung.sourceforge.net/
     • Java Graph Visualization and Layout

  12. End
