This presentation introduces the Key Variable Mapping System (KVMS), developed at the Cathie Marsh Centre for Census and Survey Research, University of Manchester. It starts from the data environment: the background context of a dataset's development, release, dissemination and use, which is itself populated by other data. KVMS harmonises variables across datasets, using metadata drawn from data collection forms to connect the codings that different datasets employ and to generate harmonisation graphs. This lets researchers assess the matching possibilities between datasets and so build better-grounded scenarios.
The Key Variable Mapping System Mark Elliot, Duncan Smith, Elaine Mackey and Kingsley Purdam Cathie Marsh Centre for Census and Survey Research University of Manchester Mark.Elliot@manchester.ac.uk
The Data Environment • Broadly, a dataset’s environment is the background context of its development, release, dissemination and use
The Data Environment • Crucially a dataset’s environment is populated by other data • And we would like to know more about that data • Such knowledge would allow us to generate grounded scenarios
Form field analysis Assumptions: • If a data collection instrument (form) asks for personal information then that information will be stored in a database of individual records • The data will be stored at the level of detail at which it is collected • So the forms give us metadata
Key Variable Mapping • Metadata is stored in a metadatabase • The paradigm process is to map two sets of metadata onto each other • A prevalence metric and form classifications allow matching possibilities to be assessed
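As a rough illustration of mapping two sets of metadata, the sketch below stores form metadata in a plain dictionary and lists the variables two forms share. The layout of `metadatabase`, the form names, the codes and the function `common_variables` are all assumptions made for this example; the slides do not describe the actual KVMS schema, and the prevalence metric is not modelled here.

```python
# Hypothetical metadatabase: form name -> {variable name -> response codes}.
# The structure and all values are illustrative assumptions, not the KVMS schema.
metadatabase = {
    "form_a": {
        "age": ["0-15", "16-64", "65+"],
        "sex": ["F", "M"],
        "ethnicity": ["White", "Asian", "Black"],
    },
    "form_b": {
        "sex": ["Female", "Male"],
        "ethnicity": ["White British", "Other", "Black"],
        "tenure": ["Owned", "Rented"],
    },
}


def common_variables(metadatabase, form_x, form_y):
    """Return the variable names collected by both forms, sorted by name."""
    shared = metadatabase[form_x].keys() & metadatabase[form_y].keys()
    return sorted(shared)


shared = common_variables(metadatabase, "form_a", "form_b")
print(shared)
```

Variables found this way are only candidates for matching: the two forms may code them differently, so their codings still have to be harmonised.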
Key Variable Mapping • For each variable in the metadatabase: • If both datasets have the variable in common (e.g. ethnicity) • Find the join in the graph between the codings employed • That join is the harmonised coding • i.e. the specification of the key variable
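One way to read "the join between the codings" is as the finest coding into which both original codings can be recoded. The sketch below takes that reading as an assumption: each coding is treated as a partition of a shared set of base categories, and any groups that overlap are merged. The function `harmonise` and the category names are hypothetical and are not taken from KVMS.

```python
def harmonise(coding_a, coding_b):
    """Return the finest coding that both input codings can be recoded into.

    Each coding maps a code label to the set of base categories it covers.
    The result is a sorted list of sorted category groups.
    """
    merged = []  # disjoint groups built so far
    for block in list(coding_a.values()) + list(coding_b.values()):
        block = set(block)
        untouched = []
        for group in merged:
            if group & block:
                block |= group  # overlapping groups collapse into one
            else:
                untouched.append(group)
        untouched.append(block)
        merged = untouched
    return sorted(sorted(group) for group in merged)


# Illustrative (made-up) ethnicity codings from two forms.
form_a = {
    "White": {"white_british", "white_other"},
    "Asian": {"indian", "pakistani"},
    "Black": {"black_african", "black_caribbean"},
}
form_b = {
    "White British": {"white_british"},
    "Other": {"white_other", "indian"},
    "Pakistani": {"pakistani"},
    "Black": {"black_african", "black_caribbean"},
}

harmonised = harmonise(form_a, form_b)
print(harmonised)
```

In this example form B's "Other" code straddles form A's "White" and "Asian" codes, so those groups collapse into a single harmonised category, while "Black" survives unchanged because both forms code it identically.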
KVMS 2 • Includes a GUI for building coding graphs • Generates the harmonisation graphs automatically • Tests for consistency on data entry
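The slides do not say which consistency tests KVMS 2 runs on data entry. As one plausible example, the sketch below checks that a newly entered coding covers each base category exactly once. The function `check_coding` and its messages are hypothetical.

```python
def check_coding(coding, base_categories):
    """Return a list of problems found in a coding (empty if consistent).

    coding maps a code label to the set of base categories it covers;
    base_categories is the full set of categories the variable can take.
    """
    problems = []
    seen = {}  # base category -> label that first claimed it
    for label, members in coding.items():
        for category in sorted(members):
            if category not in base_categories:
                problems.append(f"{label}: unknown category {category!r}")
            elif category in seen:
                problems.append(
                    f"{category!r} appears under both {seen[category]!r} and {label!r}"
                )
            else:
                seen[category] = label
    for category in sorted(set(base_categories) - set(seen)):
        problems.append(f"{category!r} is not covered by any code")
    return problems


base = {"a", "b", "c"}
for problem in check_coding({"X": {"a", "b"}, "Y": {"b", "d"}}, base):
    print(problem)
```

Running a check like this at entry time would keep inconsistent codings out of the graph before any harmonisation is attempted.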