Understanding Information and Data Quality: Frameworks and Key Concepts
This document explores the intricate relationships between information and data quality, emphasizing the significance of context in transforming data into meaningful information. It delves into the constructs of information integrity developed by Boritz et al., discussing the components of information quality such as reliability, relevance, and usability. It defines key metadata elements that enhance data quality, including context, content, and structure, while outlining different types of data and their respective roles in information systems. Through understanding these frameworks, organizations can improve data integrity and enhance decision-making.
Presentation Transcript
Information and data quality Efrim Boritz, Hans Verkruijsse
Information • To inform is derived from the Latin word informare, in the sense of "to give form to the mind" • Information contributes to the development of the human mind • Information quality research over the years
Information Quality • Reliability • Relevance • Usability • Boritz (2004), Information integrity
Information Integrity Framework Boritz et al. (2011)
Domains • Content: various types of information subject matter in information integrity • Process: consists of four key phases: input, process, output and storage that contribute to information integrity • Environment: assures an effective operating IS environment
Domains of information integrity Boritz et al. (2011)
But what about data? • Gerald Trites et al. (2010): A principal difference between information and data is the need for data to have context in order to be considered information. This is a hazy distinction, but nevertheless an important concept in principle. • Information is the result of processing data • Data are the building blocks of information
Tagged Data • Two types of data: • raw data: data without content. • tagged data: raw data with content. • XBRL uses tagged data; it tags the content to the raw data in order that this content can move along with that raw data.
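The raw versus tagged distinction can be illustrated with a minimal Python sketch. This is not XBRL syntax: the `TaggedValue` class, the `tag` helper, and the field names `concept`, `unit` and `period` are hypothetical names chosen for illustration, and the figures are made up.

```python
from dataclasses import dataclass


# Hypothetical illustration only: this is not XBRL syntax, and the
# class and field names are assumptions chosen for readability.
@dataclass(frozen=True)
class TaggedValue:
    value: float   # the raw data
    concept: str   # what the value represents (the tagged content)
    unit: str      # unit of measure
    period: str    # instant or duration the value relates to


def tag(raw: float, concept: str, unit: str, period: str) -> TaggedValue:
    """Attach descriptive tags to a raw value so they travel together."""
    return TaggedValue(value=raw, concept=concept, unit=unit, period=period)


raw_value = 1250000.0  # raw data: a number with no stated meaning
tagged = tag(raw_value, concept="AccountsReceivable", unit="EUR", period="2011-12-31")
```

Because the tags are stored in the same object as the value, any system that receives `tagged` also receives the description of what the number means, which is the property the slide attributes to XBRL.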
Metadata, Key for data quality • Metadata is an important enabler of data quality. • Metadata is data that describes the content, context and structure of data. • Metadata contributes to the security, availability, understandability, consistency and verifiability of data.
Content, Context, Structure • Content: identifies the nature of the data and its purpose (e.g., a stream of sensor data, a set of transactions, a list of accounts receivable). • Context: describes the process(es) to which the data relates, the relevant parties to a transaction and the duration or instant in time that the data relates to. • Structure: describes the logical and physical organization of data, and the format of and relationships between its elements.
Types of Metadata • Explicit metadata attached to the content (e.g., XBRL taxonomies and linkbases). • Explicit metadata that is a central part of the process but is not part of the reported content (e.g., who is accountable for the tagging). • Implicit metadata that provides the context for understanding the content.
Types of Metadata • There is almost no limit to the amount of data that can be captured about data. • The following slides are intended to be a set of concise but comprehensive items that can be used to manage data quality.
Metadata elements • Description: nature of the data. • Purpose: primary use of the content. • Origin: source of the data (e.g., internal, external). • Owned by: who is the owner of the data. • Custodian: who maintains the data. • Classification for security/privacy: label assigned to the data (e.g., public, internal, confidential, etc.). • Access privileges: requirement to access the data.
Metadata elements • Location: the location that the data originated from. • Version: to enable version control. • Date/Timestamp: the date and time the data was generated and/or modified. • Retention/Disposal Requirement: the duration that the data is to be retained for. • Audit trail: to allow the tracing of the data back to the source. • Assurance: the level of verification that the data has undergone.
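One way to put the metadata elements from these two slides to work is to hold them in a single record and check which ones are still empty. The Python sketch below is an assumption about how that might look, not something taken from Boritz et al.: `MetadataRecord`, its attribute names, and `missing_elements` are hypothetical, and storing every element as optional free text is a simplification.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


# Hypothetical record: one attribute per metadata element named on the slides.
# All values are free text here; a real system would likely use richer types.
@dataclass
class MetadataRecord:
    description: Optional[str] = None        # nature of the data
    purpose: Optional[str] = None            # primary use of the content
    origin: Optional[str] = None             # source (e.g. internal, external)
    owned_by: Optional[str] = None           # owner of the data
    custodian: Optional[str] = None          # who maintains the data
    classification: Optional[str] = None     # security/privacy label
    access_privileges: Optional[str] = None  # requirement to access the data
    location: Optional[str] = None           # where the data originated
    version: Optional[str] = None            # enables version control
    timestamp: Optional[str] = None          # when generated and/or modified
    retention: Optional[str] = None          # retention/disposal requirement
    audit_trail: Optional[str] = None        # trace back to the source
    assurance: Optional[str] = None          # level of verification undergone


def missing_elements(record: MetadataRecord) -> List[str]:
    """Return the names of metadata elements that have not been filled in."""
    return [f.name for f in fields(record) if getattr(record, f.name) in (None, "")]


record = MetadataRecord(
    description="List of accounts receivable",
    origin="internal",
    version="1.0",
)
gaps = missing_elements(record)
```

Here `gaps` lists the ten elements that were not supplied, which gives a simple starting point for a data quality review of a single data set.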
Data Integrity Framework? Boritz et al. (2011)