140 likes | 291 Vues
Shawn McClure Software Engineer CIRA, Colorado State University mcclure@cira.colostate.edu 970-491-8455 Projects: Visibility Information Exchange Web System (VIEWS) Interagency Monitoring of Protected Visual Environments (IMPROVE) Air Toxics Data Archive (ATDA)
E N D
Shawn McClure • Software Engineer • CIRA, Colorado State University • mcclure@cira.colostate.edu • 970-491-8455 • Projects: • Visibility Information Exchange Web System (VIEWS) • Interagency Monitoring of Protected Visual Environments (IMPROVE) • Air Toxics Data Archive (ATDA) • WRAP Technical Support System (TSS) • Background: • Systems Analyst • Database Architect • Programmer
“VIEWS-based” web sites... VIEWS ATDA IMPROVE
Why am I here? • To help ask a question: • Can we improve upon the data management system* used by the Air Quality Group at CNL? • Some possible goals: • Understand the current system • Identify issues, goals, and requirements • Design a new system • Implement the new system • Maintain and evolve the system * System: consists of the collection of manual and automatic processes by which information is collected, managed, and disseminated, and by which work is done.
Some common problems with data management… • Data Redundancy and Confusion • Results when different people independently collect or copy the same piece of information • Can lead to conflicting naming and coding conventions • May result in confusion about which version of the data is most recent or most correct • Program – Data Dependence • Tight relationship between data and the programs required to update and maintain that data • A change in data requires a change in all the program that work with the data • Changes in data type, field length, etc. cause change in programs
Some common problems with data management: (cont’d) • Lack of Flexibility • Difficulty delivering ad hoc reports and/or responding to unanticipated information requirements in a timely fashion • Handling ad hoc requests: Data is in the system, but is very expensive (in time and effort) to assemble and organize • Poor Security • There is no mechanism for knowing who is accessing the data and how they're modifying it • Access to the data can be unsystematic and uncontrolled
Some common problems with data management: (cont’d) • Lack of Data Sharing and Availability • Finding data can be difficult • Retrieving data can be difficult • Because pieces of information in different files and different parts of the organization cannot be related to one another, it is virtually impossible for information to be shared or accessed in a timely manner • Information cannot flow freely across different functional areas or different parts of the organization
Functions of a Data Management System • Getting data into the system • Working with data while it is in the system • Getting data out of the system
A data management system in more detail… Import Validation Storage Retrieval Presentation Transformation Source Data Database Rules Source Data Analysis Interpretation Program Logic Source Data Front End Back End Import: Getting data into the system Validation: Ensuring data accuracy Storage: Managing data, backup, and archival Transformation: Sorting, joining, aggregating Retrieval: Getting the data out Presentation: Displaying the data Analysis: Making the data understandable Interpretation: Making the data usable
Two Possible Approaches: Transactional Database and a Data Warehouse
A Hybrid Approach: Two Interrelated Relational Database Systems • OLTP: • Functions as the “back-end” database • Fully relational and in 3rd normal form • Used for data import, validation, and management • Technologies: Microsoft SQL Server • Data Warehouse Generation System: • Extracts data from the OLTP • De-normalizes and transforms data • Loads data into the Data Warehouse • Builds table indexes • Archives “snapshots” of the database • Technologies: VB, stored procedures • Data Warehouse: • Functions as the “front-end” database • Uses a de-normalized “star schema” • Used for querying and archiving data • Automatically generated from the OLTP • Technologies: Microsoft SQL Server 12
General Tasks and Associated Questions • Determine our goals: What are we trying to achieve? • Design a strategy: How are we going to achieve it? • Identify problem areas: What do we have to watch out for? • Prioritize tasks: What is most important for success? • Adjust scope: What set of goals is most realistic? • Recommend alternatives: How else could we do things? • Allocate resources: How do we support our efforts? • Realign expectations: How do we communicate any adjustments? • Define milestones: How do we know when we're done?
Some Issues and Concerns… • What components/aspects of the current CNL-AQG data management system(s) are we interested in examining? • How do we determine our priorities? (i.e. What should come first, next, and later?) • How do we make any necessary “transitions” with minimal impact to current operations? • What new “learning curves” will be involved? • How will we need to “shift our paradigms”? • How do we maintain a new system? • How do we know when we’ve achieved what we want? Thanks! mcclure@crocker.ucdavis.edu