1 / 54

Data Warehousing and Recordkeeping

Department of Fish and Game BUSINESS INFORMATION SYSTEMS. Data Warehousing and Recordkeeping. ARMA – Greater Sacramento Capital Chapter Wednesday, February 16, 2005. PROS. CONS. Transforming Data into Knowledge. Data. Meaning Interpretations Trends Impacts Decisions. 42.

jadzia
Télécharger la présentation

Data Warehousing and Recordkeeping

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Department of Fish and Game BUSINESS INFORMATION SYSTEMS Data Warehousing and Recordkeeping ARMA – Greater Sacramento Capital Chapter Wednesday, February 16, 2005

  2. PROS CONS Transforming Data into Knowledge Data Meaning Interpretations Trends Impacts Decisions 42 Information Numbers Dollar Amounts Dates Codes Descriptions Text Images Video Contexts Relationships Rules Hierarchies Translations Formats Knowledge

  3. PROS CONS So Where’s the Snag??? So Where’s the Snag??? Data • Flawed • Contradictory • Unreliable • Speculative • Wrong Information • Isolated • Disconnected • Inconsistent • Incompatible • Proprietary • Confusing Knowledge • Conflicting • Mismatched • Laborious • Outdated • Unreliable

  4. Islands of Information(Data, data everywhere, but…) Heterogeneous Systems Lack of common interfaces Data access hurdles Data turf wars Data export limitations Inconsistent primary keys Inconsistent data formats Ignorance of where data is located or that it even exists!

  5. Example: DFG’s Islands of Information Access/Excel Desktop Data CalStars Accounting System SCO Mainframe DFG Oracle Databases

  6. Solution: The Data Warehouse • Consolidates diverse enterprise data • Provides easy, intuitive access to information • Generates flexible reports • Puts information in the hands of those who need it!

  7. Do You Need a Data Warehouse?

  8. Questions to Ask Yourself • Can I get all the information that I need in one place? • How long after something happens can I see it? • Can I present the data the way I need to? • Can I trust the data that I get when I get it?

  9. Why a Data Warehouse??? • Consolidation • Organization • Accessibility • Timeliness • Format • Reliability

  10. The Data Warehouse… • Consolidates some or all of an enterprises’ data assets, from both internal and external systems, in a centralized location • Enables wide-ranging analyses and decision support functions through data summarization and reorganization • Provides on-line access to historical data that may otherwise be kept off-line or archived and would be more difficult and slower to retrieve • Warehouse data can be updated as often as required, from real-time updates to nightly, weekly, monthly, etc. • Data can be formatted in multiple ways based on the needs of the end-user • Data cannot be inadvertently deleted, altered or corrupted.

  11. The Data Warehouse… • Pre-establishes data relationships for quick, intuitive manipulation • Uses common names rather than codes or abbreviations for usability and understanding • Design is optimized for OLAP queries so access, even of massive quantities of data, is very rapid • Prevents OLAP queries from degrading OLTP performance • Built-in query governors prevent “queries from hell”.

  12. Real-World Examples… Data Problems and Data Warehouse Solutions

  13. Problem #1: The Budget Drill The HR Branch • Know about positions and who is in them, but not how they are funded • Position and personnel data is in an external (SCO) database • SCO exports data to binary or CSV files • BMS Cannot accept binary or CSV files from SCO • Some positions are in SCO but not in BMS and some are in BMS but not SCO • Requires manual reconciliation The Budget Branch • Told to trim positions • Prefer to minimize layoffs by removing vacancies • Position and funding information is in an internal database called the Budget Management System (BMS) • Don’t know who is in what position • Can’t trim positions without this information • Need personnel information from HR

  14. Solution #1: The Position Control Warehouse • Accepts data extract from BMS • Accepts CSV datafile from SCO • Reconciles all possible matching records • Flags records in BMS not in SCO • Flags records in SCO not in BMS • Generates detailed reports based on position, employee filling position, and position funding

  15. Problem #2: Fiscal Management The Fiscal Branch • Tracks encumbrances and expenditures at the expenditure object detail level • Stores all transaction data in CalStars accounting system • Disperses voluminous detailed transaction reports from CalStars to every region and division • Cannot upload data into BMS • Must manually tie allotments to expenditures to generate “Expenditure against Allotment” reports The Budget Branch • Budgets creates and tracks allotments each fiscal year based on organization, program and general expenditure category • Allotments are continually adjusted based on both internal and external demands • Data is stored in an internal Oracle database -- BMS • Presently does not upload allotment data to CalStars • BMS contains no expenditure or encumbrance data

  16. Solution #2: Allotment/Expenditure Warehouse • Accepts allotment data extracts from BMS • Accepts binary datafile of expenditures and encumbrances from CalStars • Rolls up CalStars data to the same summary level as the BMS data • Matches BMS allotments with CalStars expenditures and encumbrances • Retains administration costs in administration • Permits slicing and dicing or expenditures against allotments by organization, program, object of expenditure, fund, fiscal year and fiscal month • Allows for month-to-month or year-to-year comparisons • Easy Web interface for distribution of reports

  17. Problem #3: Procurement and Payment Reporting The Information Technology Branch • Developed a comprehensive procurement and payment system (PPS) • PPS handles many DFG purchases and tracks them through the entire approval process • PPS tracks invoice payments and creates claim schedules for generating SCO warrants • Unfortunately, almost no reporting capability, and little or no resources to develop reports. Regions and Division • Enter purchase requests for approval and invoices for payment • Can only look up individual transactions • Cannot obtain either summary or detailed reporting of procurements or payments by organization, vendor, program, expenditure type, fund, date range, etc.

  18. Solution #3: Procurement & Payment Warehouse • Creates a simple, denormalized data structure that anyone can understand • Replaces confusing codes with descriptions for an intuitive interface • Tuned for OLAP performance • Permits fast and easy access to PPS data • Both canned and ad hoc reporting available • Easy Web-based interface enables distributed access without the need for software installation or IT support

  19. PROS CONS Data Information BUDGET MANAGEMENT SYSTEM • Allotments by Index, PCA, Object • Adjustments by Transaction (event) • Position and position funding data PROCUREMENT & PAYMENT SYSTEM • Procurement information • Payment tracking • Vendor information CALSTARS • Expenditures at transaction level • Labor distribution by employee • Activity data by employee & expenditure SCO • Current Employee data • Employee history • Timesheet data Knowledge DATA WAREHOUSES • Standard management reports • Customized reports & queries • Ad hoc reporting capability • Ability to summarize data • Ability to drill down into details • Analyze results from year-to-year • Analyze results from month to month • Analyze results across Organization MANAGEMENT DECISIONS • Informed decisions by Management • Supported by factual information • Defensible numbers and projections • Instant distribution of information

  20. Building a Warehouse …one brick at a time!

  21. OLAP (On-Line Analytical Processing) Designed for time-based business analysis Small number of users Optimized for large, complex, unpredictable queries involving many rows of data Small number of denormalized tables No data validation required since all interaction is read-only Data is static OLTP (On-Line Transaction Processing) Designed for real-time transactions Large number of users Optimized for small, fast, pre-planned inserts, updates, and deletes Large number of highly normalized tables Requires significant data validation which uses a great deal of overhead Data is highly volatile Data Warehouse DatabasesOLTP vs. OLAP

  22. Development Considerations • What is the problem to be solved? • Is there a better alternative solution? • What resources are available for this project? • Is there management buy-in and active support? • What and where are all the sources of data? • Are all the data sources accessible? • What is the format of each data source? • How will the data be obtained and uploaded? • Who/how many will need to access the warehouse? • How often does the data need to be refreshed? • What are the security/access restrictions/constraints?

  23. Deployment Considerations • How will the warehouse be accessed? • Who will determine user access rights? • If necessary, who will handle software installations? • Who will develop end-user documentation? • Who will provide end-user training? • Which users will participate in the pilot rollout? • How will additional users be phased in? • How will feedback be accepted and evaluated? • How will a successful rollout be gauged?

  24. Maintenance Considerations • Who will manage the database servers? • Who will manage the Web and/or Application servers? • Who will provide hardware and network support? • Who will manage end-user accounts? • Who will monitor replication/update processes? • Who will handle end-user problems, complaints, questions, concerns and suggestions? • How will enhancements or bug fixes be addressed?

  25. Data Warehouse Design • Based on “Star Schema” or “Dimensional” Architecture • Operational Data Store (ODS) reflects the corporate data model • Central “Fact” tables store denormalized, consolidated, organized information about a specific subject • Peripheral “Dimension” tables permit rolling up (summarizing) or drilling down (to greater detail) as needed (may be shared by many “fact” tables)

  26. BMS Allotment Data FACT DATE Dimension OBJECT Dimension FUND Dimension PCA Dimension INDEX Dimension Financial Mgmt. Rpts. (Standard) OBJECT KEY Data-load Filters CalStars Expenditure Data FUND KEY DATE KEY INDEX KEY PCA KEY Financial Mgmt. Rpts. (Custom) Allotment Expenditure Encumbrance ALLOTMENT/EXEPENDITURE WAREHOUSE MONTHLY DATA LOAD Fiscal Year Fiscal Month Month Abbr. Section Sub-Section Unit Sub-Unit Sub-Sub-Unit Div-Reg ID Div-Reg Code Index Allotment data includes baseline and all processed adjustments Expenditure data includes all expense and encumbrance data Expenditure data rolled up to same object level as allotments Distributed Admin. and Allocated Costs are shown as offsets in their source programs and spread to their source funding Program Element Pca Monthly Snapshot Fund Rollup Fund Source Fund Code Fund Detail Group Category Object Rollup Object ID Agency Detail

  27. Criteria for Success • Who determines whether the problem was solved? • What is “acceptable” performance? • How stale can data be and still be useful? • What metrics will be used to measure usage, response speed, data validity and integrity? • How will “user acceptance” be determined?

  28. Recordkeeping …issues for Records Managers and Administrators

  29. Data and Records: Enterprise Assets • Data should be collected and retained based on current standards and guidelines • It should be made available to everyone, within the requirements of security and confidentiality • Data owners and data stewards should be identified and should participate in data-related decisions • Records management policies must be developed, published and enforced • Such policies may need to be modified as new technologies emerge

  30. Data Warehouses are not ERMS • If the primary purpose of ERMS is to provide a secure repository for authentic and reliable data about “business events”, including both the data (content) and the context and structure, data warehouses usually do not meet these criteria • Both records and data warehouse data are read-only (permanent) • Data Warehouses are primarily used for transforming data into information, so they “repackage” existing data/records • They generally do NOT capture all the information about transactions necessary to qualify as an ERMS • They also generally do not manage records over their entire life-cycle, but instead capture “snapshots” in time • However, a data warehouse designed as an ERMS, could easily be implemented

  31. Questions to consider… • Can warehouse data be considered “records”? • How might records retention policies change in the face of fast and easy access to meaningful historical data? • Do electronic warehouse records have the same retention policy as paper records or other electronic records? • Is the “ownership” of Warehouse data different than the ownership of the original records? • How do “design” considerations affect records, when organizing Warehouse data? • Does the fact that Warehouse data is read-only impact how policies apply to it?

  32. User Interface An Example of a Web-Based Data Warehouse Reporting Tool

  33. How to Access the Warehouse Enter Intranet Address: http://dfgitbla03:7778/discoverer/plus Step 1:

  34. Connection has to be set up the first time you access Discoverer How to Access the Warehouse Step 2: Select connection link: DFG Data Warehouses

  35. How to Access the Warehouse Step 3: Enter Password

  36. How to Access Canned Reports Step 4: Select this option to Open anexisting workbook

  37. How to Access Canned Reports Step 5: After selecting Open an existing workbook, identify whether the report is in the Database or if it is Scheduled

  38. How to Access Canned Reports Step 6: You will be presented with a list of available workbooks. Highlight the one you wish to run and click on the Open button.

  39. How to Access Canned Reports Step 7: The system will retrieve the selected workbook and populate it with the appropriate data. From here you are ready to begin working with the information.

  40. How to Create Ad Hoc Reports Step 1: Select this option to Create a new workbook

  41. How to Create Ad Hoc Reports Step 2: Select this option to create a Page-Detail Crosstab report

  42. How to Create Ad Hoc Reports Step 3: Select the Facts that you want to include in your report

  43. How to Create Ad Hoc Reports Step 4: Select the Dimensions that you want to include in your report

  44. How to Create Ad Hoc Reports Step 5: Drag the Page “Group-By”Dimensions to the top row Drag the Record “Group-By”Dimensions to the left-hand column

  45. How to Create Ad Hoc Reports Step 6: Format both Data and Headings as desired

  46. How to Create Ad Hoc Reports Step 7: Then Click on Finish to run the report

  47. How to Create Ad Hoc Reports Step 8:

  48. How to Export Data Step 1: Select File/Export, File/Export to HTML or File/Export to Excel to export the data to an external format

  49. How to Export Data Step 2: Choose from one of numerous available external file formats

  50. How to Export Data Step 3: Click on the OK button to finish

More Related