This document provides an overview of the MI School Data functionality as of May 2012, including district summaries, assessment and accountability, college readiness indicators, and reports on student progress and educator effectiveness. It also covers CEPI's data quality process: how anomalies are identified, the submission windows for the various applications, and ongoing improvements to ensure accurate data reporting. Stakeholder engagement is emphasized for driving enhancements in educational outcomes across Michigan.
MI School Data – Functionality Overview • District/School • Summary • Quick Facts • Openings/Closings • School data file • Assessment and Accountability • Dashboard and Report Card • MEAP, MME, MI-Access, and ACT • College Readiness Indicator (ACT scores) • Students not tested report • Assessment revised cut scores • Student • Graduation/Dropout • Non-resident Report • Student Count • Staffing/Financial • Educator Effectiveness • Effectiveness Ratings (Principals only 2010/11) • Evaluation Factors • Postsecondary Reports by High School/District • Enrollment/Credit Accumulation • Remedial Coursework
MI School Data – Current Work • Earliest Priorities: • Migration of Data for Student Success (D4SS) Dynamic Inquiries • Additional dashboard metrics (Best Practices) • K-3 Pupil Teacher Ratio, General Fund Balance, Salaries, Days of Instruction • Additional displays/reports from MSLDS data sources: • Pupil Attendance, Retention in Grade, Pupil Mobility • Usability improvements • “Front Page,” Location Selection, “Sticky Settings” • User Administration Improvements • Early Childhood • More stakeholder discussion required • Additional K-12 • Finance - Source: FID • Staffing - Source: REP • Special Education public reporting and data portrait queries • Top to Bottom Listing of Schools • Postsecondary • Enrollment, Credit Accumulation, & Remediation - User interface • By High School • By Institution of Higher Education • Requirements initiated for additional reports • More stakeholder discussion required • Workforce Reports • Workforce supply/demand study
CEPI Data Quality – Overview “YOUR DATA ARE NOT NECESSARILY WRONG!” The goal of our data quality process is finding ANOMALIES, not ERRORS An ERROR is: “a deviation from accuracy or correctness” An ANOMALY is: “an odd, peculiar or strange condition, situation, quality, etc.” (definitions from Dictionary.com)
CEPI Data Quality – Applications • CEPI has several data collection applications • The Michigan Student Data System (MSDS) • Graduation and Dropout Application (GAD) • Title I Supplemental Education Services (SES) • The Financial Information Database (FID) • The Educational Entity Master (EEM) • The Registry of Educational Personnel (REP) • The School Infrastructure Database (SID) • We will be focusing primarily on the last three databases (REP, SID and EEM)
CEPI Data Quality – Collection Windows • Data are submitted for each of our CEPI Applications during Collection Windows (except the EEM, which is always open for updates) • REP has two collections per year • The End-of-Year (EOY) REP collection is open from April 1 through June 30 • The Fall REP collection is open from September 1 through the first business day in December • The SID collection is once a year from April 1 through June 30
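As an illustration of the Fall REP window described above, the following is a minimal sketch of how the closing date ("the first business day in December") might be computed. The function names are hypothetical, and the sketch assumes that "business day" simply means Monday through Friday, ignoring holidays; CEPI's actual rule may differ.

```python
from datetime import date, timedelta


def first_business_day_of_december(year: int) -> date:
    """Return the first Monday-Friday date in December of `year`.

    Assumption: holidays are ignored; only weekends are skipped.
    """
    day = date(year, 12, 1)
    while day.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        day += timedelta(days=1)
    return day


def fall_rep_window(year: int) -> tuple[date, date]:
    """Fall REP window: September 1 through the first business day in December."""
    return date(year, 9, 1), first_business_day_of_december(year)


def in_window(day: date, window: tuple[date, date]) -> bool:
    """True if `day` falls inside the window, both ends inclusive."""
    start, end = window
    return start <= day <= end
```

For example, December 1, 2012 fell on a Saturday, so under these assumptions the Fall 2012 window would close on Monday, December 3.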
CEPI Data Quality – Process • The data quality process is similar across the applications in the School Data Quality unit • Data Quality runs are completed at three points in the collection • Before the collection opens (pre) • During the collection (mid) • After the collection closes (post) • Started by checking 10-20 items in EOY 2007 • Expanded to over 300 in the REP collection alone for Fall 2011
CEPI Data Quality – PRE collection • Analyzes data from the PRIOR collection • Prior collection data cannot be modified in the current collection window • Identifies data elements that can be improved upon in the current collection • Each district’s authorized users are informed of the findings via e-mail shortly after the collection period opens • Identifies issues in the data structure and tables of the new collection cycle before they are an issue for the districts
CEPI Data Quality – MID collection • Snapshot of data submissions taken with about one month left in the collection window • Identifies anomalies in the current collection • Each district’s authorized users are informed of the findings via e-mail with time to modify the data before the end of the collection window • Identifies issues in the data structure and tables periodically throughout the collection period
CEPI Data Quality – POST collection • Snapshot of data submissions taken immediately after the close of the collection • Identifies anomalies in the current collection now completed • Analysis is completed in about a week • Each district’s authorized users are informed of the findings via e-mail • Data cleansing period takes place allowing the authorized users to modify their data prior to it being used for reporting
CEPI Data Quality – What are we looking for? • System edit violations or table integrity issues • Data values that are anomalies • Values outside of the expected range, but that might not be ERRORS • Values that don’t match other data • Interactions with other data collections • Issues arising out of the whole of the collection • Comparisons to prior submissions
CEPI Data Quality – System Edits • The system validates each record as it is processed • Ensures required fields are submitted • Ensures that the dependencies with other fields are followed • Most of these system edits are also built into the data quality process • Issues errors and warnings • Errors prevent the record from being saved • Warnings allow the record to be saved, but the data may need to be modified
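The distinction between errors and warnings can be sketched as follows. This is only an illustration of record-level validation: the field names, the dependency rule, and the FTE warning are hypothetical examples, not the actual system edits in any CEPI application.

```python
from dataclasses import dataclass, field

# Hypothetical required fields, for illustration only.
REQUIRED_FIELDS = ("first_name", "last_name", "birth_date")


@dataclass
class EditResult:
    errors: list = field(default_factory=list)
    warnings: list = field(default_factory=list)

    @property
    def can_save(self) -> bool:
        # Errors block the save; warnings do not.
        return not self.errors


def run_system_edits(record: dict) -> EditResult:
    """Validate a single record in isolation, as a system edit would."""
    result = EditResult()

    # Required-field check: a missing value is an error.
    for name in REQUIRED_FIELDS:
        if not record.get(name):
            result.errors.append(f"required field missing: {name}")

    # Dependency check (hypothetical rule): one field constrains another.
    if record.get("status") == "terminated" and not record.get("termination_date"):
        result.errors.append("terminated status requires a termination date")

    # Warning (hypothetical rule): unusual but allowed to be saved.
    fte = record.get("fte")
    if fte is not None and fte > 1.0:
        result.warnings.append("FTE greater than 1.0; please review")

    return result
```

Because the function sees one record at a time, it cannot compare against the rest of the submission or a prior year, which is the kind of limitation a separate data quality process is meant to cover.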
CEPI Data Quality – System Edits • There are limitations to what the system can validate • Cannot look at the submission as a whole • Cannot look at the prior year’s submission • Cannot have exceptions to the rules • Cannot be as flexible as the data quality process • Several of the items in the Data Quality process have been turned into new system edits
School Infrastructure Database SID Data Quality
SID Data Quality – Basics • Mostly looking for outliers • Issues with Shared Space Entities • Dual Enrollment data in high schools and only in high schools • System Edit Checks
SID Data Quality – Scatter Plots Examine scatter plots of the raw number submitted and the "rate" per student reported
SID Data Quality – Scatter Plots • Identify “outliers” based on different factors • Too high of a number • A building with 4500 incidents of bullying • Too high of a rate • A building with 300 students and 450 incidents of truancy • Some incident types will flag any value reported as an outlier • Homicides • Drive-by shootings
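The three kinds of outlier described above (count too high, rate too high, and incident types where any reported value is flagged) could be expressed as a simple rule like the one below. The threshold values and incident-type identifiers are assumptions chosen for illustration; the actual cutoffs are read from the scatter plots and are not stated in the slides.

```python
# Incident types where any reported value is flagged (from the slide);
# the identifier spellings are hypothetical.
ALWAYS_FLAG = {"homicide", "drive_by_shooting"}


def flag_outlier(incident_type, incidents, students, max_count, max_rate):
    """Return a reason string if the building's value looks anomalous, else None.

    `max_count` and `max_rate` are hypothetical per-incident-type thresholds.
    """
    if incidents <= 0:
        return None
    if incident_type in ALWAYS_FLAG:
        return "any reported value is flagged"
    if incidents > max_count:
        return "count too high"
    if students > 0 and incidents / students > max_rate:
        return "rate too high"
    return None


# Usage with the examples from the slide and made-up thresholds:
print(flag_outlier("bullying", 4500, 1200, max_count=1000, max_rate=1.0))
print(flag_outlier("truancy", 450, 300, max_count=1000, max_rate=1.0))
```

A flagged value is an anomaly to be reviewed by the district, not necessarily an error.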
SID Data Quality – Robbery Plot These are the lines indicating the outliers
SID Data Quality – Robbery Plot This line indicates the minimum we want to flag as an anomaly
SID Data Quality – Robbery Plot The five circled points have been identified as outliers, and feedback will be sent on them
Registry of Educational Personnel REP Data Quality
REP Data Quality – Starting out • Started looking at data using Excel and Access • Focused on rules that could not be built into the Application • Started with a dozen checks in EOY 2007 • Grew to 25 checks in Fall of 2007 • Continues to grow each collection • Examples: • Suffixes in First or Middle Name • No Title IX Coordinator Submitted • Too many classes taught by a single teacher
REP Data Quality – Name Issues • Data Quality Checks built on name fields: • Titles in name fields • First Name of “Dr. Timothy” • Last name of “Smith, DDS” • Name changes • Incorrectly submitted Suffixes • First names incorporating “To the Estate of” • Names of “Test Data” and other artificial names used for testing purposes
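Checks like these can be approximated with pattern matching on the name fields. The sketch below is a hypothetical illustration: the lists of titles, credentials, suffixes, and artificial names are assumptions, and the real checks were originally built in Excel and Access rather than Python.

```python
import re

# Hypothetical pattern lists; real checks would likely cover more cases.
TITLE = re.compile(r"\b(dr|mr|mrs|ms|prof)\b\.?", re.IGNORECASE)
CREDENTIAL = re.compile(r",\s*(dds|md|phd|edd)\b\.?", re.IGNORECASE)
SUFFIX = re.compile(r"\b(jr|sr|ii|iii|iv)\b\.?$", re.IGNORECASE)
ARTIFICIAL_NAMES = {"test data", "test test", "dummy record"}


def check_name(first: str, last: str) -> list:
    """Return a list of anomaly descriptions for one staff name."""
    findings = []
    if TITLE.search(first) or TITLE.search(last):
        findings.append("title in name field")
    if CREDENTIAL.search(last):
        findings.append("credential in last name")
    if SUFFIX.search(first.strip()):
        findings.append("suffix in first name")
    if "to the estate of" in first.lower():
        findings.append("estate wording in first name")
    if f"{first} {last}".strip().lower() in ARTIFICIAL_NAMES:
        findings.append("artificial test name")
    return findings
```

With the examples from the slide, a first name of "Dr. Timothy" would be reported as a title in a name field, and a last name of "Smith, DDS" as a credential in the last name.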
REP Data Quality – Date Issues • Data Quality Checks built on date fields: • Teachers that are too young • Staff members that are too old • Staff members that are hired too young • Enforcing the order of dates • Birth Date < Hire Date < Termination Date • Terminated records without a valid termination date • Credential Date issues
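The date checks above can be sketched as a single function over one staff record. The age limits below are placeholders chosen only for illustration, since the slides do not say what counts as "too young" or "too old"; the ordering rule (Birth Date < Hire Date < Termination Date) is taken from the slide.

```python
from datetime import date

# Hypothetical limits; the actual cutoffs are not given in the slides.
MIN_TEACHER_AGE = 20
MAX_STAFF_AGE = 90
MIN_HIRE_AGE = 14


def age_on(birth: date, on: date) -> int:
    """Whole years between `birth` and `on`."""
    return on.year - birth.year - ((on.month, on.day) < (birth.month, birth.day))


def check_dates(birth, hire, termination, status, is_teacher, as_of):
    """Return a list of date anomalies for one staff record.

    `termination` may be None for staff who have not been terminated.
    """
    findings = []
    age = age_on(birth, as_of)
    if is_teacher and age < MIN_TEACHER_AGE:
        findings.append("teacher too young")
    if age > MAX_STAFF_AGE:
        findings.append("staff member too old")
    if age_on(birth, hire) < MIN_HIRE_AGE:
        findings.append("hired too young")
    # Enforce Birth Date < Hire Date < Termination Date.
    if not birth < hire:
        findings.append("hire date not after birth date")
    if termination is not None and not hire < termination:
        findings.append("termination date not after hire date")
    if status == "terminated" and termination is None:
        findings.append("terminated without a termination date")
    return findings
```

As with the other checks, a finding marks a record for review; an unusually old staff member, for instance, may be perfectly valid.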
REP Data Quality – Title IX Issues • Data Quality Checks built on Title IX Coordinator submissions: • No Title IX coordinator Submitted • Title IX coordinator submitted with a full FTE • Title IX coordinator submitted with a terminated status and no other staff member assigned to that position • Have developed over time
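Unlike a system edit, the Title IX checks look at a district's submission as a whole. A minimal sketch, assuming each staff record is a dictionary with hypothetical `assignment`, `status`, and `fte` keys (the real REP field names and codes are not given in the slides):

```python
def check_title_ix(staff: list) -> list:
    """Return Title IX coordinator anomalies for one district's staff records.

    The key names and values used here are hypothetical.
    """
    coordinators = [s for s in staff if s.get("assignment") == "title_ix_coordinator"]
    if not coordinators:
        return ["no Title IX coordinator submitted"]

    findings = []
    active = [c for c in coordinators if c.get("status") != "terminated"]
    if not active:
        findings.append("coordinator terminated and no other staff member assigned")
    if any(c.get("fte", 0) >= 1.0 for c in active):
        findings.append("coordinator submitted with a full FTE")
    return findings
```

The check needs the full list of a district's records to conclude that no one holds the position, which is why it cannot be enforced while validating a single record.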
REP Data Quality – Current State • For Fall 2011: • Over 300 Checks were run • Districts were notified about 48 different issues • 1381 messages were sent out • 1058 different users of 540 districts received data quality feedback
REP Data Quality – Near Future • Data Quality Checks are being added and improved • Looking at improving the following issues: • Grade-Levels of Students submitted in MSDS • Accounting Function Codes and their use in the FID • Data contained in the Michigan Online Educator Certification System (MOECS) • Teacher-Student Data Link (TSDL) related issues
Educational Entity Master EEM Data Quality
EEM Data Quality – Differences • EEM is different from the other collections in that it does not have a window • Data quality is ongoing and periodic • Often checking for data that is not in the correct format • A starting point for using our data profiling tools
EEM Data Quality – Sample Issues • Issues between EEM and other applications • Grades for a student or teacher • Educational Settings • Lead Administrator issues • System edits working • Physical Addresses that do not exist • Data profiling has allowed us to find issues in the contents of the data where they might not be in a consistent form
EEM Data Quality – Profiling Finds • Fields that contain both the descriptive value and the code value in the same field • County records that contain both “Wayne” and “81” referring to the same thing • Leading zeros or spaces in a text field • State entries of “_ _ _ _ MI” • Congressional Districts of “1” “01” and “001” • Zip Code formatting • Zip+four containing the dash or not? • Capitalization inconsistencies
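The formatting inconsistencies listed above lend themselves to simple normalization and profiling helpers. The sketch below is illustrative only: it does not represent CEPI's data profiling tools, and the function names and the chosen canonical forms (for example, always writing the dash in ZIP+4) are assumptions.

```python
import re
from collections import Counter

ZIP_PATTERN = re.compile(r"^(\d{5})(?:-?(\d{4}))?$")


def normalize_congressional_district(value: str) -> str:
    """Treat "1", "01" and "001" as the same district by dropping leading zeros."""
    return str(int(value.strip()))


def normalize_state(value: str) -> str:
    """Strip padding such as leading spaces or underscores from a state entry."""
    return value.strip(" _").upper()


def normalize_zip(value: str):
    """Return ZIP or ZIP+4 in one form (with the dash), or None if unrecognized."""
    match = ZIP_PATTERN.match(value.strip())
    if match is None:
        return None
    five, four = match.groups()
    return f"{five}-{four}" if four else five


def profile_value_shapes(values) -> dict:
    """Count how many values look like numeric codes versus descriptive text.

    A field that contains both shapes mixes codes and descriptions.
    """
    return dict(Counter("code" if v.strip().isdigit() else "text" for v in values))
```

Running `profile_value_shapes` over a county field holding both "Wayne" and "81" would report one value of each shape, which signals that the field mixes descriptive and code values.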
Questions and Answers CEPI Data Quality