GOT DATA? Step-by-Step Guide to Making Data Work for You Center for Applied Research Solutions, Inc. 771 Oak Avenue Parkway, Suite 3 Folsom, CA 95630 (916) 983-9506 TEL (916) 983-5738 FAX
GOT DATA? Step-by-Step Guide to Making Data Work for You Facilitators: Kerrilyn Scott, Christina Borbely Produced and Conducted by the Center for Applied Research Solutions, Inc. for the California Department of Alcohol and Drug Programs SDFSC Workshop-by-Request May 16, 2005 Authored by Christina J. Borbely, Ph.D. Safe and Drug Free Schools and Communities Technical Assistance Project
Objectives • Preparing to Use Data • Database options & structure • Identifying data • Coding & Entering • Storing & Cleaning • Methods for Summarizing Data • Basics: frequency & % change • Beyond Basics: mean scores; making comparisons • Interpreting Data • Effective Report Writing • Utilizing & Disseminating Findings • Program improvement, Funders, Key Stakeholders
Ready, Set, Go! Preparing to Use Data • Database Options • Identifying Data • Coding Data • Entering, Storing, & Cleaning Data
Database Options • Microsoft Excel • Microsoft Access • SPSS
Excel • Spreadsheet format • Some computational functions • Compatible with other MS software & statistical software • Comes with Microsoft Office package (or $299) http://office.microsoft.com/en-us/FX010858001033.aspx
Access • User friendly design • Requires some preparation prior to data entry • Generates custom reports • Good for qualitative (i.e. open-ended items) & quantitative data • Compatible with other Microsoft software & statistical software (i.e. converts easily to Excel!) • Comes with Microsoft Office package (or $299) http://office.microsoft.com/en-us/FX010857911033.aspx
SPSS • Spreadsheet format • Requires some tutorial (not always intuitive) • One-touch data analysis! • Pricing ranges from $599 to $1499 www.spss.com
Watcha Got? • Identifying data • Variable names
Identifying Data • Each piece of information you have for a participant or a program is data. Data are… # of completed surveys; # of times a youth attended a session; # of youth who attended a meeting; # of merchants contacted for outreach; Age; Grade
FYI: Types of Data • Discrete, categorical: Male/Female; US Citizen/Non US Citizen; Freshman, Sophomore, Junior, Senior • Continuous: Age; Salary; Conflict Resolution Ability
Variable Names • Each piece of data is labeled with a unique (and hopefully meaningful) variable name. Data Variable Name Section E, item 3 E3 Age Age Unit 1 total score Un1tot
Variable Names: Do’s & Don’ts • Meaningful. For section E, item 6, do: E6; don’t: Variable124a • Short. Do: DOB; don’t: Date of Birth. Do: E6; don’t: Youth Survey Section E, Item #6 • Systematic. Do: E6, E7, E9, F1, F2; don’t: 1F, twoF, Fthree
• Plan to reference data collection time points. First administration: BL (for baseline) or T1 (for time 1) or PRE (for pre-test). Do: BLE6, FUE6; don’t: E6, E6 • Be consistent with the chosen system. Do: T1E6, T2E6; don’t: E6T1, T2E6
Coding Key: Do’s • Translate into numeric values For response scale: YES! Yes No NO! YES! = 3 Yes = 2 No = 1 NO! = 0 • Record coding key directly onto measure & save!
Coding System Examples Race Black = 1 Hispanic = 2 White = 3 Asian = 4 Other = 5 Gender Male = 1 Female = 2
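As a minimal sketch of applying a coding key, the snippet below translates the YES!/Yes/No/NO! response scale into the numeric values given on the slide. The function name and the sample answers are hypothetical, and treating blanks as None is an assumption rather than part of the workshop material.

```python
# Coding key from the slide: response label -> numeric code.
LIKERT_KEY = {"YES!": 3, "Yes": 2, "No": 1, "NO!": 0}

def code_responses(responses, key):
    """Translate text responses to numeric codes.

    Answers not found in the key (for example, blanks) become None
    so they can be reviewed during cleaning.
    """
    return [key.get(answer) for answer in responses]

# Hypothetical answers from five surveys; the fourth was left blank.
raw_answers = ["YES!", "No", "Yes", "", "NO!"]
coded = code_responses(raw_answers, LIKERT_KEY)
print(coded)  # [3, 1, 2, None, 0]
```

The same function works for the race and gender keys by passing a different dictionary.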
Coding Key: Don’ts • Do not create a separate variable to code each response to an item. 1. What grade are you in? A. 6th B. 7th C. 8th Variable name = BL1 Codes A=1; B=2; C=3 NOT Variable name = BL1A; BL1B; BL1C Codes Yes=1; No=0
Advanced Coding • Collapsing Variables by Code • Variable Name: “Reside” • Codes: house = 1 • apartment = 2 • barn = 3 1. Do you live in a house? Y/N 2. Do you live in an apartment? Y/N 3. Do you live in a barn? Y/N
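One way to collapse the three yes/no housing items into the single "Reside" variable is sketched below. The function name and the rule for ambiguous answers (return None when zero or several items are marked "Y") are assumptions made for illustration.

```python
# Codes for the collapsed variable, as listed on the slide.
RESIDE_CODES = {"house": 1, "apartment": 2, "barn": 3}

def collapse_reside(answers):
    """Collapse three Y/N items into one 'Reside' code.

    answers maps each housing type to "Y" or "N".
    Returns None when no item, or more than one item, is marked "Y",
    so those surveys can be checked by hand.
    """
    chosen = [name for name, yn in answers.items() if yn == "Y"]
    if len(chosen) != 1:
        return None
    return RESIDE_CODES[chosen[0]]

reside = collapse_reside({"house": "N", "apartment": "Y", "barn": "N"})
print(reside)  # 2
```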
Reverse Coding The values of the coding system may need to be reversed to reflect the true meaning of the response. 1. Do you run away from home? Often Sometimes Rarely Never 2. Do your parents smile at you? Often Sometimes Rarely Never 3. Are you happy at home? Often Sometimes Rarely Never Variable codes: Often = 4, Sometimes = 3, Rarely = 2, Never = 1 Reverse codes: Often = 1, Sometimes = 2, Rarely = 3, Never = 4
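Reverse coding on a numeric scale can be done with one subtraction: new value = (lowest code + highest code) - old value. The sketch below assumes the 1-to-4 scale above and assumes item 1 (running away) is the item to reverse, so that a higher score always means a more positive home life; the variable names are hypothetical.

```python
def reverse_code(value, low=1, high=4):
    """Flip a code on a low..high scale (4 -> 1, 3 -> 2, 2 -> 3, 1 -> 4)."""
    return (low + high) - value

# Hypothetical coded answers to item 1 ("Do you run away from home?")
# where Often = 4 and Never = 1.
item1 = [4, 1, 2, 3]
item1_reversed = [reverse_code(v) for v in item1]
print(item1_reversed)  # [1, 4, 3, 2]
```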
Entering Data in Your Database • Create 1 row of variable names: Across • Create 1 column of names/id #s: Down • Enter post test & follow-ups by extending the row for each participant ID BLgrade BLa23 T2grade T2a23 0025 6 2.5 7 3.1 • Save regularly as you enter (don’t lose all that work!)
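The layout described above (variable names across, one row per participant, later time points added as extra columns) can be sketched as a CSV file, which Excel, Access, and SPSS can all open. This example writes to an in-memory buffer instead of a real file; storing the ID as text to keep its leading zeros is an assumption.

```python
import csv
import io

# One row of variable names across the top.
header = ["ID", "BLgrade", "BLa23", "T2grade", "T2a23"]

# One row per participant; baseline (BL) and time 2 (T2) share the row.
rows = [
    {"ID": "0025", "BLgrade": 6, "BLa23": 2.5, "T2grade": 7, "T2a23": 3.1},
]

buffer = io.StringIO()  # stands in for a file opened with open(..., "w")
writer = csv.DictWriter(buffer, fieldnames=header)
writer.writeheader()
writer.writerows(rows)

csv_text = buffer.getvalue()
print(csv_text)
```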
Storing Data • Hardcopies • Electronic files
Under Lock n’ Key • Guard with your life until a back up is made • Keep all hardcopies as backup • Maintain back ups in different locations • Preserve confidentiality • Separate identifying information from surveys • Use passwords; locked file cabinets; secured offices
Cleaning Data: Quick, Easy, & Worth It! Save yourself the grief of inexplicable scores… • Data should fall within an expected range (e.g. 1 to 5). Scan data for unusual numbers by: • Visual review • A “sort by” function • A “find” function • A “minimum/maximum” or “range” function
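The "minimum/maximum" check can be sketched in a few lines. The rows and the variable name E6 are hypothetical; the expected range of 1 to 5 comes from the example above.

```python
def find_out_of_range(rows, column, low, high):
    """Return (ID, value) pairs whose value falls outside low..high."""
    flagged = []
    for row in rows:
        value = row[column]
        if not (low <= value <= high):
            flagged.append((row["ID"], value))
    return flagged

# Hypothetical entries for an item scored 1 to 5.
rows = [
    {"ID": "0025", "E6": 4},
    {"ID": "0026", "E6": 44},  # likely a double keystroke
    {"ID": "0027", "E6": 1},
    {"ID": "0028", "E6": 0},   # below the scale
]

flagged = find_out_of_range(rows, "E6", 1, 5)
observed_range = (min(r["E6"] for r in rows), max(r["E6"] for r in rows))
print(flagged)         # [('0026', 44), ('0028', 0)]
print(observed_range)  # (0, 44)
```

Each flagged ID can then be checked against the hardcopy survey.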
Squeaky Clean! • Use a “missing” marker (e.g. 999) when a response is genuinely missing (e.g. the respondent left the item blank) Pros: easy to spot data that was unintentionally not entered Cons: extra step to remove the missing marker later • Don’t forget to exclude “missing” data values, so they don’t distort your computations!
FYI: How to use “missing” markers • Select a number or symbol that will not naturally occur in the data • Enter the marker when a data point is unavailable • Clean the data & look for “blanks”. Fill in un-entered or incomplete data. • After the data are clean, delete or exclude the missing marker • Do data analysis
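To illustrate why the marker must be excluded before analysis, the sketch below compares a mean computed with and without the 999 values. The scores and the function name are hypothetical.

```python
MISSING = 999  # marker chosen because it cannot occur on a 1-to-5 scale

def mean_excluding_missing(values, marker=MISSING):
    """Average the values after dropping the missing marker.

    Returns None if every value is missing.
    """
    valid = [v for v in values if v != marker]
    if not valid:
        return None
    return sum(valid) / len(valid)

scores = [4, 5, 999, 3]  # one respondent skipped the item

naive_mean = sum(scores) / len(scores)       # marker wrongly included
clean_mean = mean_excluding_missing(scores)  # marker excluded
print(naive_mean)  # 252.75
print(clean_mean)  # 4.0
```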
Recommendations • Consider using “in house” resources for entering & cleaning data • Consider outsourcing database development to a graduate student or local evaluator
FYI: Outliers • An outlier is a data point that does not cluster with other data points in the group. • Example: ages range from 12.1 to 14.3 years; there are 3 outliers, aged 17.4, 19.2, and 19.7 years. • Outliers may skew the data so that results are not representative of the sample. • Consider excluding outliers
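The slide does not name a rule for deciding what counts as an outlier. As one possible approach, the sketch below uses the common 1.5 × IQR convention (an assumption, not the workshop's method) on a hypothetical list of ages shaped like the example: most between 12.1 and 14.3, plus 17.4, 19.2, and 19.7.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Flag values more than k * IQR below Q1 or above Q3."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low_fence = q1 - k * spread
    high_fence = q3 + k * spread
    return [v for v in values if v < low_fence or v > high_fence]

# Hypothetical ages: 20 typical participants plus 3 older ones.
ages = [
    12.1, 12.2, 12.4, 12.5, 12.7, 12.8, 12.9, 13.0, 13.1, 13.2,
    13.3, 13.4, 13.5, 13.6, 13.7, 13.8, 13.9, 14.0, 14.2, 14.3,
    17.4, 19.2, 19.7,
]

outliers = iqr_outliers(ages)
print(outliers)  # [17.4, 19.2, 19.7]
```

Whether to exclude the flagged values remains a judgment call; report how many were removed and why.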
Guide: Step 1 • Set up a database • Code and enter data • Clean database Kids today!
Putting Data to Work: Methods for Summarizing Data • Basics • Taking It Up a Notch
Add It Up • Count or Tally Do you attend Club Live? Yes No By hand By computer Yes=1; No=0; Blank=999
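With the Yes=1; No=0; Blank=999 codes above, a tally "by computer" is a short calculation: once blanks are set aside, the sum of the codes equals the number of "Yes" answers. The list of codes below is hypothetical.

```python
BLANK = 999

# Hypothetical coded answers to "Do you attend Club Live?"
codes = [1, 0, 1, 999, 1, 0]

valid = [c for c in codes if c != BLANK]  # drop blanks first
yes_count = sum(valid)                    # each Yes contributes 1
no_count = len(valid) - yes_count
blank_count = len(codes) - len(valid)

print(yes_count, no_count, blank_count)  # 3 2 1
```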
Frequencies: Ratio & Percent Distribution • Quantifies rate of occurrence for categories of information • Useful for: What race are you? Black White Asian Hispanic Other; Do you live with both biological parents? Yes No • NOT as useful for: How much do you like school? (circle one) YES! Yes No NO!; How old are you? _____
Calculating Frequency • Sum the number of times a given response occurs • Report a number: a ratio or percentage
Gender   # of participants   % of participants
Male     49                  49%
Female   51                  51%
Total    100                 100%
• Of the 100 participants, 49 were male. • This year, just over half (51%) of the participants were female.
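The frequency table above can be reproduced with a short sketch. The list of responses is constructed to match the slide's counts (49 male, 51 female); the function name is hypothetical.

```python
from collections import Counter

def frequency_table(responses):
    """Return {category: (count, percent)} with percent rounded to 1 decimal."""
    counts = Counter(responses)
    total = len(responses)
    return {
        category: (n, round(100 * n / total, 1))
        for category, n in counts.items()
    }

# Responses built to match the slide: 49 male, 51 female.
genders = ["Male"] * 49 + ["Female"] * 51
table = frequency_table(genders)

for category, (n, pct) in table.items():
    print(f"{category}: {n} of {len(genders)} ({pct}%)")
```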
Common uses • Demographics to characterize participants or community Race; gender; grade; homeowner status • Statistics to describe program Number of program completers % of city council members contacted • Impact statements on outcomes % of youth reporting ATOD use Ratio of signage below adult eye-level
Reporting Frequencies Frequency of participants reporting they are: Male Employed Getting mostly B’s in math Parents of a FNL youth Frequency with which: Decoy buys are successful Alcohol-sponsored events occur
Sample: Excerpt of Frequency in Text “Of clients with completed CBCL/YSR, well over half (56.9%) function in the lowest quartile of global competence. Specifically, clients demonstrate compromised ability related to engagement in age-appropriate activities, social interaction, and performance at school. Given that services are provided in the school context, it is not surprising that almost three-quarters of the clients (71.2%) function in the bottom quartile of school-related competence. Teachers and other school staff, individuals familiar with indicators of school competence, are the most common referral source of students. It is expected that competence in these domains will benefit from student participation in counseling services. Additional data is being collected to test for improvement over time.”
Change Score • Comparison of scores to assess change Proposed outcome: 80% of youth increased awareness of ATOD consequences 5 of 7 youth increased scores = 71.4% of youth increased awareness of ATOD consequences
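A minimal sketch of the change-score calculation follows. The pre and post scores are hypothetical and chosen so that 5 of 7 youth improve, matching the example; the 80% target is the proposed outcome from the slide.

```python
def percent_increased(pre, post):
    """Count participants whose post score exceeds their pre score.

    Only IDs present at both time points are compared.
    Returns (number increased, number compared, percent increased).
    """
    ids = [pid for pid in pre if pid in post]
    increased = sum(1 for pid in ids if post[pid] > pre[pid])
    return increased, len(ids), round(100 * increased / len(ids), 1)

# Hypothetical awareness scores for 7 youth.
pre = {"01": 10, "02": 12, "03": 8, "04": 15, "05": 9, "06": 11, "07": 14}
post = {"01": 13, "02": 12, "03": 11, "04": 17, "05": 12, "06": 10, "07": 16}

n_up, n_total, pct_up = percent_increased(pre, post)
met_target = pct_up >= 80  # proposed outcome: 80% of youth

print(f"{n_up} of {n_total} youth increased scores = {pct_up}%")
print("Target met:", met_target)
```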
Taking It Up a Notch • Mean scores • And beyond…
Mean Scores • The mean refers to a variable’s central tendency and is the sum of all of a factor’s values divided by the number of values. • “Mean” and “average” refer to the same concept.
Calculating Means • Sum all the response values, then divide by the total number (of responses or items) • Provide a frame of reference (“out of how many”)
Averages
ID      Age           ItemE7      RskFctrs
aj785   20            4           3
tk983   22            3           0
mr286   19            5           2
Mean    61/3 = 20.3   12/3 = 4    5/3 = 1.7
• The mean age of the participants is 20.3 years. • The average score on Item E7 is 4 out of 5. • Youth have an average of 1.7 risk factors out of a possible 4 risk factors.
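The three averages in the table can be checked with the sketch below, which uses the slide's own data. Rounding to one decimal place is an assumption about the intended reporting precision.

```python
def mean(values):
    """Sum all values, then divide by how many there are."""
    return sum(values) / len(values)

# Data from the Averages table.
participants = [
    {"ID": "aj785", "Age": 20, "ItemE7": 4, "RskFctrs": 3},
    {"ID": "tk983", "Age": 22, "ItemE7": 3, "RskFctrs": 0},
    {"ID": "mr286", "Age": 19, "ItemE7": 5, "RskFctrs": 2},
]

means = {
    column: round(mean([p[column] for p in participants]), 1)
    for column in ("Age", "ItemE7", "RskFctrs")
}
print(means)  # {'Age': 20.3, 'ItemE7': 4.0, 'RskFctrs': 1.7}
```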
Common Uses • To make a generalized statement about a group. • Demographics to characterize participants or community Age; Income level • Impact statements on outcomes Level of ATOD use among youth Sub-scale scores
Reporting Mean Scores • Report means of sub-scales Average score for “Community Connection” scale • Report mean scores of an individual item Item E4: How often did you smoke pot in the past 7 days? • Report mean score of occurrence Average number of hours spent educating merchants
Sample: Excerpt of Mean Score in Text “Of the districts completing Year 1 Superintendent Surveys, the majority indicated that counseling services were a resource of high value. On a five-point scale with 5 being the highest value, the average value assigned to the Project X counseling services was 3.67. In addition, all districts indicated that parents, teachers, administrators, and school psychologists were largely receptive to and supportive of the resource. The majority of responding superintendents indicate that districts would benefit from expanding counseling services and improving the physical space allotted for service delivery. Clearly, Year 1 has culminated in substantiated need and the resolve to prioritize addressing the need.”