Forms to Spreadsheets
E N D
Presentation Transcript
Forms to Spreadsheets A-Team Spring Brown Bags February 7, 2014 Jennifer Lowman Coordinator, Student Persistence Research University of Nevada, Reno
Outline • Excel Basics • Form Preparation • Naming Variables • Coding Data • Entering Data • Cleaning Data • Analyzing Data
Excel Basics • Workbooks & Worksheets • 3 Parts to a Spreadsheet • Rows (numbered) • Columns (alphabetized) • Cells (combo, H7)
Excel Basics • Row Cases • Columns Variables • Cells Data
Form Preparation • Case IDs • Variable Names • Data Codes Case IDs & Variable Names are Unique Identifiers It is the combination of the two that make your data meaningful.
Case IDs • Best Practices for Case IDs • Unique, Meaningful, Confidential, & Stable • Put the Case ID on every form • Major considerations • How many times are you collecting data from each person? • Do you need institutional data? • Do you need to protect confidentiality? • Employees v. Participants (mandatory v. voluntary) • Are any of the data sensitive?
Rules of Thumb • Once, with no need for institutional data • Sequential numbers, with random start (randomize forms before numbering) • More than once, no need for institutional data? • Use something meaningful to respondent • Sample size may challenge uniqueness • Need institutional data? • Use something meaningful to you
Trade-Offs • Meaningful for participant • Easy to remember (stable) • Might not be confidential • May or may not link to institutional data • Meaningful to you • Not easy to remember (not stable) • Promotes confidentiality • May need a key, risks to confidentiality • Promotes link to institutional data
Put the Case ID on Everything • Every Form, Every page • Double Check • Back Track • Multiple Coders
Naming Variables • Best Practices for Variable Names • Unique & Meaningful Abbreviations • Short Standard 8 characters • Excel can handle more, but your column size will increase • Start with letter, not # • Mnemonic strategy vs. Question Number • workhrs vs. Question1 (q001) • Mnemonic, one-time projects, with one person handling data • Question Numbers, repeated or large projects, multiple people handling the data
I use both • Meaningful Abbreviations • Less meaningful… question1 or q001 • More meaningful… q1reshall • Avoid generic, be specific • What can you expect to find for q1reshall? • Names of residence halls (Nye, Lincoln, White Pine…) • Codes for residence halls (1 = Nye, 2 = Lincoln…) • Lives in a residence hall (0 = no, 1 = yes)
Meaningfulness is tied to your coding!! • 1’s and 0’s 0 = no, does not have characteristic 1 = yes, has the characteristic • sex vs. female • what does a “0” mean? • what does a “1” mean? • race vs. white?
Coding Data • Categorical Data • Two Categories, use 0’s and 1’s, variable name should be your reference group • Three or more categories • Nominal - No meaningful numerical difference between categ. • “dummy code,” instead of one variable “race,” make five variables 0 = not Asian, 1 = Asian 0 = not Black, 1 = Black 0 = not Hispanic, 1 = Hispanic 0 = not Native American, 1 = Native American 0 = not White, 1 = White
Coding Data (cont.) • Categorical Data • Three or more categories with a meaningful, numerical difference between categories • Academic Level • 1 = Freshman • 2 = Sophomore • 3 = Junior • 4 = Senior • 5 = Second Degree • 6 = Masters Student • 7 = PhD or Professional Medical • 10 = 0 = 13 (years) • 20 = 16 = 14 • 30 = 33 = 15 • 40 = 50 = 16.4 • 50 = 66 = 17 • 60 = 83 = 18 • 70 = 100 = 22
Coding Data • Many types of coding you do when you create your survey • How committed are you to Nevada? • 1 = not committed at all … 7 = Extremely committed • Even if it is not perfect, enter that information in your spreadsheet • Then RECODE it into a NEW VARIABLE • Never throw information out • Always have a system to check your codes • Enter “Race” Then Recode (Dummy Code)
Entering Data • Enter it exactly • Recode anything that can be “quantified” into new variables • Missing Data • Leave it Blank • If you must, use an extreme number, something way out of range (-999)
Qualitative Data • Content Analysis (Implicit Quantification) • Identify themes, categories, patterns • Start Broad • Get multiple perspectives • Narrow it down to a manageable number of themes • Count
Enough Talk Let’s Do It