
Chapter 15 Quantifying Data

Presentation Transcript


  1. Chapter 15: Quantifying Data

  2. Coding • computers must be able to read the data you’ve collected in your research • that means using numbers • translate each response into a number by coding it

  3. General Guideline • code to maintain a great deal of detail • if the data are coded into relatively few, gross categories, there’s no way of going back to the original detail • code your data in more detail than you plan to use in the analysis

  4. Developing Code Categories • code categories should be both exhaustive and mutually exclusive • keep the scheme simple • make sure each category relates to the question so it is easy to comprehend • E.g., p. 401

  5. Codebook Construction • codebook is a document that describes the locations of variables and lists the assignments of codes to the attributes composing those variables

  6. Codebook • tells you where to find the variables and what the codes represent • every codebook should contain the full definition of each variable • in the case of a questionnaire, the definition would be the exact wording of the question asked • indicates the attributes composing each variable • E.g., p. 403
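A codebook of the kind described on slides 5 and 6 can be sketched as a small lookup structure. This is only an illustration: the variable names, column positions, question wordings, and code values below are hypothetical and do not come from the presentation.

```python
# Hypothetical codebook: each variable records its location (column),
# its full definition (the question wording), and its code assignments.
CODEBOOK = {
    "SEX": {
        "column": 1,
        "question": "What is your sex?",
        "codes": {1: "Male", 2: "Female", 9: "No answer"},
    },
    "SATISFY": {
        "column": 2,
        "question": "How satisfied are you with the services?",
        "codes": {
            4: "Very satisfied",
            3: "Satisfied",
            2: "Dissatisfied",
            1: "Very dissatisfied",
            9: "No answer",
        },
    },
}


def encode(variable, response):
    """Translate a written response into its numeric code."""
    codes = CODEBOOK[variable]["codes"]
    lookup = {label.lower(): code for code, label in codes.items()}
    return lookup[response.strip().lower()]


def decode(variable, code):
    """Translate a numeric code back into the attribute it stands for."""
    return CODEBOOK[variable]["codes"][code]


print(encode("SATISFY", "Very satisfied"))
print(decode("SEX", 2))
```

Keeping the question wording next to the codes means the same structure serves both data entry (encode) and interpretation of the output (decode).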

  7. Coding and Data Entry Options • transfer sheets – coding the data and transferring the code assignments to a transfer sheet or code sheet • edge-coding – the outside margin of each page of a questionnaire or other data source document is left blank or is marked with spaces corresponding to variable names or numbers

  8. Coding and Data Entry Options (continued) • direct data entry – enter data directly into the computer without using separate code sheets or even edge-coding • coding to optical scan sheets – e.g., Scantron

  9. Data Cleaning • the process of detecting and correcting coding errors • errors result from incorrect coding, incorrect reading of written codes, incorrect sensing of blackened marks, and so forth

  10. Possible-code cleaning • process of checking to see that only the codes assigned to particular attributes appear in the data files
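Possible-code cleaning can be sketched as a check of every recorded value against the set of codes the codebook allows. The variables, codes, and records below are hypothetical.

```python
# Codes the (hypothetical) codebook assigns to each variable.
VALID_CODES = {
    "SEX": {1, 2, 9},
    "SATISFY": {1, 2, 3, 4, 9},
}


def possible_code_errors(records, valid_codes):
    """Return (record index, variable, value) for every value that is
    not one of the codes assigned to that variable."""
    errors = []
    for index, record in enumerate(records):
        for variable, allowed in valid_codes.items():
            value = record.get(variable)
            if value not in allowed:
                errors.append((index, variable, value))
    return errors


records = [
    {"SEX": 1, "SATISFY": 4},
    {"SEX": 7, "SATISFY": 2},  # 7 is not an assigned code for SEX
    {"SEX": 2, "SATISFY": 5},  # 5 is not an assigned code for SATISFY
]

errors = possible_code_errors(records, VALID_CODES)
print(errors)
```

Each flagged case would then be traced back to the original questionnaire to find the correct code.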

  11. Contingency cleaning • concerns checking that only those cases that should have data entered for a particular variable do in fact have such data
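Contingency cleaning can be sketched as a check between a filter question and a follow-up question. The example assumes a hypothetical questionnaire in which HOURS (hours worked) should be recorded only for respondents coded 1 on EMPLOYED. Flagging eligible cases with no data is an added convenience beyond the slide's definition.

```python
def contingency_errors(records, filter_var, filter_code, dependent_var):
    """Flag cases whose data on dependent_var is inconsistent with
    their answer to the filter question."""
    errors = []
    for index, record in enumerate(records):
        eligible = record.get(filter_var) == filter_code
        has_data = record.get(dependent_var) is not None
        if has_data and not eligible:
            # Data entered for a case that should have none.
            errors.append((index, "unexpected data"))
        elif eligible and not has_data:
            # Extra check (assumption): eligible case with nothing entered.
            errors.append((index, "missing data"))
    return errors


records = [
    {"EMPLOYED": 1, "HOURS": 40},
    {"EMPLOYED": 2, "HOURS": None},
    {"EMPLOYED": 2, "HOURS": 35},    # not employed, yet hours were entered
    {"EMPLOYED": 1, "HOURS": None},  # employed, but no hours entered
]

errors = contingency_errors(records, "EMPLOYED", 1, "HOURS")
print(errors)
```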

  12. Statistics: Descriptive and Inferential

  13. Data is numerical

  14. Statistics • A set of mathematical techniques used by social scientists to organize and manipulate data for the purpose of answering questions and testing theories

  15. Variable • Any trait that can change values from case to case

  16. Independent Variable • Dependent Variable

  17. Descriptive Statistics • When the researcher needs to summarize or describe the distribution of a single variable • When the researcher wishes to understand the relationship between two or more variables

  18. Data Reduction • Process of allowing a few numbers to summarize many numbers • Is the basic goal of single-variable (or univariate) descriptive statistical procedures

  19. Measures of Association • Help us understand the relationship between two or more variables • Allow us to quantify the strength and direction of a relationship

  20. Inferential Statistics • Wish to generalize findings from a sample to a population • Involves using information from samples (carefully chosen subsets of the defined populations) to make inferences about populations

  21. Level of Measurement

  22. Nominal • Classification into categories is the only measurement procedure permitted • Not numerical • Compared to each other only in terms of the number of cases classified in them

  23. Nominal • Categories not higher or lower along some numerical scale • Mutually exclusive and exhaustive • Categories are relatively homogeneous

  24. Ordinal • Classify into categories • Allow categories to be ranked • High to low, more or less than another • Eg. SES: upper class, middle class, working class, lower class

  25. Ordinal • Represents only a position • The distance between the scores cannot be described in precise terms • Averages cannot be used

  26. Interval-Ratio (I/R) • They are measured in units that have equal intervals • Eg. age, # of siblings • Have a true zero • Zero indicates the absence or complete lack of whatever is being measured • Eg. education, income

  27. Level of measurement is the first guideline to use in selecting a statistic

  28. Exercise: Levels of Measurement • a. What is your occupation? • b. How many years of school have you completed? • c. If you were asked to use one of these four names for your social class, which would you say you belonged in? (upper, middle, working, lower) • d. What is your age?

  29. Exercise: Levels of Measurement • e. In what country were you born? • f. What is your grade-point average? • g. What is your major? • h. The only way to deal with the drug problem is to legalize all drugs. (Strongly Agree, Agree, Undecided, Disagree, Strongly Disagree) • i. What is your astrological sign? • j. How many brothers and sisters do you have?

  30. Basic Descriptive Statistics: Percentages, Ratios and Rates, Tables, Charts, and Graphs

  31. Descriptive Stats • present research results clearly and concisely • data reduction: using a few numbers, a table, or a graphic device to summarize or stand for a larger array of data

  32. Percentages and Proportions • Equations: Proportion: $p = \frac{f}{N}$ • Percentage: $\% = \left(\frac{f}{N}\right) \times 100$ • $f$ = frequency, the # of cases in any category • $N$ = number of cases in all categories

  33. Disposition of 269 criminal cases
     ___________________________________________________
     Sentence         Frequency   Proportion   Percentage
                      (f)         (p)          (%)
     ___________________________________________________
     5 yrs or more    167         .6208        62.08
     < 5 yrs           72         .2677        26.77
     Suspended         20         .0744         7.44
     Acquitted         10         .0372         3.72
     ___________________________________________________
     N =              269         1.0001       100.01
     (the totals exceed 1 and 100 because of rounding)

  34. Look at the 1st category: 167 cases in the category ($f = 167$) • Total number of cases in all categories: $N = 269$ • $\% = \left(\frac{f}{N}\right) \times 100 = \left(\frac{167}{269}\right) \times 100 = (.6208) \times 100 = 62.08\%$
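The calculation on slide 34 can be sketched in a few lines. The function names are illustrative. The frequencies are those of the criminal-cases table on slide 33.

```python
def proportion(f, n):
    """p = f / N"""
    return f / n


def percentage(f, n):
    """% = (f / N) x 100"""
    return proportion(f, n) * 100


# Frequencies from the "Disposition of 269 criminal cases" table.
sentences = {"5 yrs or more": 167, "< 5 yrs": 72, "Suspended": 20, "Acquitted": 10}
n = sum(sentences.values())  # N, the number of cases in all categories

f = sentences["5 yrs or more"]
print(n, round(proportion(f, n), 4), round(percentage(f, n), 2))
```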

  35. Guidelines for percentages and proportions • when dealing with a small number of cases (roughly 20 or fewer), it is better to report frequencies • always report the number of observations along with proportions and percentages • can be used for any level of measurement

  36. Ratios • ratios useful for comparing categories in terms of relative frequency • dividing the frequency of one category by the frequency of another

  37. Ratio equation: $\text{Ratio} = \frac{f_1}{f_2}$ • $f_1$ = # of cases in the 1st category • $f_2$ = # of cases in the 2nd category

  38. Example: Comparing Catholic to Protestant Families • Protestant $N = 147$ • Catholic $N = 100$ • $\text{Ratio} = \frac{147}{100} = 1.47$ • The ratio is 1.47:1, or 147:100 • Therefore, for every 100 Catholic families, there are 147 Protestant families
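The ratio example can be reproduced with a one-line function (a minimal sketch; the function name is illustrative):

```python
def ratio(f1, f2):
    """Frequency of the 1st category divided by that of the 2nd."""
    return f1 / f2


protestant, catholic = 147, 100
r = ratio(protestant, catholic)
print(f"{r:.2f}:1, or {protestant}:{catholic}")
```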

  39. Rates • the number of actual occurrences of some phenomenon divided by the number of possible occurrences per some unit of time • usually multiplied by some power of 10 to get rid of the decimal point

  40. Example: Crude death rate (CDR) • the number of deaths in a population (actual occurrences) divided by the # of people in the population (possible occurrences) per year, then multiplied by 1000 • $\text{CDR} = \frac{\text{number of deaths in year}}{\text{total population}} \times 1000$

  41. Example • If there are 100 deaths during a given year in a town of 7000, what is the CDR? • $\text{CDR} = \frac{100}{7000} \times 1000 = (.01429) \times 1000 = 14.29$ • Therefore, for every 1000 people, there are 14.29 deaths • Rates are especially useful in making comparisons between different groups and/or different times
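The crude death rate example can be sketched as follows. The function and parameter names are illustrative, and the multiplier defaults to 1000 as on the slide.

```python
def rate(actual, possible, per=1000):
    """Actual occurrences divided by possible occurrences,
    scaled by a power of 10 (per 1000 by default)."""
    return actual / possible * per


cdr = rate(actual=100, possible=7000)  # 100 deaths in a town of 7000
print(round(cdr, 2))
```

Because a rate is expressed per fixed number of people, the same function can be used to compare towns of different sizes or the same town in different years.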

  42. Frequency Distributions • tables that summarize the distribution of a variable by reporting the # of cases contained in each category of the variable • the categories must be exhaustive and mutually exclusive

  43. Nominal • straightforward
     Table 1 Sex of Respondents
     ___________________________________________
     Sex       Tallies        Frequency (f)
     ___________________________________________
     Male      IIIII IIIII    10
     Female    IIIII IIIII    10
     ___________________________________________
               N =            20
     N = total number of cases in sample
     Specifics about a table: • descriptive title • clearly labeled categories (male and female) • total # of cases at the bottom of the frequency column (always include)

  44. Ordinal • same as nominal
     Table 2 Satisfaction with Services
     ___________________________________________
     Satisfaction             Frequency   Percentage
     ___________________________________________
     (4) Very satisfied        4           20
     (3) Satisfied             9           45
     (2) Dissatisfied          4           20
     (1) Very dissatisfied     3           15
     ___________________________________________
     N =                      20          100
     N = total number of cases
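A frequency distribution like Table 2 can be built from coded responses with a counter. This is a sketch under the assumption that the 20 raw responses are stored as the numeric codes shown in the table. The list below simply recreates the table's frequencies.

```python
from collections import Counter

LABELS = {4: "Very satisfied", 3: "Satisfied", 2: "Dissatisfied", 1: "Very dissatisfied"}

# 20 coded responses matching the frequencies in Table 2.
responses = [4] * 4 + [3] * 9 + [2] * 4 + [1] * 3


def frequency_table(codes):
    """Return (code, frequency, percentage) rows, highest code first."""
    counts = Counter(codes)
    n = len(codes)
    return [(code, counts[code], 100 * counts[code] / n)
            for code in sorted(counts, reverse=True)]


table = frequency_table(responses)
for code, f, pct in table:
    print(f"({code}) {LABELS[code]:<18} {f:>3} {pct:>6.1f}")
print(f"N = {len(responses)}")  # always report the number of cases
```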

  45. Interval/Ratio Basic Considerations • frequency tables are more complex to build • large number of possible scores • a large number of scores requires some collapsing or grouping of categories to produce reasonably compact frequency distributions • need to decide how many categories to use and how wide these categories should be

  46. Table 3 Age of Respondents
     ____________________________
     Age       Frequency
     ____________________________
     18         5
     19         6
     20         3
     21         2
     22         1
     23         1
     24         1
     25         0
     26         1
     ____________________________
     N =       20

  47. Table 4 Age of Respondents
     __________________________________________
     Age       Frequency   Percentage
     __________________________________________
     18-19     11           55
     20-21      5           25
     22-23      2           10
     24-25      1            5
     26-27      1            5
     __________________________________________
     N =       20          100
     A decimal point can be used for precision • No overlapping of intervals is allowed
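Collapsing the ungrouped ages of Table 3 into the intervals of Table 4 can be sketched as below. The function assumes whole-number scores and equal-width, non-overlapping intervals. The starting point and width are choices the analyst makes, as slide 45 notes.

```python
# Ages recreated from the frequencies in Table 3.
ages = [18] * 5 + [19] * 6 + [20] * 3 + [21] * 2 + [22, 23, 24, 26]


def group_scores(scores, start, width):
    """Count whole-number scores in equal-width, non-overlapping intervals."""
    grouped = {}
    for lower in range(start, max(scores) + 1, width):
        upper = lower + width - 1
        grouped[f"{lower}-{upper}"] = sum(lower <= s <= upper for s in scores)
    return grouped


grouped = group_scores(ages, start=18, width=2)
n = len(ages)
for interval, f in grouped.items():
    print(f"{interval:<8} {f:>3} {100 * f / n:>5.0f}")
print(f"N = {n}")
```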

  48. Cumulative Frequency • for each interval, all cases in that interval plus all cases in the preceding intervals (e.g., for the 3rd interval, its own cases plus those in the 1st and 2nd intervals) • Cumulative Percentage • same addition pattern as cumulative frequency • Both columns show how the cases are spread across the range of scores

  49. Example
     ___________________________________________
     Variable   (f)   Cum (f)   %     Cum %
     ___________________________________________
     18-19      11    11        55     55
     20-21       5    16        25     80
     22-23       2    18        10     90
     24-25       1    19         5     95
     26-27       1    20         5    100
     ___________________________________________
     N =        20             100
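The cumulative columns in the example can be computed as a running sum. This is a minimal sketch using the grouped age frequencies from the table, with illustrative variable names.

```python
from itertools import accumulate

intervals = ["18-19", "20-21", "22-23", "24-25", "26-27"]
freqs = [11, 5, 2, 1, 1]
n = sum(freqs)

# Running total: each entry adds the current interval to all earlier ones.
cum_f = list(accumulate(freqs))
cum_pct = [100 * c / n for c in cum_f]

for interval, f, cf, cp in zip(intervals, freqs, cum_f, cum_pct):
    print(f"{interval:<8} {f:>3} {cf:>4} {100 * f / n:>5.0f} {cp:>5.0f}")
print(f"N = {n}")
```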

  50. Charts and Graphs
