Chapter 6 Statistical Concepts

Chapter 6 Statistical Concepts Research Methods in Physical Activity

Statistics
• Statistics is simply an objective means of interpreting a collection of observations.
• Various statistical techniques are necessary to describe the characteristics of data, test relationships between sets of data, and test the differences among sets of data.
Types of Statistics
• Descriptive. Example: mean, a statistical measure of central tendency that is the average score of a group of scores.
• Associations. Example: Pearson product moment coefficient of correlation, the most commonly used method of computing correlation between two variables; also called interclass correlation, simple correlation, or Pearson r.
• Differences. Example: t test, a statistical technique to assess differences between two groups.

Selecting Research Samples
• The sample is the group of participants, treatments, and situations on which the study is conducted.
Random Selection
• The sample of participants might be randomly selected from a population (the larger group from which a sample is taken).
• Participants in a numbered pool can be selected from a list using a random numbers table: a table in which numbers are arranged in two-digit (or greater) sets so that any combination of rows or columns is unrelated (see Table 1 in the Appendix).
Stratified Random Sampling
• In stratified random sampling, the population is divided (stratified) on some characteristic before random selection of the sample.
• Example: selecting 200 students from a population of 10,000 made up of 30% freshmen, 30% sophomores, 20% juniors, and 20% seniors. Stratifying on class before random selection ensures that the sample matches the population exactly in class representation. Here, you would randomly select 60 students from the 3,000 freshmen, 60 from the 3,000 sophomores, 40 from the 2,000 juniors, and 40 from the 2,000 seniors. This procedure still yields a total sample of 200.
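
The stratified example above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure: the function name, the integer student IDs, and the fixed seed are all assumptions made for the example.

```python
import random

def stratified_sample(strata, total_n, seed=0):
    """Draw a proportionally allocated random sample from each stratum."""
    rng = random.Random(seed)  # fixed seed only so the example is repeatable
    population_size = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # Each stratum's share of the sample matches its share of the population.
        n_stratum = round(total_n * len(members) / population_size)
        sample[name] = rng.sample(members, n_stratum)
    return sample

# Hypothetical student IDs: 3,000 / 3,000 / 2,000 / 2,000 by class.
strata = {
    "freshmen": list(range(0, 3000)),
    "sophomores": list(range(3000, 6000)),
    "juniors": list(range(6000, 8000)),
    "seniors": list(range(8000, 10000)),
}
sample = stratified_sample(strata, total_n=200)
sizes = {name: len(ids) for name, ids in sample.items()}
print(sizes)
```

With the proportions from the slide, the allocation works out to whole numbers. For other proportions the rounded stratum sizes may not sum exactly to the requested total, so a real study would need a rule for handling the remainder.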

Selecting Research Samples
Systematic Sampling
• If the population from which the sample is to be selected is very large, assigning a numeric ID to each potential participant is time consuming. Systematic sampling avoids this step by selecting every nth entry from an existing list.
• Example: Suppose that you want to sample a town with a population of 50,000 concerning the need for new sport facilities. One approach would be to use systematic sampling from the telephone book. You might decide to call a sample of 500 people. To do so, you would select every 100th name in the phone book (50,000/500 = 100).
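
The phone book example can be sketched as follows. The list of names is hypothetical, and choosing a random starting point within the first interval is an assumption added for the sketch; the slide only specifies taking every 100th name.

```python
import random

def systematic_sample(population, sample_size, seed=0):
    """Select every k-th entry, where k = population size // sample size."""
    rng = random.Random(seed)
    interval = len(population) // sample_size   # 50,000 // 500 = 100
    start = rng.randrange(interval)             # assumed: random start in the first interval
    return population[start::interval][:sample_size]

# Hypothetical phone book with 50,000 entries.
phone_book = [f"resident_{i}" for i in range(50_000)]
sample = systematic_sample(phone_book, 500)
print(len(sample), sample[:3])
```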

Random Assignment
• In experimental research, groups are formed within the sample. The issue here is not how the sample is selected but how the groups are formed within it.
• All true experimental designs require that the groups within the sample be randomly assigned, or randomized (discussed in greater detail in Chapter 18).
• Random assignment allows the researcher to assume that the groups are equivalent at the beginning of the experiment, which is one of several important features of good experimental design intended to establish cause and effect.
Post Hoc Explanations
• Frequently, the sample for research is not randomly selected; rather, the researcher attempts a post hoc justification that the sample represents some larger group.
• A post hoc attempt at generalization may be better than nothing, but it is not the equivalent of random selection, which allows the assumption that the sample does not differ from the population on the characteristics measured.
• A good sample leads to a generalization statement: it is plausible that the findings apply to a broader population.
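
Random assignment, as distinct from random selection, can be illustrated with a short sketch. The participant labels, the two-group split, and the function name are hypothetical; the point is only that chance alone decides group membership.

```python
import random

def randomly_assign(participants, n_groups, seed=0):
    """Shuffle the sample, then deal participants into groups in turn."""
    rng = random.Random(seed)
    shuffled = list(participants)   # copy so the original order is untouched
    rng.shuffle(shuffled)
    return [shuffled[i::n_groups] for i in range(n_groups)]

# Hypothetical sample of 20 participants, however it was selected.
participants = [f"P{i:02d}" for i in range(1, 21)]
treatment, control = randomly_assign(participants, 2)
print(len(treatment), len(control))
```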

Unit of Analysis
• Unit of analysis: the concept, related to sampling and statistical analysis, that refers to what is considered the most basic unit from which data can be produced.
• Example: fitness categories of 1) high fitness, 2) moderate fitness, and 3) low fitness groups (see also the example in the book, pp. 104-105).
• Any of the following could be a unit of analysis in a study:
  • individuals
  • groups
  • artifacts (books, photos, newspapers)
  • geographical units (town, census tract, state)
  • social interactions (dyadic relations, divorces, arrests)
• The unit of analysis can significantly affect the sample size.

Measures of Central Tendency and Variability
• Central tendency (measure of): a single score that best represents all the scores.
• When you have a group of scores, one number may be used to represent the group. That number is generally the mean, median, or mode. These terms are ways of expressing central tendency.
• Within the group of scores, each individual score differs to some degree from the central tendency score. The degree of difference is the variability of the score.
• Thus, you can have variability within a group of scores (within-group variance), variability between groups (between-group variance), or both.
• Between-group variability and within-group variability are both components of the total variability in the combined distributions. Computing them partitions the total variability into its between and within components, so: between variability + within variability = total variability.
• Variability can be estimated with the standard deviation (see later slide).
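
One common way to make this partition concrete is with sums of squared deviations, as used in analysis of variance. Reading "variability" as sums of squares is an assumption here, since the slide does not name a formula, and the three groups of scores are hypothetical.

```python
from statistics import mean

def sums_of_squares(groups):
    """Return (between, within, total) sums of squared deviations."""
    all_scores = [x for group in groups for x in group]
    grand_mean = mean(all_scores)
    # Total: every score against the grand mean.
    ss_total = sum((x - grand_mean) ** 2 for x in all_scores)
    # Within: every score against its own group mean.
    ss_within = sum((x - mean(group)) ** 2 for group in groups for x in group)
    # Between: each group mean against the grand mean, weighted by group size.
    ss_between = sum(len(group) * (mean(group) - grand_mean) ** 2 for group in groups)
    return ss_between, ss_within, ss_total

# Hypothetical scores for three groups.
groups = [[4, 5, 6], [7, 8, 9], [10, 11, 12]]
ss_between, ss_within, ss_total = sums_of_squares(groups)
print(ss_between, ss_within, ss_total)
```

For these scores the between and within components add up to the total, which is the relationship stated on the slide.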

Mean, Median, and Mode
• Mean (M): the average score in a group of scores. M = ΣX/N (the sum of the scores divided by the number of scores).
• Median: a statistical measure of central tendency that is the middle score in a group. When the values are put in order, the median is the value in position (N + 1)/2; for an even N, it is the average of the two middle values. The median may be used when the mean is not the most representative or characteristic score, especially if the distribution contains a nonrepresentative (outlier) score.
• Mode: a statistical measure of central tendency that is the most frequently occurring score of the group. The mode is helpful when you are looking for the most frequent response or score in a distribution.
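
A small sketch with Python's standard `statistics` module shows how the three measures behave when one outlier is present. The scores are hypothetical and chosen so that a single extreme value (40) pulls the mean away from the bulk of the data.

```python
from statistics import mean, median, mode

# Hypothetical scores; 40 is a deliberate outlier.
scores = [3, 5, 5, 6, 7, 8, 40]

m = sum(scores) / len(scores)        # M = sum of X divided by N
assert abs(m - mean(scores)) < 1e-9  # same result as the library function

# Median by position: the (N + 1)/2-th value of the ordered scores (N is odd here).
ordered = sorted(scores)
position = (len(ordered) + 1) // 2
median_by_position = ordered[position - 1]

print(f"mean = {m:.2f}, median = {median(scores)}, mode = {mode(scores)}")
```

Here the median and mode stay near the typical scores while the mean is inflated by the outlier, which is the situation in which the median is usually preferred.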

Variability Scores
• Variability: the degree of difference between each individual score and the central tendency score.
• An estimate of the variability, or spread, of the scores can be calculated as the standard deviation (s). See Table 6.1, p. 106, for an example of how to calculate it; the table contains the scores, deviation scores, and squared deviation scores.
• The square of the standard deviation is called the variance, or s².
• The mean and standard deviation together are good descriptions of a set of scores. If the standard deviation is large, the mean may not be a good representation.
• When scores follow a normal distribution, roughly 68% of them fall within ±1s of the mean, about 95% within ±2s, and about 99.7% within ±3s.
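
The calculation can be sketched step by step: deviation scores, squared deviations, variance, then standard deviation. The scores below are hypothetical and are not the values in Table 6.1. Dividing by N - 1 is an assumption (the sample form); dividing by N would give the population form. The last lines check the approximate percentages within ±1s, ±2s, and ±3s for a normal curve.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

scores = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical scores

m = mean(scores)
deviations = [x - m for x in scores]             # deviation scores
squared_deviations = [d ** 2 for d in deviations]

# Sample variance: sum of squared deviations divided by N - 1 (assumed form).
variance = sum(squared_deviations) / (len(scores) - 1)
s = sqrt(variance)
assert abs(s - stdev(scores)) < 1e-9             # agrees with the library function

print(f"s^2 = {variance:.2f}, s = {s:.2f}")

# Proportion of a normal distribution within k standard deviations of the mean.
normal = NormalDist()
within = {k: normal.cdf(k) - normal.cdf(-k) for k in (1, 2, 3)}
for k, proportion in within.items():
    print(f"within +/-{k}s: {proportion:.1%}")
```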

Range of Scores
• Sometimes the range of scores (highest and lowest) may also be reported, particularly when the median is used rather than the mean. The range is the difference between the highest and lowest values (high score - low score).
Confidence Intervals (CI)
• CIs should be used because statistics vary in how well they represent target populations.
• A CI provides an expected upper and lower limit for a statistic at a specified probability level, usually either 95% or 99%.
• CIs are based on the fact that any statistic possesses sampling error. This error relates to how well the statistic represents the target population.
• When we compute a mean for a sample, we are making an estimate of the mean of the target population. Instead of a single point, a CI provides a band within which the population mean is likely to fall (review the example on pp. 107-108 of the text).
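
A rough sketch of both ideas follows. The scores are hypothetical, and the interval uses a normal (z) critical value as a large-sample approximation; that choice is an assumption made to keep the example within the standard library, and for a sample this small a t-based critical value would give a somewhat wider interval.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(scores, level=0.95):
    """Approximate CI for the mean: M +/- z * (s / sqrt(N))."""
    m = mean(scores)
    standard_error = stdev(scores) / sqrt(len(scores))
    z = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for a 95% level
    return m - z * standard_error, m + z * standard_error

scores = [52, 55, 58, 60, 61, 63, 64, 67, 70, 72]   # hypothetical scores

score_range = max(scores) - min(scores)   # high score - low score
ci_95 = mean_confidence_interval(scores, 0.95)
ci_99 = mean_confidence_interval(scores, 0.99)

print(f"range = {score_range}")
print(f"95% CI: {ci_95[0]:.1f} to {ci_95[1]:.1f}")
print(f"99% CI: {ci_99[0]:.1f} to {ci_99[1]:.1f}")
```

Asking for more confidence (99% rather than 95%) widens the band around the same sample mean.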

Frequency Distribution and the Stem-and-Leaf Display (see figure 6.2, p. 108)
• A common technique for summarizing data is to produce a picture (called a histogram) of the distribution of scores by means of a frequency distribution.
• Frequency distribution: a distribution of scores including the frequency with which they occur.
• If there is a wide range of values, a grouped frequency distribution is used, in which scores are grouped into small ranges called frequency intervals.
• Frequency intervals: small ranges of scores within a frequency distribution into which scores are grouped.
• One drawback of a grouped frequency distribution is that information is lost; that is, a reader does not know the exact score of each individual within a given interval. Stem-and-leaf displays are therefore helpful for understanding both the shape of the distribution and the exact scores.
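
A grouped frequency distribution can be sketched with a counter keyed by interval. The scores and the interval width of 10 are hypothetical choices for illustration.

```python
from collections import Counter

def grouped_frequencies(scores, width):
    """Count how many scores fall in each interval of the given width."""
    counts = Counter((score // width) * width for score in scores)
    # Label each interval by its inclusive lower and upper limits.
    return {(low, low + width - 1): counts[low] for low in sorted(counts)}

scores = [47, 52, 55, 58, 61, 63, 63, 67, 70, 72, 85]   # hypothetical scores
frequencies = grouped_frequencies(scores, width=10)
for (low, high), count in frequencies.items():
    print(f"{low}-{high}: {'*' * count}")   # crude text histogram
```

The output shows the shape of the distribution, but the individual scores inside each interval are no longer visible.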

Frequency Distribution and the Stem-and-Leaf Display
• Stem-and-leaf display: a method of organizing raw scores by which score intervals are shown on the left side of a vertical line and individual scores falling into each interval are shown on the right side.
• This representation is similar to a grouped frequency distribution, but no information is lost.
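
A stem-and-leaf display is easy to build by hand or in code. The sketch below assumes non-negative whole-number scores with the tens digit as the stem and the ones digit as the leaf; the scores themselves are hypothetical.

```python
from collections import defaultdict

def stem_and_leaf(scores):
    """Return display lines with stems (tens) on the left and leaves (ones) on the right."""
    stems = defaultdict(list)
    for score in sorted(scores):
        stem, leaf = divmod(score, 10)
        stems[stem].append(leaf)
    lines = []
    # Include every stem in the range so gaps in the distribution stay visible.
    for stem in range(min(stems), max(stems) + 1):
        leaves = "".join(str(leaf) for leaf in stems.get(stem, []))
        lines.append(f"{stem} | {leaves}".rstrip())
    return lines

scores = [47, 52, 55, 58, 61, 63, 63, 67, 70, 72, 85]   # hypothetical scores
lines = stem_and_leaf(scores)
print("\n".join(lines))
```

Every original score can be read back from the display (for example, stem 6 with leaves 1, 3, 3, 7 is 61, 63, 63, 67), which is what a grouped frequency distribution cannot offer.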

Categories of Statistical Tests
• The two general categories of statistical tests are parametric and nonparametric. Using the various tests in each category requires meeting the assumptions for those tests.
• The first category, parametric statistical tests, has three assumptions about the distribution of the data:
  • The population from which the sample is drawn is normally distributed on the variable of interest.
  • The samples drawn from a population have the same variances on the variable of interest.
  • The observations are independent.
• The second category, nonparametric statistics, is called distribution free because the previous assumptions need not be met.

Parametric Statistics
• Whenever the assumptions are met, parametric statistics are often said to have more power. Having more power increases the chances of rejecting a false null hypothesis.
• The normality assumption can be checked by using estimates of skewness and kurtosis. Normal distributions of scores will resemble the normal curve.
• Skewness: description of the direction of the hump of the curve of the data distribution and the nature of the tails of the curve.
• Kurtosis: description of the vertical characteristic of the curve showing the data distribution, such as whether the curve is more peaked or flatter than the normal curve.
• Normal curve: distribution of data in which the mean, median, and mode are at the same point (center of the distribution) and in which ±1s from the mean includes about 68% of the scores, ±2s includes about 95%, and ±3s includes about 99.7%.

Skewness and Kurtosis (figures 6.3-6.5, p. 110, text)
• Skewness of the distribution describes the direction of the hump of the curve (labeled A in figure 6.4) and the nature of the tails of the curve (labeled B and C).
• If the hump (A) is shifted to the left and the long tail (B) extends to the right (figure 6.4a), the skewness is positive.
• If the hump (A) is shifted to the right and the long tail (C) extends to the left (figure 6.4b), the skewness is negative.
• Kurtosis describes the vertical aspect of the curve, such as whether the curve is more or less peaked than the normal curve. Figure 6.5a shows a more peaked curve, and figure 6.5b shows a flatter curve.
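
Skewness and kurtosis can be estimated from the moments of the scores. The sketch below uses the simple moment-based formulas, which is one of several conventions and an assumption here (statistical packages often apply small-sample corrections). Kurtosis is reported as excess kurtosis, so a normal curve scores about 0, a more peaked curve is positive, and a flatter curve is negative. All scores are hypothetical.

```python
from statistics import mean

def skewness_and_kurtosis(scores):
    """Moment-based skewness and excess kurtosis (no small-sample correction)."""
    n = len(scores)
    m = mean(scores)
    m2 = sum((x - m) ** 2 for x in scores) / n   # second central moment
    m3 = sum((x - m) ** 3 for x in scores) / n   # third central moment
    m4 = sum((x - m) ** 4 for x in scores) / n   # fourth central moment
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3
    return skewness, excess_kurtosis

long_right_tail = [1, 2, 2, 3, 3, 3, 4, 10]      # hump left, tail right
long_left_tail = [11 - x for x in long_right_tail]  # mirror image
symmetric = [1, 2, 3, 4, 5]

for label, data in [("right tail", long_right_tail),
                    ("left tail", long_left_tail),
                    ("symmetric", symmetric)]:
    skew, kurt = skewness_and_kurtosis(data)
    print(f"{label}: skewness = {skew:+.2f}, excess kurtosis = {kurt:+.2f}")
```

The sign of the skewness follows the direction of the long tail, matching the description of positive and negative skew above.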

Looking Ahead
• What statistical techniques tell us:
  • Reliability (significance) of effect
  • Strength of the relationship (meaningfulness)
• Types of statistical techniques:
  • Relationships among groups (units)
  • Differences among groups (units)
