Definitions • Data facts used to draw conclusions Example: observations (such as measurements, genders, survey responses) that have been collected.
Definitions • Statistics the science of collecting, organizing, summarizing, & analyzing info. to draw conclusions (answers to questions).
Definitions • Population the complete collection of ALL elements (scores, people, measurements, etc.) to be studied.
Definitions • Census the collection of data from EVERY member of the population. • Sample a sub-collection of elements drawn from a population.
Key Concepts • Sample data must be collected in an appropriate way, such as through a process of random selection. • If sample data are not collected in an appropriate way, the data may be so completely useless that no amount of statistical torturing can salvage them.
Definitions • Parameter a numerical measurement describing some characteristic of a population (population parameter).
Definitions • Statistic a numerical measurement describing some characteristic of a sample (sample statistic).
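The difference between a parameter and a statistic can be illustrated with a small sketch. The population of 500 heights below is made up for illustration (an assumption, not data from these slides); the point is only that the parameter uses every member while the statistic uses a sample.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: heights (cm) of all 500 students in a school.
population = [random.gauss(170, 10) for _ in range(500)]

# Parameter: computed from EVERY member of the population (a census).
parameter_mean = statistics.mean(population)

# Statistic: computed from a sample drawn from that population.
sample = random.sample(population, 25)
statistic_mean = statistics.mean(sample)

print(f"population mean (parameter): {parameter_mean:.1f}")
print(f"sample mean (statistic):     {statistic_mean:.1f}")
```

The two printed values will usually be close but not identical, which is why a statistic is only an estimate of the parameter.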
Definitions • Quantitative data • numbers representing counts or measurements. • Example: weights, age, heights
Working with Quantitative Data Quantitative data can further be distinguished between discrete and continuous types.
Definitions • Discrete data that can be counted using whole numbers (not decimals or fractions). 0, 1, 2, 3, . . . Example: Number of teeth, number of eggs.
Definitions • Continuous (numerical) data that can be measured, or covers a range of values without gaps (includes decimal and fraction numbers). Example: height, weight, age
Definitions • Qualitative (also called categorical or attribute) data • can be separated into different categories that are distinguished by some nonnumeric characteristics. • Example: genders (male/female), favorite color (red, blue, green), favorite foods
Recap Basic definitions and terms describing data • Parameters versus statistics • Types of data (quantitative and qualitative)
Major Points • We collect sample data in order to make a prediction about an entire population. • We would not bother to collect sample data if we had the ability, time, and money to gather information from an entire population. • If sample data are not collected in an appropriate way, the data may be so completely useless that no amount of statistical torturing can salvage them.
Important Points • Statisticians make decisions based on data. Data production helps us answer specific questions with an experiment or an observational study. • Do you know the DIFFERENCE between an observational study and an experiment?
Definitions • Observational Study • observing and measuring specific characteristics without attempting to modify (change) the subjects being studied • Example: Survey a group of people
Definitions • Experiment • when we apply some treatment (do something to) the subjects and then observe its effects on them • Example: give plants different fertilizers to see which fertilizer works the best.
Definitions • Confounding • occurs in an experiment when the experimenter is not able to distinguish between the effects of different factors (treatments) • Example: Did the plants grow larger due to the fertilizer, or did they simply receive more water and sunshine than the other plants? A statistician’s job is to try to design an experiment so confounding does not occur!
Definitions • Data Set • contains information on a number of individuals. • Individuals may be people, animals, or things. • Variables describe some characteristic of an individual, such as a person’s height, gender, salary. • Some variables are categorical and some are quantitative.
Definitions • Distribution • In statistics, we often talk about (or describe) the distribution of the variables in our data. • We do this using graphs (bar graphs, histograms, box plots, etc.). • Graphs can be used to easily describe the mean, median, mode, range, and so forth of the data distribution.
Definitions • Probability • the chance (likelihood) of a • particular outcome. • Statistical inference • produces answers to specific questions, along with a statement about how confident we can be that the answer is correct.
Types of Studies • Cross Sectional Study Data are observed, measured, and collected at one point in time. • Retrospective (or Case Control) Study Data are collected from the past by going back in time. • Prospective (or Longitudinal or Cohort) Study Data are collected in the future from groups (called cohorts) sharing common factors.
Sampling Methods • When conducting studies (observational or experimental), sample data must be collected. • There are good and bad sampling methods depending on the type of data you need to collect.
Definitions • Random Sample • members of the population are selected in such a way that each individual member has an equal chance of being selected • Simple Random Sample (of size n) • subjects selected in such a way that every possible sample of the same size n has the same chance of being chosen
Random Sampling • selection so that each member has an equal chance of being selected Examples: Drawing a name out of a hat, GA lottery “numbered-ball air machine”, SRS Table in textbook, Random Integer Function on calculator.
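As a software stand-in for drawing names out of a hat, here is a minimal sketch of a simple random sample. The student names, the sample size, and the seed are assumptions made for illustration.

```python
import random

# Hypothetical population: 30 named students.
population = [f"student_{i:02d}" for i in range(1, 31)]

rng = random.Random(2024)  # seeded generator (assumption) so results repeat
n = 5

# rng.sample draws n distinct members; every subset of size n is equally likely.
srs = rng.sample(population, n)
print(srs)
```

Fixing the seed means everyone who runs the sketch gets the same sample, much as a shared random number table does.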
SRS and Random # Table • In order to have students get the same results in a SRS, questions ask students to use a random number table
Systematic Sampling Select some starting point and then select every kth element in the population. Example: Selecting every 3rd person in line, selecting those sitting on every other row.
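A minimal sketch of systematic sampling follows. The helper name `systematic_sample` and the line of 30 people are hypothetical; choosing the starting point at random is one common choice, not something the slide prescribes.

```python
import random

def systematic_sample(population, k, rng=random):
    """Pick a random start among the first k elements, then take every k-th element."""
    start = rng.randrange(k)
    return population[start::k]

line = list(range(1, 31))  # hypothetical: 30 people standing in line
print(systematic_sample(line, 3, random.Random(0)))
```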
Stratified Sampling subdivide the population into at least two different subgroups that share the same characteristics, then draw a random sample from each subgroup (or stratum) Examples: Divide the population into 2 groups (male, female) and select the same number from each group. Divide the population into 4 groups (9th, 10th, 11th, 12th grade) and select 10% (the same proportion) from each of the 4 groups.
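The grade-level example can be sketched as follows. The enrollment counts are invented for illustration; only the idea of drawing 10% at random from each stratum comes from the slide.

```python
import random

rng = random.Random(7)  # seed is an arbitrary assumption

# Hypothetical enrollment by grade (the strata).
strata = {
    "9th": [f"9-{i}" for i in range(120)],
    "10th": [f"10-{i}" for i in range(100)],
    "11th": [f"11-{i}" for i in range(90)],
    "12th": [f"12-{i}" for i in range(80)],
}

# Draw a simple random sample of 10% from each stratum.
stratified = {
    grade: rng.sample(members, len(members) // 10)
    for grade, members in strata.items()
}

for grade, chosen in stratified.items():
    print(grade, len(chosen))
```

Because every grade contributes the same proportion, no grade can be left out of the sample by chance.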
Cluster Sampling divide the population into sections (or clusters); randomly select some of those clusters; choose all members from selected clusters
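A minimal sketch of cluster sampling, assuming a made-up school with 8 homerooms of 5 students each. Note the contrast with stratified sampling: here whole clusters are chosen at random, and then everyone in a chosen cluster is included.

```python
import random

rng = random.Random(3)  # seed is an arbitrary assumption

# Hypothetical clusters: 8 homerooms of 5 students each.
clusters = {
    f"room_{r}": [f"room_{r}_student_{s}" for s in range(1, 6)]
    for r in range(1, 9)
}

# Step 1: randomly select some of the clusters.
chosen_rooms = rng.sample(sorted(clusters), 2)

# Step 2: take ALL members of each selected cluster.
cluster_sample = [s for room in chosen_rooms for s in clusters[room]]
print(chosen_rooms, len(cluster_sample))
```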
Convenience Sampling use results that are easy to get Example: Sample the first 10 people who enter a room. * Mall Surveys
Voluntary (Self-Selected) Sampling Example: Volunteer to answer a survey online, in a magazine, over the phone, etc.
Methods of Sampling • Random (SRS) • Systematic • Stratified • Cluster • Convenience • Voluntary The first four sampling methods are PROBABILITY SAMPLES, meaning the samples are chosen by chance. An SRS gives each member of the population an equal chance of being selected; this may not be true of more elaborate sampling methods. Convenience and voluntary sampling are BAD sampling methods because they are generally BIASED (meaning they systematically favor certain outcomes).
Methods of Sampling • Another sampling method with which you need to be familiar is multi-stage sampling. • As its name indicates, in multi-stage sampling you select successively smaller groups within a population in stages. • Each stage may employ a different sampling method.
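One way to picture multi-stage sampling is the three-stage sketch below. The counties, schools, and group sizes are all hypothetical; each stage here happens to use random selection, though a stage could use a different method.

```python
import random

rng = random.Random(11)  # seed is an arbitrary assumption

# Hypothetical population: 6 counties, each with 4 schools of 50 students.
population = {
    f"county_{c}": {
        f"school_{c}_{s}": [f"student_{c}_{s}_{i}" for i in range(50)]
        for s in range(4)
    }
    for c in range(6)
}

# Stage 1: randomly select 2 counties.
counties = rng.sample(sorted(population), 2)

# Stage 2: within each chosen county, randomly select 2 schools.
# Stage 3: within each chosen school, draw an SRS of 5 students.
multistage_sample = []
for county in counties:
    schools = rng.sample(sorted(population[county]), 2)
    for school in schools:
        multistage_sample.extend(rng.sample(population[county][school], 5))

print(len(multistage_sample))
```

Each stage narrows the group being sampled, so only a short list of students needs to be contacted even though the population is large.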
Definitions • Sampling Error (OK) the difference between a sample result and the true population result; such an error results from chance sample fluctuations. It can’t be helped: you can never predict an outcome with 100% certainty using a sample. • Nonsampling Error (BAD) sample data that are incorrectly collected, recorded, or analyzed (such as by selecting a biased sample, using a defective instrument, or copying the data incorrectly)
CAUTION: Sample Surveys Random selection eliminates bias in the choice of a sample. However, you still have to WATCH OUT FOR… • Undercoverage when some group(s) of the population are ‘left out’ Example: phone survey (those without phones are left out), which may mean the economically disadvantaged are underrepresented in the outcome. • Nonresponse when some individual(s) selected for the sample can’t be contacted or refuse to respond to the survey. Example: Phone survey – don’t answer the phone, or hang up without responding to the survey.
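Sampling error can be seen by drawing several random samples from the same population and comparing each sample mean with the true mean. The population of 1,000 test scores below is invented for illustration.

```python
import random
import statistics

rng = random.Random(5)  # seed is an arbitrary assumption

# Hypothetical population of 1,000 test scores.
population = [rng.randint(50, 100) for _ in range(1000)]
true_mean = statistics.mean(population)

# Repeated random samples give different results purely by chance.
for trial in range(3):
    sample = rng.sample(population, 30)
    error = statistics.mean(sample) - true_mean
    print(f"trial {trial}: sampling error = {error:+.2f}")
```

Nothing was done wrong in any trial; the errors differ only because different members happened to be chosen.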
CAUTION: Sample Surveys • Response Bias respondents may lie, especially if asked about illegal or unpopular behavior Example: * Do you smoke marijuana? * Have you ever cheated on a test?
Selecting a Method of Sampling • Regardless of the sampling method chosen, the GOAL should always be to select a sample and conduct a study in such a way as to NOT get BIASED (unfair, untrue) results.
Example 1 Describe how a university can conduct a survey regarding its campus safety. The registrar of the university has determined that the community of the university consists of 6,204 students in residence, 13,304 nonresident students, and 2,401 staff for a total of 21,909 individuals. The president has funds for only 1000 surveys to be given and then analyzed. How should she conduct the survey?
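One possible approach (not the only acceptable answer) is a stratified sample that splits the 1,000 surveys across the three groups in proportion to their size. The sketch below only does that arithmetic; the variable names are illustrative.

```python
# Group sizes taken from the example statement.
groups = {
    "resident students": 6204,
    "nonresident students": 13304,
    "staff": 2401,
}
total = sum(groups.values())  # 21,909 individuals
budget = 1000                 # surveys the president can fund

# Proportional allocation: each group's share of the budget matches
# its share of the population, rounded to a whole number of surveys.
allocation = {name: round(budget * size / total) for name, size in groups.items()}

for name, count in allocation.items():
    print(f"{name}: {count}")
```

This gives 283 resident students, 607 nonresident students, and 110 staff, which happens to total exactly 1,000; with other numbers, rounding may require adjusting one group by a survey or two. Within each group, the individuals would then be chosen by simple random sampling.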
Example 2 Sociologists want to gather data regarding the household income within Smyth County. They have come to the high schools for assistance. Describe a method which would disrupt the fewest classes and still gather the data needed.
Example 3 The manager of Ingles wants to measure the satisfaction of the store’s customers. Design a sampling technique that can be used to obtain a sample of 40 customers.
Example 4 The Independent Organization of Political Activity, IOPA, wants to conduct a survey focusing on the dissatisfaction with the current political parties. Several state-wide businesses have agreed to help. IOPA has come to you for advice. Describe a multi-stage survey strategy that will help them.
Summary and Homework • Summary • Experiments: can detect cause and effect • Observational Studies: suggest further work • Sampling Methods (Probabilistic): Simple Random Sample, Cluster Sample, Stratified Random Sample, Multi-stage Sample • Homework • Pages 333-4 & 341-3, problems 5.1-5, 5.7, 5.8, 5.10, 5.13, 5.14
Success in Statistics • Success in an introductory statistics course typically requires more common sense than mathematical expertise. • This section is designed to illustrate how common sense is used when we think critically about data and statistics.
Misuses of Statistics • Bad Samples • Voluntary Response or Convenience Sampling • These sampling methods are almost guaranteed NOT to represent the entire population. • For instance, most who volunteer to respond do so because they have a strong opinion about the research topic.
Misuses of Statistics • Bad Samples • Too Small a Sample • (larger sample sizes generally give more accurate results). • Misleading Graphs
To correctly interpret a graph, we should analyze the numerical information given in the graph instead of being misled by its general shape.
Misuses of Statistics • Bad Samples • Small Samples • Misleading Graphs • Pictographs
Double the length, width, and height of a cube, and the volume increases by a factor of eight. (Figure 1-2)
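The factor of eight comes from cubing the scale factor, since (2s)³ = 8s³. A quick check, using an arbitrary side length of 2 as the starting cube:

```python
def cube_volume(side):
    """Volume of a cube with the given side length."""
    return side ** 3

side = 2.0  # arbitrary starting side length (assumption)
ratio = cube_volume(2 * side) / cube_volume(side)
print(ratio)  # 8.0
```

This is why a pictograph that doubles every dimension of a 3-D object to show a doubled quantity exaggerates the change: the picture looks eight times as large.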