Elementary Statistics Fall 2008
About Me… • Where I’m from:
About Me… • My “kids”…
About Me… • My personality…
Webpage • http://web.missouri.edu/~dls6w4 • Syllabus • Calendar • Practice Materials • Homework • Exam Information
Blackboard • Make sure you have access to Blackboard • You must either: • Activate your stlcc email account, or • Update Blackboard to use a different email • Otherwise, you will not receive emails • You are still responsible for all emails sent, regardless of receipt • When/if you send me an email, please put “Stats Night” in the subject line • If you do not, I won’t answer it
Homework • “Long and painful” • Absences will not excuse you from completing homework • All will be posted on the webpage • You’ll need to have a strong understanding of the material • Group work… • I will take your top 5 scores • I do not know how many assignments we will have
Exams • 4 exams • Final is cumulative • I will drop your lowest exam score of the first three • The final exam counts • You will be allowed a notecard for formulas and a non-programmable calculator
Project • Paper, no minimum page requirement • Do something that interests you • Check webpage for details/deadlines • Failure to complete the paper as required will result in the loss of an additional letter grade
Attendance • Attendance includes being present, but it also includes: • Not disrupting class • Being attentive • Not excessively talking • Not doing anything I deem “annoying” • This will cost you attendance credit • If you come in after roll call, it is your job to notify me in person that day
Point Breakdown • Exams: 60% • Three Midterm exams: 100 points each • Final Exam (cumulative): 100 points • Homework: 30% • Each homework is worth 50 points • I’ll count the top 5 • Project: 10% • Attendance: Loss of 3%
Exam I Material Introductory Material
Some Basics • Descriptive Statistics • Allow us to get a sense of things • Inferential Tools • Allow us to reach some conclusion • Estimation, Hypothesis Testing
Where does data come from? • Experiments • Process generating outcomes • Design is important • Surveys • Closed-end Questions • Open-end Questions • Demographics • Interviews/Observation
Stop and Think • What kinds of things can go wrong with surveys?
What can go wrong? • Potential Problems • Interviewer Bias • Non-response Bias • Selection Bias • Observer Bias • Measurement Error • Validity • Internal – Eliminating useless info • External – Results apply beyond the original study
Key Terms • Population • All possible observations • Sample • A portion of the population • Is the error from using a sample worth its lower cost compared with measuring the whole population?
Sampling Techniques • Statistical Sampling – Based on chance • Nonstatistical Sampling – Not based on chance • Simple Random Sampling – All possible samples equally likely • Stratified Random Sampling – Divide into levels (strata), sample from each • Systematic Random Sampling – Every kth item • Cluster Sampling – Break into groups, randomly select groups
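The chance-based techniques on this slide can be sketched in a few lines of Python. This is only an illustration using the standard library; the function names, the `seed` parameter, and the 20-person roster are assumptions made for the sketch, not course material.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Every possible sample of size n is equally likely."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def systematic_sample(population, k, seed=None):
    """Pick a random start among the first k items, then take every kth item."""
    rng = random.Random(seed)
    start = rng.randrange(k)
    return population[start::k]

def stratified_sample(strata, n_per_stratum, seed=None):
    """Draw a simple random sample from each level (stratum)."""
    rng = random.Random(seed)
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose whole groups and keep every member of the chosen groups."""
    rng = random.Random(seed)
    chosen = rng.sample(list(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]

roster = list(range(1, 21))  # hypothetical population: 20 student IDs
print(simple_random_sample(roster, 5, seed=1))
print(systematic_sample(roster, 4, seed=1))
```

Passing a `seed` makes a draw repeatable, which helps when checking homework answers; leave it out for a fresh random draw.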
Types of Data • Quantitative v. Qualitative • Quantitative – Numerical • Qualitative – Categorical • Time-series v. Cross-Section • Time-series – one value, many times • Cross-section – many values, one time
What level are the data? • Nominal – Simplest form, no rank implied • Ordinal – Rank data • Interval – Difference measure, no true zero • Ratio – Consistent, true zero
Describing Data • Frequency Distribution • Reports how often values occur • Classifies observations by class • Relative Frequency • How often one value occurs compared to sample • Usually expressed in percentage • RF = f_i / n
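As a small illustration of RF = frequency / n, the sketch below builds a frequency and relative frequency table. The function name `frequency_table` and the letter-grade data are made up for the example.

```python
from collections import Counter

def frequency_table(values):
    """Map each distinct value to (frequency, relative frequency)."""
    counts = Counter(values)
    n = len(values)
    return {value: (f, f / n) for value, f in sorted(counts.items())}

grades = ["B", "A", "C", "B", "B", "A", "D", "C", "B", "A"]  # hypothetical data
table = frequency_table(grades)
for value, (f, rf) in table.items():
    print(f"{value}: f = {f}, RF = {rf:.0%}")
```

The relative frequencies always add up to 1 (100%), which is a quick check on the table.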
Describing Data • Grouped Frequency Distribution • Classifies data into groups • Groups must be: • Mutually Exclusive • All-Inclusive • Equal-Width • Free of empty classes (if possible)
Describing Data • Grouped Frequency Distribution • How to determine groups • Determine number of groups (2^k ≥ n) • Establish width of classes • Determine boundaries for classes • Count values in each class • Both types can be built into a histogram • Also can construct Cumulative Frequency Distribution and build an ogive
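The grouping steps on this slide can be sketched as follows. The sketch assumes non-empty numeric data, starts the first class at the minimum, rounds the class width up to a whole number, and uses half-open classes (lower boundary included, upper excluded); those choices and the sample ages are assumptions, and other texts set boundaries differently.

```python
import math
from itertools import accumulate

def grouped_frequency(data):
    """Return a list of (lower, upper, count) classes of equal width."""
    n = len(data)
    k = 1
    while 2 ** k < n:          # smallest k with 2^k >= n
        k += 1
    low, high = min(data), max(data)
    width = math.ceil((high - low) / k)
    if low + k * width <= high:
        width += 1             # make sure the maximum lands inside the last class
    classes = []
    for i in range(k):
        lower = low + i * width
        upper = lower + width
        count = sum(1 for x in data if lower <= x < upper)
        classes.append((lower, upper, count))
    return classes

ages = [12, 15, 21, 22, 25, 27, 30, 31, 33, 38, 41, 45]  # hypothetical data
classes = grouped_frequency(ages)
cumulative = list(accumulate(count for _, _, count in classes))
for (lower, upper, count), running in zip(classes, cumulative):
    print(f"{lower} to under {upper}: {count} (cumulative {running})")
```

The class counts are the bar heights for a histogram, and the running totals are the points plotted on an ogive.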
Describing Data • Other methods • Bar Chart • Pie Chart • Stem-and-Leaf Diagram • Line Chart (Time graph) • Scatter Plot • Can see relationship between X and Y • Demand/Supply curves (Economics)
Describing Data • May want to examine two variables • Use Joint Frequency Distribution • How? • Get data containing two responses • Build table • Find joint occurrences • Sum rows and columns for marginal frequencies
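A joint frequency table can be tallied directly from paired responses. In this sketch the function name `joint_frequency` and the survey data are hypothetical.

```python
from collections import Counter

def joint_frequency(pairs):
    """Count joint occurrences plus marginal (row and column) totals."""
    joint = Counter(pairs)
    row_totals = Counter(first for first, _ in pairs)
    col_totals = Counter(second for _, second in pairs)
    return joint, row_totals, col_totals

# Hypothetical survey: (class standing, owns a calculator?)
responses = [
    ("Freshman", "Yes"), ("Freshman", "No"), ("Sophomore", "Yes"),
    ("Freshman", "Yes"), ("Sophomore", "No"), ("Sophomore", "Yes"),
    ("Sophomore", "Yes"),
]
joint, rows, cols = joint_frequency(responses)
for r in sorted(rows):
    cells = "  ".join(f"{c}: {joint[(r, c)]}" for c in sorted(cols))
    print(f"{r:<10} {cells}  | row total {rows[r]}")
print("column totals:", dict(sorted(cols.items())))
```

The row totals and the column totals each sum to the number of responses, which is a useful check when building the table by hand.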
Numerical Measures • We’ve done some simple measures • Now let’s actually do some calculations • Before we start: • Parameter – based on population • Statistic – based on sample
Center and Location • Population Mean (μ) • A.k.a. average • For population, sum of deviations=0 • Sample Mean (x-bar) • Based on a selected sample • All means subject to distortion by extrema
Center and Location • Median • Middle value of the sorted data • Odd sample size = take the middle value • Even sample size = average the middle two values
Center and Location • Taken together, the mean and median show skewness of data • Median>Mean = Left Skewed • Median<Mean = Right Skewed
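To see the mean, the median, and the skewness rule working together, here is a short sketch using Python's `statistics` module. The function name `skew_direction` and the income figures are assumptions for illustration.

```python
from statistics import mean, median

def skew_direction(data):
    """Apply the slide's rule: compare the median with the mean."""
    m, med = mean(data), median(data)
    if med > m:
        return "left skewed"
    if med < m:
        return "right skewed"
    return "no skew indicated"

incomes = [30, 32, 35, 38, 40, 200]  # hypothetical, in thousands of dollars
print("mean:", mean(incomes))
print("median:", median(incomes))
print(skew_direction(incomes))

# Deviations from the mean cancel out, so their sum is zero.
print(sum(x - mean(incomes) for x in incomes))
```

The single large income pulls the mean well above the median, which is the distortion by extreme values mentioned earlier.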
Center and Location • Mode • Value occurring most often • Occasionally, a set of data has no mode
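A brief sketch of finding the mode follows. It assumes non-empty data and adopts the convention that data in which every value appears exactly once has no mode; that convention and the function name `modes` are assumptions of this example.

```python
from collections import Counter

def modes(data):
    """Return the most frequent value(s), or [] when no value repeats."""
    counts = Counter(data)
    top = max(counts.values())
    if top == 1:
        return []  # every value appears once: treated here as "no mode"
    return sorted(value for value, c in counts.items() if c == top)

print(modes([1, 2, 2, 3]))
print(modes([1, 1, 2, 2, 3]))
print(modes([1, 2, 3]))
```

Returning a list allows for data with more than one mode, such as the second example.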
Center and Location • Weighted Mean • Same idea as mean, just unequal weights on observations • Percentiles • Describe where a particular value is located in the data • i = (p/100)*(n) • If i is an integer – average the values in positions i and i + 1 • If i is not an integer – round up and use the value in that position • Quartiles • Dividing the data into four equal parts • “Qua” implies four (quarter, quart, etc.)
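The weighted mean and the percentile location rule can be sketched as below. The sketch assumes `p` is a whole number strictly between 0 and 100 and that positions are counted from 1 in the sorted data; the function names and the example numbers are hypothetical.

```python
def weighted_mean(values, weights):
    """Sum of weight * value divided by the sum of the weights."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

def percentile(data, p):
    """Locate the pth percentile using i = (p/100) * n."""
    if not 0 < p < 100:
        raise ValueError("p must be strictly between 0 and 100")
    ordered = sorted(data)
    i, remainder = divmod(p * len(ordered), 100)
    if remainder == 0:
        # i is an integer: average positions i and i + 1 (1-based)
        return (ordered[i - 1] + ordered[i]) / 2
    # i is not an integer: rounding up gives 1-based position i + 1,
    # which is index i when counting from 0
    return ordered[i]

scores = [2, 4, 6, 8, 10, 12, 14, 16]  # hypothetical data, n = 8
print("Q1:", percentile(scores, 25))
print("Median:", percentile(scores, 50))
print("Q3:", percentile(scores, 75))
print("Weighted mean:", weighted_mean([90, 80, 70], [60, 30, 10]))
```

The quartiles are just the 25th, 50th, and 75th percentiles, so one function covers both ideas.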
Be careful! • These are not always useful for qualitative data masquerading as quantitative • Need further assumptions/theory to hold
Measures of Variation • Variation – The “spread” of the data • Range = Maximum – minimum • Sensitive to extrema • Considered weak • Interquartile Range = Third Q – First Q • Softens dependence on extrema
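The sketch below compares the range and the interquartile range on the same data with and without one extreme value. It reuses the percentile location rule from the Center and Location slide; the helper names and the data are assumptions for illustration.

```python
def value_at_percentile(data, p):
    """Percentile via i = (p/100) * n, for whole-number p between 0 and 100."""
    ordered = sorted(data)
    i, remainder = divmod(p * len(ordered), 100)
    if remainder == 0:
        return (ordered[i - 1] + ordered[i]) / 2
    return ordered[i]

def data_range(data):
    """Maximum minus minimum."""
    return max(data) - min(data)

def interquartile_range(data):
    """Third quartile minus first quartile."""
    return value_at_percentile(data, 75) - value_at_percentile(data, 25)

scores = [2, 4, 6, 8, 10, 12, 14, 16]  # hypothetical data
with_outlier = scores + [100]          # same data plus one extreme value

for label, data in [("original", scores), ("with outlier", with_outlier)]:
    print(label, "range:", data_range(data), "IQR:", interquartile_range(data))
```

In this example the extreme value changes the range a great deal while the interquartile range stays the same, which is what "softens dependence on extrema" means.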
Measures of Variation • Variance (σ²) • Measure of dispersion or spread • Equation… • Shortcut… • Standard Deviation (σ) • σ = √(σ²) • Sample (s², s) and population measures are calculated in similar fashion • For the sample, use n – 1 instead of N in the denominator
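The slide leaves the variance equation and shortcut to be filled in during class, so the sketch below uses the standard textbook definitions as an assumption: the average squared deviation from the mean, and the shortcut form (mean of the squares minus the square of the mean). The function names and data are hypothetical.

```python
import math

def population_variance(data):
    """Average squared deviation from the mean (divide by N)."""
    mu = sum(data) / len(data)
    return sum((x - mu) ** 2 for x in data) / len(data)

def population_variance_shortcut(data):
    """Common shortcut (assumed here): mean of squares minus squared mean."""
    n = len(data)
    mu = sum(data) / n
    return sum(x * x for x in data) / n - mu ** 2

def sample_variance(data):
    """Same idea, but divide by n - 1."""
    xbar = sum(data) / len(data)
    return sum((x - xbar) ** 2 for x in data) / (len(data) - 1)

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data
print("population variance:", population_variance(data))
print("shortcut:", population_variance_shortcut(data))
print("population std dev:", math.sqrt(population_variance(data)))
print("sample variance:", sample_variance(data))
print("sample std dev:", math.sqrt(sample_variance(data)))
```

Dividing by n – 1 makes the sample variance slightly larger than the population formula applied to the same numbers.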
Combining μ and σ • Coefficient of Variation (CV) • Relative variation with different means • (σ/μ)*(100%) for population • Replace with sample measures for sample CV • Empirical Rule (with bell-shape) • 68% within μ ± σ • 95% within μ ± 2σ • “All” within μ ± 3σ
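A small sketch of the coefficient of variation and the Empirical Rule intervals follows. The function names and the numbers (two made-up stocks and a bell-shaped test score with mean 100 and standard deviation 15) are assumptions for illustration.

```python
def coefficient_of_variation(mean, sd):
    """Standard deviation as a percentage of the mean."""
    return sd * 100 / mean

def empirical_rule_intervals(mu, sigma):
    """Intervals mu ± k*sigma for k = 1, 2, 3 (bell-shaped data only)."""
    return {k: (mu - k * sigma, mu + k * sigma) for k in (1, 2, 3)}

# Hypothetical comparison: which stock varies more relative to its mean?
print("Stock A CV:", coefficient_of_variation(50, 5), "%")
print("Stock B CV:", coefficient_of_variation(200, 10), "%")

# Roughly 68%, 95%, and nearly all values fall in these three intervals.
for k, (low, high) in empirical_rule_intervals(100, 15).items():
    print(f"within {k} standard deviation(s): {low} to {high}")
```

Stock B has the larger standard deviation but the smaller coefficient of variation, which is why the CV is the better tool when the means differ.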
Standardizing Values • Allows us to compare different data effectively • Z-value (population) = (x – μ)/σ • x is the value of interest • Based on a standard normal distribution • Mean = 0, Variance = 1 • This will be important from now until the end
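The z-value formula can be tried out with the short sketch below. The function names and the exam numbers are hypothetical; `pstdev` is the population standard deviation from Python's `statistics` module.

```python
from statistics import mean, pstdev

def z_value(x, mu, sigma):
    """Number of standard deviations x lies from the mean."""
    return (x - mu) / sigma

def standardize(data):
    """Convert every value in a population to its z-value."""
    mu, sigma = mean(data), pstdev(data)
    return [z_value(x, mu, sigma) for x in data]

# Hypothetical: an 85 on an exam with mean 75 and standard deviation 5,
# versus a 90 on an exam with mean 80 and standard deviation 10.
print(z_value(85, 75, 5))
print(z_value(90, 80, 10))

zs = standardize([2, 4, 4, 4, 5, 5, 7, 9])
print(mean(zs), pstdev(zs))
```

The 85 is the stronger result relative to its class even though 90 is the higher raw score. Standardizing a whole data set always gives values with mean 0 and standard deviation 1.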
Probability • The chance that something will happen • Sample Space – all possible outcomes • Event – one or more outcomes from the sample space • Mutually Exclusive • Independence v. Dependence • Ways to determine • Classical • Relative Frequency • Subjective
Probability • Some rules to know… • All probabilities are between 0 and 1 (inclusive) • The probabilities of all outcomes in the sample space sum to 1 • Complement Rule • Pr(X) = 1 – Pr(not X) • Addition Rule • Pr(X or Y) = Pr(X) + Pr(Y) – Pr(X and Y) • If the events are mutually exclusive, Pr(X or Y) = Pr(X) + Pr(Y)
Probability • Some simple examples • Probability of tails on fair coin? • Probability of rolling a 1 or 6 on fair die? • Probability of drawing a heart from standard deck?
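These examples can be worked with the classical approach (favorable outcomes divided by total outcomes, assuming every outcome is equally likely). The sketch below uses exact fractions; the function name and the way the deck is built are assumptions for illustration, and the last lines add a check of the addition and complement rules.

```python
from fractions import Fraction
from itertools import product

def classical_probability(event, sample_space):
    """Favorable outcomes over total outcomes (equally likely outcomes)."""
    return Fraction(len(event & sample_space), len(sample_space))

coin = {"heads", "tails"}
die = set(range(1, 7))
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = set(product(ranks, suits))  # 52 (rank, suit) pairs

hearts = {card for card in deck if card[1] == "hearts"}
kings = {card for card in deck if card[0] == "K"}

p_tails = classical_probability({"tails"}, coin)
p_one_or_six = classical_probability({1, 6}, die)
p_heart = classical_probability(hearts, deck)
print(p_tails, p_one_or_six, p_heart)

# Addition rule: Pr(heart or king) = Pr(heart) + Pr(king) - Pr(heart and king)
p_heart_or_king = (p_heart
                   + classical_probability(kings, deck)
                   - classical_probability(hearts & kings, deck))
# Complement rule: Pr(not heart) = 1 - Pr(heart)
p_not_heart = 1 - p_heart
print(p_heart_or_king, p_not_heart)
```

Rolling a 1 and rolling a 6 are mutually exclusive, so nothing is subtracted there; hearts and kings overlap in the king of hearts, so that one card is subtracted to avoid counting it twice.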
Probability • Conditional Probability • The probability that one event occurs when you know something else has happened • Pr(X|Y) = Pr(X and Y)/Pr(Y) • If the events are independent, Pr(X|Y) = Pr(X) • Multiplication Rule • Pr(X and Y) = Pr(X)Pr(Y|X) • If the events are independent, Pr(X and Y) = Pr(X)Pr(Y)
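Conditional probability and the multiplication rule can be checked on a standard deck of cards. This is a sketch under the assumption that every card is equally likely to be drawn; the function names and the choice of events are hypothetical.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = set(product(ranks, suits))

def pr(event):
    """Probability of an event when every card is equally likely."""
    return Fraction(len(event), len(deck))

def conditional(x, y):
    """Pr(X|Y) = Pr(X and Y) / Pr(Y)."""
    return pr(x & y) / pr(y)

kings = {c for c in deck if c[0] == "K"}
hearts = {c for c in deck if c[1] == "hearts"}
face_cards = {c for c in deck if c[0] in ("J", "Q", "K")}

# Dependent events: knowing the card is a face card changes Pr(king).
print(pr(kings), conditional(kings, face_cards))

# Independent events: knowing the card is a heart does not change Pr(king).
print(pr(kings), conditional(kings, hearts))

# Multiplication rule: Pr(king and heart) = Pr(king) * Pr(heart | king)
print(pr(kings & hearts), pr(kings) * conditional(hearts, kings))
```

Because rank and suit are independent, the multiplication rule here reduces to Pr(king) times Pr(heart).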