250 likes | 446 Vues
Agenda. Kritik og forsvar Informationer Kan projekt 1 opgaven formuleres bedre? Reserver tid til projekt 2 Introduktion til projekt 2 Opsamling fra sidst Sampling distribution (Estimation). Projekt 2. Diskuter relevansen af jeres BV-spm. Lav en deskriptiv analyse af jeres BV-spm.
E N D
Agenda • Kritik og forsvar • Informationer • Kan projekt 1 opgaven formuleres bedre? • Reserver tid til projekt 2 • Introduktion til projekt 2 • Opsamling fra sidst • Sampling distribution • (Estimation)
Projekt 2 • Diskuter relevansen af jeres BV-spm. • Lav en deskriptiv analyseaf jeres BV-spm. • Perspektiver jeres kommunes resultater i lyset i jeres svar på spørgsmål b. • Beregn et 95%-konfidensinterval ... • Undersøg vhja. en signifikanstest ...
BV-spørgsmål Hjemmesiden giver alt i alt et positivt helhedsindtryk. Kvaliteten af hjemmesidens indhold er høj. Det er nemt at finde rundt på hjemmesiden. Teksterne på hjemmesiden er skrevet i et klart og letforståeligt sprog. Jeg fandt let det, jeg ledte efter. Jeg oplever, at hjemmesiden er hurtig og virker, som den skal uden at lave fejl. Hjemmesiden indeholder det, jeg har behov for. Hjemmesiden giver mig fordele, som jeg ikke kan opnå på andre måder (f.eks. via telefon, brev/mail eller personlig kontakt). Kommuner Greve Halsnæs Horsens Projekt 2
Learning Objectives • Normal Distribution • Using Excel to find probabilities • 68 – 95 – 99,7 Rule for normal distributions • Z-Scores and the Standard Normal Distribution • Using Excel to find probabilities for z-scores
Eksempel på brug af normalfordelingen De besøgende på en hjemmeside bruger i gns. 300 sekunder på forsiden, før de klikker videre til en underside. Besøgstiden er normalfordelt med en standardafvigelse på 50 sekunder. Hvad er sandsynligheden for at tilfældig besøgende højest bruger 365 sekunder på forsiden? X = 365, μ = 300, σ = 50, Hvad er P(365<X)? Svaret er 0,903199451.
68 – 95 – 99,7% Rule • 68% of the observations fall within one standard deviation of the mean • 95% of the observations fall within two standard deviations of the mean • 99.7% of the observations fall within three standard deviations of the mean
z-score X angiver tiden før der klikkes videre fra forsiden til en underside. X er normalfordelt med μ = 300 og σ = 50. X’s z-score angiver hvor mange (antal) standardafvigelser, X ligger fra μ Hvad er sandsynligheden for at en tilfældig besøgende bruger mindre end 240 sekunder på forsiden? z = (X – μ) / σ = (240 – 300) / 50 = -60 / 50 = -1,2 P(z<-1,2) = 0,1151.
Learning Objectives • Statistic vs. Parameter • Sampling Distributions • Mean and Standard Deviation of the Sampling Distribution of a Proportion • Standard Error • Sampling Distribution Example • Population, Data, and Sampling Distributions
Population Sample Learning Objective 1:Statistic and Parameter • A statistic is a numerical summary of sample data such as a sample proportion or sample mean • A parameter is a numerical summary of a population such as a population proportion or population mean. • In practice, we seldom know the values of parameters. • Parameters are estimated using sample data. • We use statistics to estimate parameters. • Hvad er følgende? • μ • s2 • σ
Learning Objective 2:Sampling Distributions Example: • Prior to counting the votes, the proportion in favor of Mr. Barack Obama was an unknown parameter. • An exit poll of 3160 voters reported that the sample proportion in favor of a recall was 0.54. • If a different random sample of about 3000 voters were selected, a different sample proportion would occur. The sampling distribution of the sample proportion shows all possible values and the probabilities for those values.
Learning Objective 2:Sampling Distributions • The sampling distribution of a statistic is the probability distribution that specifies probabilities for the possible values the statistic can take. • Sampling distributions describe the variability that occurs from study to study using statistics to estimate population parameters
Learning Objective 3:Mean and SD of the Sampling Distribution of a Proportion • For a random sample of size n from a population with proportion p of outcomes in a particular category, the sampling distribution of the proportion of the sample in that category has
Learning Objective 4:The Standard Error • To distinguish the standard deviation of a sampling distribution from the standard deviation of an ordinary probability distribution, we refer to it as a standard error. • Example: If the population proportion supporting the reelection of Schwarzenegger was 0.50, would it have been unlikely to observe the exit-poll sample proportion of 0.565?
Learning Objective 5:Example: 2006 California • Given that the exit poll had 2.705 people and assuming 50% support the reelection of Schwarzenegger, • Find the estimate of the population proportion and the standard error:
Learning Objective 5:Example: 2006 California Election • The sample proportion of 0.565 is more than six standard errors from the expected value of 0.50. • The sample proportion of 0.565 voting for reelection of Schwarzenegger would be very unlikely if the population proportion were p = 0.50 or p < 0.50
Learning Objective 6 Empiriske fordelinger Population distribution, N. Populationens parametre og fordeling er ukendte. Vi udtager en stikprøve fra populationen for at få viden om populationen, typisk parametrene μ og σ. Sample distribution, n. Stikprøven er en delmængde af N. Den består af data / observationer, u1, u2,..,un. Stikprøven kan beskrives grafisk og numerisk, f.eks. ved hjælp af gns. ū og std.afv. s. Jo større stikprøven er, des mere ligner den populationen Teoretiske fordelinger (fx binomial- eller normalfordelingen) Sandsynlighedsfordelinger viser sandsynligheden for at en variabel har ét bestemt udfald (sandsynligheden er udfaldets ”andel” i det lange løb). ”Standard deviation” er et spredningsmål. En ”samling distribution” er sandsynlighedsfordelingen for et statistisk mål, (typisk gennemsnit og standardafvigelse). Den bruges til at finde de sandsynlige værdier af det statistiske mål i populationen. ”Standard deviation” kaldes ”Standard error”
Learning Objectives • The Sampling Distribution of the Sample Mean • Effect of n on the Standard Error • Central Limit Theorem (CLT)
Learning Objective 1:The Sampling Distribution of the Sample Mean • The sample mean, x, is a random variable. • The sample mean varies from sample to sample. • By contrast, the population mean, µ, is a single fixed number.
Learning Objective 1:The Sampling Distribution of the Sample Mean • For a random sample of size n from a population having mean µ and standard deviation σ, the sampling distribution of the sample mean has: • Center described by the mean µ (the same as the mean of the population). • Spread described by the standard error, which equals the population standard deviation divided by the square root of the sample size: • standard error of
Learning Objective 2:Effect of n on the Standard Error • Knowing how to find a standard error gives us a mechanism for understanding how much variability to expect in sample statistics “just by chance.” • The standard error of the sample mean = • As the sample size n increases, the standard error decreases. • With larger samples, the sample mean is more likely to fall closer to the population mean.
Learning Objective 3:Central Limit Theorem (CLT) • For random sampling with a large sample size n, the sampling distribution of the sample mean is approximately a normal distribution. • This result applies no matter what the shape of the probability distribution from which the samples are taken.
Learning Objective 3:CLT: How Large a Sample? • The sampling distribution of the sample mean takes more of a bell shape as the random sample size n increases. • The more skewed the population distribution, the larger n must be before the shape of the sampling distribution is close to normal. • In practice, the sampling distribution is usually close to normal when the sample size n is at least about 30. • If the population distribution is approximately normal, then the sampling distribution is approximately normal for all sample sizes.
Learning Objective 3:CLT: Making Inferences • For large n, the sampling distribution is approximately normal even if the population distribution is not. • This enables us to make inferences about population means regardless of the shape of the population distribution.