Welcome to BUAD 310 Instructor: Kam Hamidieh Lecture 17, Wednesday March 26, 2014
Agenda & Announcement
• Today: Chapter 18
• Reading: All of Chapter 18
• Note: Homework 4 is due TODAY by 5 PM. NO EXTENSIONS will be given.
• Homework 5 is up and is due April 2 at 5 PM. I’ve also posted a YouTube video and an Excel sheet, which should help you a lot.
• Exam II is on Wednesday, April 16th. We’ll talk about it more on Monday.
BUAD 310 - Kam Hamidieh
From Last Time
• Two-sample t-test with Ha: μ1 – μ2 (> or < or ≠) D0
• (1 – α)100% confidence interval for μ1 – μ2: (x̄1 – x̄2) ± t(α/2) × se(x̄1 – x̄2), where se(x̄1 – x̄2) = √(s1²/n1 + s2²/n2)
From Last Time
• Inference for μd (the mean difference) with paired data: same as a one-sample t-test on the differences.
• When dealing with a mean (or means), a two-sided test at level α can be carried out directly from a (1 – α)100% confidence interval:
H0: μ = μ0 vs. Ha: μ ≠ μ0
Reject H0 at level α ↔ μ0 is not in the (1 – α)100% CI
Fail to reject H0 at level α ↔ μ0 is in the (1 – α)100% CI
Solution for In Class Exercise 3 From Last Time
You can get a large list of textbooks, say n = 100. Next, find the price of each textbook at Amazon and at BN, and take the difference in price. For example, on 3/20/2014, the paperback version of Antifragile: Things That Gain from Disorder by N. Taleb was listed at $10.76 on Amazon but $10.86 at BN. The difference is $0.10 (BN – Amazon). You would collect all these differences and perform a one-sample t-test or create a confidence interval for the population mean difference in price.
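As a rough sketch of the paired approach described in this solution, the snippet below computes a one-sample t statistic on a list of price differences. Only the first difference (0.10, the Taleb paperback) comes from the slide; the other four values are made up for illustration, and a real analysis would use the full list of about 100 books.

```python
import math
import statistics

# BN-minus-Amazon price differences in dollars.
# ASSUMPTION: only 0.10 is from the slide; the other values are hypothetical.
differences = [0.10, -0.25, 0.40, 0.05, 0.30]

n = len(differences)
mean_diff = statistics.mean(differences)
sd_diff = statistics.stdev(differences)   # sample standard deviation
se_diff = sd_diff / math.sqrt(n)          # standard error of the mean difference

# One-sample t statistic for H0: mean difference = 0, with n - 1 degrees of freedom
t_stat = (mean_diff - 0) / se_diff

print(f"mean = {mean_diff:.3f}, se = {se_diff:.4f}, t = {t_stat:.3f}, df = {n - 1}")
```

The t statistic would then be compared with a t distribution with n – 1 degrees of freedom, exactly as in any one-sample t-test.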
Dealing with Proportions
• The population proportions will be p1 and p2.
• We will have two independent random samples of sizes n1 and n2 from the two populations.
• The main interest will be estimating the magnitude of the difference and/or testing hypotheses about p1 – p2 = D0.
• Again, do not confuse proportions with means.
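Two Sample z-Test for Proportions
• Goal: H0: p1 – p2 ≤ D0 vs. Ha: p1 – p2 > D0 (only one form shown)
• Estimate of the difference: p̂1 – p̂2
• Standard error: se(p̂1 – p̂2) = √(p̂1(1 – p̂1)/n1 + p̂2(1 – p̂2)/n2)
• Test statistic: z = (p̂1 – p̂2 – D0) / se(p̂1 – p̂2)
• Requirements: random samples from the populations, and sample sizes large enough for the normal approximation to hold.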
Confidence Intervals
• We still have the same requirements from the last slide.
• The 100(1 – α)% confidence interval for p1 – p2 is: (p̂1 – p̂2) ± z(α/2) × se(p̂1 – p̂2), where se(p̂1 – p̂2) is the standard error of the estimated difference.
Example
Liberty Mutual is a large insurer of automobiles. The company relies heavily on statistics as a tool to determine premiums. For example, Liberty Mutual gives drivers with 0 to 6 years of driving experience a 10% discount if they qualify as good students:
http://www.libertymutual.com/auto-insurance/teen-driving/teen-auto-insurance/teen-insurance-discounts
A good student is defined as anyone enrolled in high school or college who maintains a GPA of at least 3.0. Does the data below support Liberty Mutual’s decision to provide a discount to good students?
In Class Exercise 1
Recently collected data indicate that 23% of women want to buy a Google Android phone. Among men, 33% want to buy one. The survey contacted 240 women and 265 men at random.
1. Conduct a hypothesis test to see if there is a difference between the proportions of men and women interested in buying an Android phone.
2. Do the same as #1 using a 95% confidence interval.
3. Do you think there could be some potential variables (lurking!) that might explain any differences?
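One way to check the arithmetic for parts 1 and 2 is sketched below. It assumes the unpooled standard error for the difference in proportions (some textbooks pool the two samples when testing D0 = 0) and works directly from the reported percentages, since 23% of 240 is not a whole number. The function name `two_proportion_z` is made up for this sketch.

```python
import math
from statistics import NormalDist

def two_proportion_z(p1_hat, n1, p2_hat, n2, d0=0.0, alpha=0.05):
    """Two-sided z-test and confidence interval for p1 - p2.

    ASSUMPTION: uses the unpooled standard error for both the test and the interval.
    """
    diff = p1_hat - p2_hat
    se = math.sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    z = (diff - d0) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))     # two-sided p-value
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)     # about 1.96 for alpha = 0.05
    ci = (diff - z_crit * se, diff + z_crit * se)
    return z, p_value, ci

# Men: 33% of 265; women: 23% of 240 (figures from the exercise)
z, p_value, ci = two_proportion_z(0.33, 265, 0.23, 240)
print(f"z = {z:.2f}, p-value = {p_value:.4f}, 95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```

Under these assumptions the z statistic comes out at roughly 2.5 and the 95% interval excludes 0, so the test and the interval lead to the same decision, as the duality between two-sided tests and confidence intervals suggests.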
Chi-Squared Tests
• Chi-square tests are frequently used for drawing inferences from categorical data.
• Types of chi-square tests:
  • Test of goodness of fit
  • Test of independence
• The math and the calculation of the test statistic are the same for both types.
• Why distinguish? The conclusions based on the results of the tests are different.
Tests of Goodness of Fit
• This is the test to use if one wants to decide whether an observed distribution of frequencies for a categorical variable is incompatible with some preconceived or hypothesized distribution.
• For example, we may wish to determine whether or not a sample of observed values of some random variable is compatible with the hypothesis that it was drawn from a population of values that is uniformly distributed (equally likely).
Example
A microwave oven wholesaler wishes to compare consumer preferences in Milwaukee with the historical market shares in Cleveland. If the consumer preferences in Milwaukee are substantially different, the wholesaler will consider changing its policies for stocking ovens. The wholesaler examines a random sample of 400 Milwaukee consumers. Are the two preference distributions statistically the same?
Define Null and Alternative
• Define the parameters of interest. For i = 1, 2, 3, and 4: pi = proportion who prefer brand i in Milwaukee
• The competing hypotheses:
H0: p1 = 0.20, p2 = 0.35, p3 = 0.30, p4 = 0.15
(The preference distribution for Milwaukee is the same as Cleveland’s.)
Ha: At least one of the pi’s is different, i.e., H0 is not true.
(The preference distribution for Milwaukee is not the same as Cleveland’s.)
Approach
Assume the null is true: Milwaukee’s preferences are the same as Cleveland’s.
If the null were true, what counts would we have seen?
Now compare the “observed” counts with the “expected” counts.
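The "expected" counts follow directly from the null hypothesis: each is the sample size times the hypothesized share. A minimal sketch, using the Cleveland shares from H0 and the sample size of 400 from the example (the variable names are just for illustration):

```python
# Market shares under H0 (Cleveland's historical shares, brands 1 to 4)
null_shares = [0.20, 0.35, 0.30, 0.15]
n = 400  # number of Milwaukee consumers sampled

# Expected count for brand i if H0 were true: n * p_i
expected = [n * p for p in null_shares]

for brand, count in enumerate(expected, start=1):
    print(f"brand {brand}: expected count = {count:.0f}")
```

All four expected counts are well above 5, so the usual rule of thumb for the chi-squared approximation is comfortably met here.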
The Test Statistic
• Test statistic: X² = Σ (observed – expected)² / expected, summed over all groups
• Under the null, X² has a new distribution called the chi-squared distribution with df = k – 1, where k = number of groups.
• Find the p-value using the chi-square distribution (next slide).
• Reject H0 at significance level α if p-value < α.
P-Value
• After you have computed the test statistic, use the chi-square distribution with (k – 1) degrees of freedom to find the p-value in the right tail.
• Note: large values of the test statistic provide evidence against H0.
About Chi-Squared Distribution
• Skewed to the right.
• Minimum value is 0.
• Indexed by the degrees of freedom.
Back to Our Example
H0: p1 = 0.20, p2 = 0.35, p3 = 0.30, p4 = 0.15
Ha: H0 is not true.
Under the null, X² has a chi-squared distribution with df = 4 – 1 = 3.
Actual Values
Using software: p-value = P(X² > 8.78) ≈ 0.032
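The slide does not say which software was used. As one possible check, the right-tail probability can be computed with the Python standard library alone, using the finite-series forms of the chi-squared tail for integer degrees of freedom. The helper name `chi2_sf` is made up for this sketch; a statistics package would normally provide an equivalent function.

```python
import math

def chi2_sf(x, df):
    """Right-tail probability P(X2 > x) for a chi-squared variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= (x / 2) / k
            total += term
        return math.exp(-x / 2) * total
    # Odd df: two-sided normal tail plus a finite series in sqrt(x)
    root = math.sqrt(x)
    normal_tail = math.erfc(root / math.sqrt(2))          # P(|Z| > sqrt(x))
    normal_pdf = math.exp(-x / 2) / math.sqrt(2 * math.pi)
    total, term = 0.0, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        term = root if r == 1 else term * x / (2 * r - 1)
        total += term
    return normal_tail + 2 * normal_pdf * total

# Test statistic and degrees of freedom from the microwave oven example
p_value = chi2_sf(8.78, 3)
print(f"p-value = {p_value:.3f}")
```

The result agrees with the reported value of about 0.032, which is below 0.05.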
Conclusion
• We reject the null hypothesis; we have a statistically significant result.
• Our data provide sufficient evidence that the preference distribution for Milwaukee is not the same as Cleveland’s.
Requirements
• We have a random sample representing the population.
• The test generally works well when all the expected counts are at least 5.
Are YOU a good random number generator? (Time Permitting)
• Last semester, I asked my students to pick a number between 1 and 10 (from 1, 2, 3, …, 9, 10) at random.
• We’ll assume the students represent the general student population. Here’s what we got (n = 112):
Are YOU a good random number generator?
• If you are picking numbers at random, then we should see approximately uniform bars: 10% pick 1, 10% pick 2, …, 10% pick 10.
• Let’s test them!
H0: p1 = 0.1, p2 = 0.1, …, p10 = 0.1
(Students generate numbers at random.)
Ha: H0 is not true.
(Students do not generate numbers at random.)
Are YOU a good random number generator?
Results:
Number      1     2     3     4     5     6     7     8     9     10
Observed    6     10    19    10    6     14    16    17    7     7
Expected    11.2  11.2  11.2  11.2  11.2  11.2  11.2  11.2  11.2  11.2
Conclusion: Students are not good random number generators!
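The conclusion can be checked from the table above. The sketch below computes the goodness-of-fit statistic from the observed counts and, instead of a p-value, compares it with the standard table value for the upper 5% point of the chi-squared distribution with 9 degrees of freedom (16.919). Using the 5% level is an assumption here; the slide does not state one.

```python
# Observed counts for the numbers 1 through 10 (from the results table)
observed = [6, 10, 19, 10, 6, 14, 16, 17, 7, 7]

n = sum(observed)             # 112 students
k = len(observed)             # 10 categories
expected = [n / k] * k        # 11.2 per number if picks were uniform

# X^2 = sum of (observed - expected)^2 / expected
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = k - 1

# ASSUMPTION: alpha = 0.05; 16.919 is the standard table critical value for df = 9
CRITICAL_5_PERCENT_DF9 = 16.919
reject_h0 = chi_sq > CRITICAL_5_PERCENT_DF9

print(f"X^2 = {chi_sq:.2f}, df = {df}, reject H0 at 5%: {reject_h0}")
```

The statistic is about 19.4, which exceeds the critical value, so the uniform hypothesis is rejected at the 5% level. The largest contributions come from 3, 7, and 8 being picked too often and 1 and 5 too rarely.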
Chi-Squared Test: Independence
• Does client satisfaction depend on investment fund type?
• Variables: Fund Type vs. Satisfaction
• Note: both are categorical variables.
Hypotheses
• Our hypotheses are:
H0: There is no relationship between the row variable (fund type) and the column variable (satisfaction)
Ha: There is a relationship between the two variables
• Relationship = association = dependence
• A bit more technically: X = Fund Type, Y = Satisfaction
Parameter of interest: P(Y = y | X = x), for all x and y
H0: P(Y = y | X = x) = P(Y = y) for all x and y
Ha: P(Y = y | X = x) ≠ P(Y = y) for some x and y
• We need a measure of how much the observed data deviate from what we would expect under H0.
Test Statistics
• Our test statistic is still: X² = Σ (observed – expected)² / expected, summed over all cells
• Expected counts: expected count = (row total × column total) / n
• Under H0, X² has a chi-square distribution with df = (number of rows – 1) × (number of columns – 1)
Expected Count Computations
Expected counts are in red.
Expected count for Low/Bond = (30 × 20)/100 = 6
Expected count for Med/Bond = (30 × 40)/100 = 12
…
Expected count for High/TaxDef = (40 × 40)/100 = 16
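The same computation can be written once for every cell. In the sketch below, the Bond and TaxDef totals and the satisfaction totals are read off the three worked computations above. The third fund type is not named in the slide text, so the label "Other" is a placeholder, and its total of 30 is inferred from n = 100.

```python
# Marginal totals recovered from the worked computations
# ASSUMPTION: "Other" is a placeholder name; its total is inferred as 100 - 30 - 40.
fund_totals = {"Bond": 30, "Other": 30, "TaxDef": 40}
satisfaction_totals = {"Low": 20, "Med": 40, "High": 40}
n = sum(fund_totals.values())   # 100 clients

# Expected count under independence: (fund total * satisfaction total) / n
expected = {
    (sat, fund): fund_total * sat_total / n
    for fund, fund_total in fund_totals.items()
    for sat, sat_total in satisfaction_totals.items()
}

for (sat, fund), count in expected.items():
    print(f"{sat}/{fund}: {count:.0f}")
```

Every expected count is at least 6, so the "at least 5" rule of thumb is satisfied for this table.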
Back to Our Example
H0: There is no association between fund type and satisfaction
Ha: There is an association between the two variables
Under the null, X² has a chi-squared distribution with df = (3 – 1)(3 – 1) = 4.
Actual Values
Using software: p-value = P(X² > 46.43) ≈ 0
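To see how small "≈ 0" is, the tail probability can be evaluated in closed form, because the chi-squared tail is a finite sum when the degrees of freedom are even. The function name `chi2_sf_even_df` is made up for this sketch and stands in for whatever software routine was actually used.

```python
import math

def chi2_sf_even_df(x, df):
    """P(X2 > x) for a chi-squared variable with even df, using
    exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!"""
    if df < 2 or df % 2 != 0:
        raise ValueError("this closed form only applies to even df >= 2")
    half = x / 2
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total

# Test statistic and degrees of freedom from the fund type example
p_value = chi2_sf_even_df(46.43, 4)
print(f"p-value = {p_value:.2e}")
```

The value is far below any conventional significance level, which is why the software output rounds to 0.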
Conclusion
• We reject the null hypothesis; we have a statistically significant result.
• Our data provide sufficient evidence that there is an association (or relationship or dependence) between fund type and client satisfaction.
• Note: The required conditions are the same as those for the goodness-of-fit test (see the Requirements slide, slide 22).
In Class Exercise 2 (Time Permitting)
In “good news – bad news” settings, researchers report that 83% of women in the 21-34 age group say they prefer to hear the bad news first. This compares to 50% for the 35-44 group, 53% for the 45-54 group, and 70% for the 55-or-over group. Test whether news preference is independent of age group, assuming that the researchers worked with independent samples, each consisting of 100 women.
In Class Exercise 2 (Continued)
H0: Preference and age group are independent.
Ha: Preference and age group are dependent.
Observed versus Expected:
Conclusion: Reject the null hypothesis. Our data suggest that age group and preference are dependent.
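The observed-versus-expected comparison can be rebuilt from the exercise statement. With 100 women per group, the reported percentages are also the counts who prefer bad news first, and the remainder of each group forms the second column. The sketch below assumes a 5% significance level and uses the standard table critical value for 3 degrees of freedom (7.815).

```python
# Rows: age groups 21-34, 35-44, 45-54, 55+
# Columns: (prefer bad news first, do not), derived from 100 women per group
observed = [[83, 17], [50, 50], [53, 47], [70, 30]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count under independence: (row total * column total) / n
expected = [[r * c / n for c in col_totals] for r in row_totals]

chi_sq = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)

# ASSUMPTION: alpha = 0.05; 7.815 is the standard table critical value for df = 3
CRITICAL_5_PERCENT_DF3 = 7.815
reject_h0 = chi_sq > CRITICAL_5_PERCENT_DF3

print(f"expected per group = {expected[0]}, X^2 = {chi_sq:.2f}, df = {df}, reject H0: {reject_h0}")
```

Each age group has expected counts of 64 and 36, and the statistic of about 31 is far above the critical value, which matches the slide's decision to reject independence.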