Learn how to use hypothesis testing to verify claims about population proportions and make informed statistical decisions. Understand hypothesis setup, sampling distribution, decision-making criteria, and variations in proportion tests.
Hypothesis Test for Proportions
Chapter 9, Section 2
Statistical Methods II, QM 3620
The Sample
• Suppose that we are not so interested in estimating this proportion as we are in verifying whether some claim is true.
• For example, management may want to know if more than 50% of our customers are satisfied.
• We could calculate a confidence interval and compare it to 50%, or we can test this claim using a hypothesis test.
The Hypotheses
We would want a random sample of customers, and enough evidence from that sample, to show that the opposing viewpoint (that 50% or less of the customers are satisfied) is not reasonable. This sets up our hypotheses…
The Hypotheses (cont’d)
The viewpoint that we are trying to “prove” (more than 50% are satisfied) is our alternative hypothesis. Note that these claims are about the population, not the sample.
H0: π ≤ .50
HA: π > .50
Sampling Distribution
If we assume that the null hypothesis is true, then the closest the proportion could be to the alternative hypothesis (and still be part of the null) is if the proportion is .50. So… if the null is true, and the proportion in the population is .50, then what proportions could we expect to see in samples? Well, in large samples, the proportion would likely be really close to the actual proportion, whereas in small samples, the proportion could be expected to vary much more.
H0: π ≤ .50
HA: π > .50
Assume π = .50 and n = 100.
For instance, if we were taking samples of size 100, we could determine that 95% of the samples would have a proportion of satisfied customers between approximately 40% and 60%. So, getting a sample of 100 in which the proportion of satisfied customers is 55% is not really proof of the alternative hypothesis… it could easily occur if the null hypothesis is true.
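The "approximately 40% to 60%" range can be reproduced with a short calculation. What follows is a minimal sketch, assuming the normal approximation to the sampling distribution and Python's standard-library `statistics.NormalDist`; the variable names are illustrative and not part of the slides.

```python
from math import sqrt
from statistics import NormalDist

pi0 = 0.50   # population proportion assumed under the null hypothesis
n = 100      # sample size used in the slide's example

# Standard error of the sample proportion when the null is true
std_error = sqrt(pi0 * (1 - pi0) / n)

# z-value that leaves 2.5% in each tail of the standard normal (about 1.96)
z = NormalDist().inv_cdf(0.975)

lower = pi0 - z * std_error
upper = pi0 + z * std_error
print(f"95% of sample proportions fall between {lower:.3f} and {upper:.3f}")
```

With these inputs the range comes out at roughly .40 to .60, which is why a sample proportion of 55% is unremarkable when the null hypothesis is true.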
Making a Decision
So, what evidence would be enough? Certainly only sample proportions that are above 50% would qualify. A sample proportion of less than 50% could occur even if the population of customers has a greater than 50% satisfaction rate, but it would never be enough evidence to conclude the proportion was above 50% in the population. What we are looking for are sample results that are “improbable” if the null hypothesis is true. How improbable? That is where the alpha level kicks in. Say alpha = .05.
H0: π ≤ .50
HA: π > .50
If the sample result occurs less than 5% of the time if the null hypothesis is true, then that is enough evidence. If that is too probable, then reduce the alpha level.
[Figure: sampling distribution assuming π = .50, with the upper .05 tail marked "Reject".]
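To make "improbable" concrete, the sketch below finds the smallest sample proportion that would fall in the upper 5% tail. It is only an illustration: the sample size of 100 is an assumption carried over from the earlier example (this slide does not state one), and it relies on the normal approximation via Python's `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist

pi0 = 0.50    # boundary value of the null hypothesis
n = 100       # assumed sample size (not stated on this slide)
alpha = 0.05  # significance level

# Standard error of the sample proportion under the null
std_error = sqrt(pi0 * (1 - pi0) / n)

# z-value with 5% of the standard normal above it (about 1.645)
z_cutoff = NormalDist().inv_cdf(1 - alpha)

# Sample proportions above this cutoff occur less than 5% of the time under the null
p_cutoff = pi0 + z_cutoff * std_error
print(f"Reject the null if the sample proportion exceeds {p_cutoff:.3f}")
```

Under these assumptions the cutoff is a little above 58%, so a sample proportion of 55% would not be enough evidence. Lowering alpha pushes the cutoff further from .50.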
Proportion Test Possibilities
Just like our hypothesis tests of the population mean, we have three alternatives for the setup of the test. We can test to see if the population proportion is greater than some number (Example 1), less than some number (Example 2), or different from some number (Example 3).
Example 1: H0: π ≤ .50, HA: π > .50
Example 2: H0: π ≥ .50, HA: π < .50
Example 3: H0: π = .50, HA: π ≠ .50
But remember that all of these tests are about the population, not the sample… and they still require that you take a good random sample and don’t try to fudge the results. Statistics is built to handle the natural variation in samples, not the deliberate variation caused by incompetence.
Remember the Process
1) Specify the population parameter of interest
2) Formulate the null and alternative hypotheses
3) Specify the desired significance level, α
4) Take a random sample and calculate relevant statistics
5) Calculate the p-value for the test and compare it to α
6) Reach a decision and draw a conclusion
The Test Statistic
As stated earlier, the amount of variation in proportions from sample to sample can be calculated if you know the sample size (n) and the hypothesized proportion in the population (π0). The formula is:
$$\sigma_p = \sqrt{\frac{\pi_0(1-\pi_0)}{n}}$$
where σp is the standard error of the sample proportions (i.e. the amount of variation in the proportion from sample to sample).
We will use this standard error to help us determine if our sample could have come from the population with the hypothesized proportion (π0). If we divide the difference between the sample proportion and the hypothesized proportion, p - π0, by the standard error, we get a test statistic that can be compared to the standard normal distribution via the NORMSDIST function in Excel. That will get us a p-value.
The test statistic is
$$TS = \frac{p - \pi_0}{\sqrt{\frac{\pi_0(1-\pi_0)}{n}}}$$
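The test statistic described above takes only a few lines to compute. This is a minimal Python sketch; the function name `proportion_test_statistic` is hypothetical, and the example inputs reuse the earlier case of a 55% sample proportion from a sample of 100 with a hypothesized proportion of .50.

```python
from math import sqrt

def proportion_test_statistic(p_hat, pi0, n):
    """Return how many standard errors the sample proportion lies from pi0."""
    # Standard error uses the hypothesized proportion, not the sample proportion
    std_error = sqrt(pi0 * (1 - pi0) / n)
    return (p_hat - pi0) / std_error

# Sample proportion .55, hypothesized proportion .50, sample size 100
ts = proportion_test_statistic(0.55, 0.50, 100)
print(f"Test statistic: {ts:.2f}")
```

Here the standard error is .05, so the sample proportion sits about one standard error above the hypothesized value.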
Excel’s NORMSDIST – Upper Tailed
Excel’s NORMSDIST function is a cumulative function, which means it will always aggregate the area under the distribution to the left of some point.
Suppose H0: π ≤ .50 and HA: π > .50. The hypothesized proportion (π0) is .50.
$$TS = \frac{p - \pi_0}{\sqrt{\frac{\pi_0(1-\pi_0)}{n}}}$$
To find the area in the upper tail, we need to subtract the area to the left of the test statistic from 1 (since the total area under the distribution equals 1):
p-value = 1-NORMSDIST(TS)
Remember, if the sample proportion is actually below .50 in this case (already in the H0 region), then there is no reason to go forward with the hypothesis test. The result is obvious.
Excel’s NORMSDIST – Lower Tailed
Using Excel’s NORMSDIST function to test a lower-tailed alternative hypothesis is the easiest case. We are just finding the area to the left of the test statistic.
Suppose H0: π ≥ .50 and HA: π < .50. The hypothesized value (π0) is .50.
$$TS = \frac{p - \pi_0}{\sqrt{\frac{\pi_0(1-\pi_0)}{n}}}$$
p-value = NORMSDIST(TS)
Remember, if the sample proportion is actually above .50 in this case (already in the H0 region), then there is no reason to go forward with the hypothesis test. The result is obvious.
Excel’s NORMSDIST – Two Tailed
Using Excel’s NORMSDIST function to test a two-tailed alternative hypothesis is not that hard, but you have to watch the parentheses.
Suppose H0: π = .50 and HA: π ≠ .50. The hypothesized value (π0) is .50.
$$TS = \frac{p - \pi_0}{\sqrt{\frac{\pi_0(1-\pi_0)}{n}}}$$
To find the total probability in both tails, you need to use the ABS() function (Excel’s absolute value function) so that you always have a positive test statistic. Also, be sure to multiply by 2 only as the final step in the calculation:
p-value = (1-NORMSDIST(ABS(TS)))*2
Enclose the entire calculation in parentheses so it can be calculated before multiplying by 2. This gives us the total area in both tails.
Summary: Finding the p-value with Excel
Lower tail test. Example: H0: π ≥ .50, HA: π < .50. p-value: =NORMSDIST(TS), where TS is the value of the test statistic.
Two tailed test. Example: H0: π = .50, HA: π ≠ .50. p-value: =(1-NORMSDIST(ABS(TS)))*2, which takes the absolute value of the test statistic using the ABS() function and multiplies the area by 2 to adjust for 2 tails.
Upper tail test. Example: H0: π ≤ .50, HA: π > .50. p-value: =1-NORMSDIST(TS), where TS is the value of the test statistic. Since NORMSDIST() is a cumulative function, we need to subtract the result from one.
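For readers working outside Excel, the three p-value formulas can be mirrored in Python. This is a sketch, not a replacement for the Excel approach: it assumes that `statistics.NormalDist().cdf` from the standard library plays the role of NORMSDIST (both return the area to the left of a point under the standard normal), and the function name `p_value` and its `tail` labels are hypothetical.

```python
from statistics import NormalDist

def p_value(ts, tail):
    """p-value for a test statistic ts; tail is 'lower', 'upper', or 'two'."""
    cdf = NormalDist().cdf  # cumulative area to the left, like NORMSDIST
    if tail == "lower":
        return cdf(ts)                 # =NORMSDIST(TS)
    if tail == "upper":
        return 1 - cdf(ts)             # =1-NORMSDIST(TS)
    if tail == "two":
        return 2 * (1 - cdf(abs(ts)))  # =(1-NORMSDIST(ABS(TS)))*2
    raise ValueError("tail must be 'lower', 'upper', or 'two'")

# The same test statistic gives a different p-value for each test setup
for tail in ("lower", "upper", "two"):
    print(tail, round(p_value(1.0, tail), 4))
```

Note that the two-tailed value is exactly twice the upper-tailed value when the test statistic is positive, which is what multiplying by 2 as the final step achieves.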
The Assumption
We do make one assumption (beyond the random sample assumption) which can be easily checked. We need to make sure that we have enough observations in our sample in order to use the normal distribution. That can be checked with two easy formulas…
Hypothesis test for π:
If nπ0 ≥ 5 and n(1-π0) ≥ 5: the sampling distribution of p is approximately normal… proceed with testing.
If nπ0 < 5 or n(1-π0) < 5: call a statistician.
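The two checks are easy to automate. A minimal Python sketch follows; the function name `normal_approximation_ok` is hypothetical, and the second example (a 1% hypothesized proportion with only 100 observations) is an illustrative case rather than one taken from the slides.

```python
def normal_approximation_ok(n, pi0, minimum=5):
    """Return True when both n*pi0 and n*(1 - pi0) are at least `minimum`."""
    return n * pi0 >= minimum and n * (1 - pi0) >= minimum

# n = 100, pi0 = .50: both checks give 50, so testing can proceed
print(normal_approximation_ok(100, 0.50))

# n = 100, pi0 = .01: n*pi0 is only 1, so the sample is too small
print(normal_approximation_ok(100, 0.01))
```

A proportion near 0 or 1 needs a much larger sample than one near .50 before the normal distribution can be used.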
Statistical Terms
The Null and Alternative Hypothesis: The same three possibilities exist for the proportion as for the mean. We can test to see if the proportion in the population is less than some number, greater than some number, or different from some number. The main difference in the null and alternative hypotheses is that the number is now a percentage.
The Alpha Level: The alpha level is the acceptable chance of a type I error, that is, the chance that you will accept the alternative hypothesis when the null hypothesis is true.
Statistical Terms
The Test Statistic: Most test statistics have the same overall approach. They are a measure of how far a value is from a hypothesized value in terms of standard errors. In this module, we will be using the standard error of the sample proportion (as we are testing to see if the sample proportion is consistent with the hypothesized proportion).
The p-value: If the p-value shows support for the null hypothesis, then it will be large (or at least larger than α). If the p-value supports the alternative hypothesis, then it will be small (less than or equal to α).
The Assumption: Besides a random sample, we need enough observations to ensure that the normal distribution can be used.
Application Time
Let’s try this for real
Business Application Highlights
Read the business application on page 400.
First American Bank & Trust provides automobile loans to customers.
It is important that documentation be provided for each loan to prevent fraud and to comply with regulations.
Internal auditors check the compliance of files by evaluating a sample of the 22,500 outstanding loans. It is not possible to audit each loan file due to time and resource constraints.
No more than 1% of the loans can be out of compliance with the bank’s standards.
A sample of 600 files is taken and 9 files are found to be out of compliance.
The Approach
In the problem, we are trying to arrive at some conclusion about the population of files, but we only have information on 600 (which sounds like a lot, but that means that nearly 22,000 files were not analyzed). We must decide whether there is evidence that the proportion of all files that are out of compliance exceeds the acceptable level. The proportion may well be greater than the acceptable level in the sample, but we need conclusive evidence that the same condition would be true if we examined all of the 22,500 files.
We will start with the assumption that the null hypothesis (the files are within the acceptable compliance level) is true. We will look at the information in the data to see if there is evidence to support the alternative hypothesis (the files are not within the acceptable compliance level). The data will be the evidence, but we need more than compelling evidence… we need conclusive evidence.
Hypotheses:
H0: Files out of compliance ≤ 1%
HA: Files out of compliance > 1%
Using Statistics
The Null and Alternative Hypothesis: We are going to assume that the files are compliant (H0: π ≤ 0.01) and see if we can prove that they are out of compliance (HA: π > 0.01). The null hypothesis always has the “=” sign embedded in it, and we assume that the null hypothesis is true until proven otherwise.
The Alpha Level: The problem states that the alpha level should be set to 0.02, an unusual setting.
The p-value: Statistical calculations will tell us the likelihood that 9 out of 600 files would be found to be noncompliant if the null hypothesis is true (the files in total are in compliance). That is the p-value; it is a likelihood or a probability. If the p-value is small (i.e. the sample data is unusual if the null hypothesis is true), then we will conclude that there is more evidence to support the alternative hypothesis. If the p-value is large, then the evidence is not unlikely under the null hypothesis… or at least not unlikely enough for us to reject it outright. How small or how large is enough is determined by our required “level of significance”, which is our predetermined decision point.
The Assumptions: The assumptions are solely based on sample size, which is checked by requiring that π0n ≥ 5 and (1-π0)n ≥ 5. If those conditions hold, then the normal distribution can be used in the hypothesis test to calculate the p-value. Since n = 600 and π0 = 0.01, then π0n = 6 and (1-π0)n = 594, so the assumption holds.
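The bank example can be carried through the whole process in a few lines. The sketch below uses only the figures given in the problem (600 files sampled, 9 out of compliance, a 1% limit, alpha of 0.02); it assumes the normal approximation and uses Python's `statistics.NormalDist` in place of Excel's NORMSDIST. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n = 600        # files sampled
out = 9        # files found out of compliance
pi0 = 0.01     # hypothesized proportion (boundary of H0)
alpha = 0.02   # significance level stated in the problem

p_hat = out / n                          # sample proportion
std_error = sqrt(pi0 * (1 - pi0) / n)    # standard error under H0
ts = (p_hat - pi0) / std_error           # test statistic

# Upper-tailed test, the equivalent of =1-NORMSDIST(TS)
p_value = 1 - NormalDist().cdf(ts)

# Decision rule: reject H0 when the p-value is less than or equal to alpha
reject_null = p_value <= alpha
print(f"TS = {ts:.2f}, p-value = {p_value:.3f}, reject H0: {reject_null}")
```

Under these assumptions the test statistic is about 1.23 and the p-value is about 0.11, which is larger than 0.02. A sample with 1.5% of files out of compliance is therefore not conclusive evidence that more than 1% of all 22,500 files are out of compliance.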
Business Application Highlights
Read problem 9-42 on page 406.
The Investment Company Institute is interested in whether individuals fund their IRAs using a rollover from an employer-sponsored retirement plan.
When an employee leaves a company, the employee is no longer allowed to participate in the employer-sponsored retirement plan and must dispose of the money in the account somehow.
Some individuals take the money as a distribution, but that triggers a tax event which also incurs penalties payable to the IRS.
An alternative is to roll over the money in the account to an IRA, which prevents any tax or penalty from being payable. This is the wisest approach.
The question that should be asked of this study is whether these individuals even left a job that had an employer-sponsored retirement plan. If they did not, then the question about a rollover is moot.
Note that the basis for the hypothesized value, the 72%, is also from a sample and therefore has some variability in it. However, it is based on 3,500 observations, so the variation will be minimal.
The question posed is whether the proportion of individuals that roll over their employer-sponsored retirement plans is different in Miami than the 72% found in the country in general.
This result might be useful to companies that market IRAs.
Using Statistics
The Null and Alternative Hypothesis: We are going to assume that the proportion in Miami is the same as the rest of the country (H0: π = 0.72) and see if we can prove that the proportion is different (HA: π ≠ 0.72).
The Alpha Level: The problem states that the alpha level should be set to 0.10, a relatively high chance of a type I error.
The p-value: The p-value will tell us the likelihood that our result in the sample would be found if the null hypothesis is true (the proportion is the same in Miami as in the rest of the country).
The Assumptions: We can check the assumption by calculating π0n and (1-π0)n and comparing each to 5. If both are at least 5, the assumption holds. Since n = 90 and π0 = 0.72, then π0n = 64.8 and (1-π0)n = 25.2. We can continue using this approach.