
BUS7010 Applied Business Statistics

Week 5 Dr. Jenne Meyer. BUS7010 Applied Business Statistics. QNT/561 . Discuss syllabus Groundrules Introductions. Key Learnings. Descriptive Statistics and Probability Distributions Research and Sampling Designs Research Methods and Business Decisions Data Collection Data Analysis



Presentation Transcript


  1. Week 5 Dr. Jenne Meyer BUS7010 Applied Business Statistics

  2. QNT/561 • Discuss syllabus • Ground rules • Introductions

  3. Key Learnings • Descriptive Statistics and Probability Distributions • Research and Sampling Designs • Research Methods and Business Decisions • Data Collection • Data Analysis • Correlation, Linear Regression, and Multiple Regression Analysis

  4. What is research?

  5. What is research? Research is… • systematic, controlled, empirical, and critical investigation of hypothetical propositions about the presumed relations among phenomena. (University of Phoenix (Ed.). (2001). Statistics and research methods for managerial decisions [University of Phoenix Custom Edition e-text]. Cincinnati, OH). • systematic, controlled, empirical, and critical investigation of phenomena of interest to decision makers. (book definition). • systematic process of collecting and analyzing data or information in order to increase our understanding of the phenomena about which we are concerned or interested. (Leedy and Ormrod, 2001)

  6. What is research? Research is… a systematic, controlled, empirical, and critical investigation of hypothetical propositions about presumed relations among phenomena. • What happened • How it happened • Why it happened (give meaning)

  7. Why is the study of research useful to you? • Business research is the primary means of gathering data for decision making • Understanding of the research process can lead to better decisions • Research helps make better decisions • Understanding the process can help you ask the right questions

  8. Why is the study of research useful to you? • Identify and effectively solve minor problems in the work setting. • Know how to discriminate good from bad research. • Appreciate and be constantly aware of the multiple influences and multiple effects of factors impinging on a situation. • Take calculated risks in decision making, knowing full well the probabilities associated with the different possible outcomes. • Prevent possible vested interests from exercising their influence in a situation. • Relate to hired researchers and consultants more effectively. • Combine experience with scientific knowledge while making decisions.

  9. Terminology • Symbols • Σ (Uppercase Sigma) = Summation • μ (Mu) = Population mean • σ (Lowercase Sigma) = Standard deviation • π (Pi) = Probability of success in a binomial trial • ε (Epsilon) = Maximum allowable error • χ² (Chi Square) = Nonparametric hypothesis test • ! = Factorial • H0 = Null hypothesis • H1 = Alternate hypothesis

  10. Measure of Central Tendency • A single value that summarizes a set of data. It locates the center of the values • Arithmetic mean • Weighted mean • Median • Mode • Geometric mean

  11. Descriptive Statistics For Raw Data Measures of Central Tendency If $x_1, x_2, \ldots, x_n$ denote a sample of n observations, then the mean of the sample is called "x-bar" and is denoted by: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ The mean of a population is denoted by the Greek letter μ.

  12. Properties of arithmetic mean • Every set of interval data has a mean • All values are included • Mean is unique - only one • Useful to compare two or more populations • Sum of the deviations of each value from the mean will always be zero Disadvantage of arithmetic mean • Because the Mean is sensitive to extreme values, it may not always be a good representation of the data. • Cannot be computed for open-ended (range) data
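
The zero-sum property of deviations listed above can be checked with a minimal Python sketch. The data values are hypothetical and chosen only for illustration; they do not come from the slides.

```python
from statistics import mean

# Hypothetical sample (an assumption for illustration, not slide data)
values = [2, 4, 6, 8, 10]

x_bar = mean(values)                       # arithmetic mean of the sample
deviations = [x - x_bar for x in values]   # deviation of each value from the mean
total_deviation = sum(deviations)          # sums to zero for any data set

print(x_bar, deviations, total_deviation)
```

With non-integer data the sum may differ from zero by a tiny floating-point rounding error.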

  13. Descriptive Statistics For Raw Data Measures of Central Tendency • Example of “skewed” Mean: • Consider the annual incomes of five families in a neighborhood: $12K $12K $12K $13K $100K • The Mean income in this case: $29.8K • In this case, the Mean is pulled upward (“positively skewed”) by the high-value outlier, and it does not appear to best represent the income of this neighborhood • What we need in this case is a measurement that is less sensitive to large values; we can consider using the Median.

  14. Median • The midpoint of the values (exactly half are below, half are above) • If the number of observations is odd, the median is the “middle observation” • If the number of observations is even, the median is the mean or average of the two middle observations • Used when the mean is not representative due to high value outliers • Unique number • Not affected by extremely large or small values • Can be used with open-ended range values • Can be used for several measurement types
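
The odd and even cases, and the median's insensitivity to extreme values, can be sketched in Python. The five incomes are the ones from the neighborhood example; the four-family list and the $1,000K outlier are hypothetical variants added for illustration.

```python
from statistics import median

# Incomes in $K from the five-family neighborhood example
odd_sample = [12, 12, 12, 13, 100]

# Hypothetical variants (assumptions, not slide data)
even_sample = [12, 12, 13, 100]          # even number of observations
bigger_outlier = [12, 12, 12, 13, 1000]  # same data with a far larger top income

median_odd = median(odd_sample)              # the middle observation
median_even = median(even_sample)            # mean of the two middle observations
median_outlier = median(bigger_outlier)      # unchanged by the extreme value

print(median_odd, median_even, median_outlier)
```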

  15. Median of grouped data

  16. Median • Using Previous Examples: • Five Incomes: $12K $12K $12K $13K $100K • Median is: $12K (better representation of neighborhood) • (# of observations is odd, take the middle value = $12K)

  17. Mode • The value that appears most frequently • Five Incomes Example: $12K $12K $12K $13K $100K • Mode is: $12K • Can be used for any measurement type • Not affected by extremely large or small values • Sometimes it doesn’t exist • Sometimes it represents more than one value

  18. Different Central Measures Can Give Different Impressions • Consider Previous Example: Neighborhood Income • Mean income: $29.8K • Median income: $12K • Modal income: $12K • If you were trying to promote that this is an affluent neighborhood, you might prefer to report the mean income. • If you were trying to argue against a tax increase, you might argue that income is too low to afford a tax increase and report the median and/or the mode. • Note: these are three different measures, each valid and informative in its own way; like all statistics, they have the potential to inform or misinform!
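
The three figures quoted for the neighborhood can be reproduced with Python's standard `statistics` module. This is only a sketch; the variable names are illustrative.

```python
from statistics import mean, median, mode

# Annual incomes of the five families, in $K (from the slide example)
incomes_k = [12, 12, 12, 13, 100]

mean_income = mean(incomes_k)      # pulled upward by the $100K outlier
median_income = median(incomes_k)  # middle observation
modal_income = mode(incomes_k)     # most frequent value

print(mean_income, median_income, modal_income)
```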

  19. Skewness – Mean, Median, Mode

  20. Measures of Dispersion • Range • Mean deviation • Variance • Standard deviation • Range = highest value – lowest value • Mean deviation – the arithmetic mean of the absolute values of the deviations from the mean; it shows how far, on average, the values deviate from the mean • Variance – the arithmetic mean of the squared deviations from the mean; used to compare the dispersion of two or more sets of data • Standard deviation – the square root of the variance; it represents the spread or variability of the data, roughly the typical distance of a value from the center point

  21. Range • simplest measure of variability or spread • Range = Max value – Min value • Can give a misleading picture of the actual pattern of variation. Two distributions could have the same range but different patterns of variation. • Is sensitive to extreme data values
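
To illustrate how two distributions can share a range yet vary differently, here is a small Python sketch. Both data sets are hypothetical, and the helper names `value_range` and `mean_deviation` are made up for this example.

```python
from statistics import mean

def value_range(data):
    """Range = max value - min value."""
    return max(data) - min(data)

def mean_deviation(data):
    """Arithmetic mean of the absolute deviations from the mean."""
    center = mean(data)
    return sum(abs(x - center) for x in data) / len(data)

# Two hypothetical data sets with the same range but different spread
clustered = [0, 50, 50, 50, 100]    # most values sit at the center
spread_out = [0, 0, 50, 100, 100]   # most values sit at the extremes

print(value_range(clustered), value_range(spread_out))
print(mean_deviation(clustered), mean_deviation(spread_out))
```

The range is identical for both sets, while the mean deviation separates them.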

  22. Variance • Population variance =varp(…) • Sample variance =var(…)

  23. Standard Deviation • Population standard deviation =stdevp(…) • Sample standard deviation =stdev(…)
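
For readers working outside a spreadsheet, the same four quantities are available in Python's standard `statistics` module. The pairing with the spreadsheet functions named above is this sketch's assumption about what those functions compute (population versions divide by n, sample versions by n - 1); the data values are illustrative.

```python
from statistics import pvariance, variance, pstdev, stdev

# Illustrative data
data = [90, 90, 100, 110, 110]

population_variance = pvariance(data)  # divides by n      (cf. varp)
sample_variance = variance(data)       # divides by n - 1  (cf. var)
population_std_dev = pstdev(data)      # square root of the population variance (cf. stdevp)
sample_std_dev = stdev(data)           # square root of the sample variance     (cf. stdev)

print(population_variance, sample_variance, population_std_dev, sample_std_dev)
```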

  24. Sample Standard Deviation • The sample standard deviation is the most commonly used measure of dispersion in statistics

  25. Standard Deviation Example:
      Numbers | Mean | Standard Deviation
      100, 100, 100, 100, 100, 100 | 100 | 0
      90, 90, 100, 110, 110 | 100 | 10
      Computing the standard deviation: • find the mean (100) • find the deviation of each value from the mean (-10, -10, 0, 10, 10) • square the deviations (100, 100, 0, 100, 100) • sum the squared deviations (100+100+0+100+100 = 400) • divide the sum by the number of values minus 1 (5 – 1 = 4; 400/4 = 100), which gives the variance • take the square root of the variance (10) (Will be important in research when you are trying to determine the range of information.)
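
The six steps above can be followed one at a time in Python. This is a sketch of the slide's hand calculation for the values 90, 90, 100, 110, 110; the variable names are illustrative.

```python
import math

values = [90, 90, 100, 110, 110]
n = len(values)

mean_value = sum(values) / n                      # step 1: find the mean
deviations = [x - mean_value for x in values]     # step 2: deviation of each value from the mean
squared = [d ** 2 for d in deviations]            # step 3: square the deviations
sum_squared = sum(squared)                        # step 4: sum the squared deviations
sample_variance = sum_squared / (n - 1)           # step 5: divide by (number of values - 1)
sample_std_dev = math.sqrt(sample_variance)       # step 6: square root of the variance

print(mean_value, deviations, sum_squared, sample_variance, sample_std_dev)
```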

  26. Coefficient of Variation To compare dispersion in data sets with dissimilar units of measurement (e.g., kilograms and ounces) or dissimilar means (e.g., home prices in two different cities) we define the coefficient of variation (CV), which is a unit-free measure of dispersion: $CV = 100 \times \frac{s}{\bar{x}}$
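
A short Python sketch shows why the CV is unit-free. It assumes the usual definition, CV = 100 × s / x̄ (sample standard deviation as a percentage of the mean). The weights are hypothetical, and `coefficient_of_variation` is a helper name made up for this example.

```python
from statistics import mean, stdev

def coefficient_of_variation(data):
    """CV = 100 * s / x-bar, expressed as a percentage (assumed definition)."""
    return 100 * stdev(data) / mean(data)

# Hypothetical weights: the same five items measured in two different units
weights_kg = [9, 9, 10, 11, 11]
weights_g = [w * 1000 for w in weights_kg]

cv_kg = coefficient_of_variation(weights_kg)
cv_g = coefficient_of_variation(weights_g)

print(cv_kg, cv_g)  # the two values agree despite the change of units
```

The standard deviations differ by a factor of 1,000, but the ratio to the mean does not change.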

  27. Coefficient of Variation • Two Investments A & B • Which should I pick? • Choices:

  28. Frequency curves • Normal distribution

  29. Central Limit Theorem / Chebyshev’s Theorem • If all samples of a particular size are selected from any population, the sampling distribution of the sample mean is approximately a normal distribution. This approximation improves with larger samples (the larger the sample, the more closely the sampling distribution resembles a normal distribution).

  30. Central Limit Theorem / Chebyshev’s Theorem

  31. Central Limit Theorem / Chebyshev’s Theorem
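
The Central Limit Theorem can be explored with a small simulation. The sketch below is only an illustration under stated assumptions: the population is taken to be exponential with mean 1 (deliberately skewed), and the sample sizes, number of samples, and function name `sample_means` are arbitrary choices. It checks the center and spread of the sample means; plotting a histogram of each list would show the bell shape emerging.

```python
import random
from statistics import mean, stdev

random.seed(0)  # fixed seed so repeated runs give the same numbers

def sample_means(sample_size, num_samples=2000):
    """Return the means of num_samples random samples from a skewed population."""
    means = []
    for _ in range(num_samples):
        # Assumed population: exponential with mean 1 (clearly not bell-shaped)
        sample = [random.expovariate(1.0) for _ in range(sample_size)]
        means.append(sum(sample) / sample_size)
    return means

means_small = sample_means(5)    # small samples
means_large = sample_means(50)   # larger samples

# The sample means center on the population mean (1.0), and their spread
# shrinks as the sample size grows.
print(round(mean(means_small), 2), round(stdev(means_small), 2))
print(round(mean(means_large), 2), round(stdev(means_large), 2))
```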

  32. Probability • The “chance” or “likelihood” of something happening • a value between zero and one • zero = “cannot happen”; one = “sure to happen” • expressed as a decimal or fraction • [Figure: scale of increasing likelihood of occurrence, from 0 to 1] • Probability of .5: the occurrence of the event is just as likely as it is unlikely.

  33. Probability • Discrete Probability (discrete random variables): • fixed number of clearly separated outcomes • examples: rolling a die (6 outcomes); coin flip (2 outcomes) • Binomial Probability • Continuous Probability (continuous random variables): • infinite number of outcomes within a certain range • example: life expectancies • Workshop 4: find probabilities under bell shaped curve

  34. Example of a Discrete Probability Distribution • Probabilities are individual, singular, and unique values; the number of outcomes is limited; graphed as bars or rectangles • Not a smooth curve…unless sample size gets large... • [Figure: bar chart; y-axis: Probability P(x); x-axis: Number of Cars Sold on a Saturday, x]

  35. Example of a Continuous Distribution: The “Standard” Normal Curve • Probabilities: • are the area under the standard normal curve • can be an infinite number of values within a certain range • “Z” is a calculated value, indicating the number of standard deviations from the mean.

  36. Probability Definitions • Experiment: is a process involving chance or probability that leads to results called outcomes. • Outcome: is the result of a single trial of an experiment. • Event: is one or more outcomes of an experiment. • Sample space: the set of all possible outcomes from an experiment.

  37. Probability Definitions • Independent Event: if the probability of one event is not affected or changed by another • Example: Sampling With Replacement - taking random samples from a population, then replacing the random sample before taking another. As a result, each random sample is not affected by another. The population remains with all data intact. • Dependent Event: if the probability of one event IS affected or changed by another • Example: Sampling Without Replacement - take a random sample from a population, then do not replace the sample before taking another. As a result, each sample taken this way will affect each other. Each removed sample changes the characteristics of the population. • Trial: the act of testing something
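
The difference between sampling with and without replacement can be worked through with exact fractions. The box of 10 items with 3 defective ones is a hypothetical example, not slide data.

```python
from fractions import Fraction

# Hypothetical population: 10 items, 3 of them defective
defective, total = 3, 10

# With replacement: the second draw sees the same population (independent events)
p_both_with = Fraction(defective, total) * Fraction(defective, total)

# Without replacement: the first draw changes the population (dependent events)
p_both_without = Fraction(defective, total) * Fraction(defective - 1, total - 1)

print(p_both_with, p_both_without)
```

The probability of drawing two defective items is lower without replacement because the first draw removes one defective item from the population.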

  38. Approaches to Probability • Classical probability - • the outcomes of an experiment are equally likely. • Using this classical viewpoint, P(event) = (number of favorable outcomes) / (total number of possible outcomes)

  39. EXAMPLE: Classical Probability Experiment: A spinner has 4 equal sectors colored yellow, blue, green, and red. After spinning the spinner, what is the probability of landing on each color? Outcomes: The possible outcomes of this experiment are yellow, blue, green, and red. Probabilities: • P(yellow) = (number of ways to land on yellow) / (total number of colors) = 1/4 • P(blue) = (number of ways to land on blue) / (total number of colors) = 1/4 • P(green) = (number of ways to land on green) / (total number of colors) = 1/4 • P(red) = (number of ways to land on red) / (total number of colors) = 1/4
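
The spinner example translates directly into code. In this sketch, `classical_probability` is a made-up helper that divides favorable outcomes by total outcomes, which is valid only under the classical assumption that all outcomes are equally likely.

```python
from fractions import Fraction

def classical_probability(favorable, total):
    """Favorable outcomes divided by total equally likely outcomes."""
    return Fraction(favorable, total)

colors = ["yellow", "blue", "green", "red"]  # four equal sectors

# Each color can occur in exactly one way out of four
probabilities = {color: classical_probability(1, len(colors)) for color in colors}

for color, p in probabilities.items():
    print(color, p)

total_probability = sum(probabilities.values())  # probabilities over the sample space sum to 1
```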

  40. Chapter 7: Continuous Distributions

  41. Continuous Variables • Discrete Variable – each value of X has its own probability P(X). • Continuous Variable – events are intervals and probabilities are areas underneath smooth curves. A single point has no probability.

  42. Describing a Continuous Distribution Probability Density Function (PDF) – For a continuous random variable, the PDF is an equation that shows the height of the curve f(x) at each possible value of X over the range of X. [Figure: Normal PDF]

  43. Describing a Continuous Distribution Continuous PDFs: • Denoted f(x) • Total area under curve = 1 • Mean, variance and shape depend on the PDF parameters • Reveals the shape of the distribution • [Figure: Normal PDF]

  44. Describing a Continuous Distribution Probabilities as Areas Continuous probability functions are smooth curves. • Unlike discrete distributions, the area at any single point = 0. • The entire area under any PDF must be 1. • Mean is the balance point of the distribution.

  45. Normal Distribution • Normal PDF f(x) reaches a maximum at μ and has points of inflection at μ ± σ • Bell-shaped curve

  46. Standard Normal Distribution Since for every value of μ and σ there is a different normal distribution, we transform a normal random variable to a standard normal distribution with μ = 0 and σ = 1 using the formula: $z = \frac{x - \mu}{\sigma}$ Denoted N(0,1)
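
The standardizing transformation, z = (x - μ) / σ, is a one-line calculation. The sketch below uses hypothetical parameters (μ = 100, σ = 10), and `z_score` is an illustrative helper name.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma

# Hypothetical normal distribution: mean 100, standard deviation 10
mu, sigma = 100, 10

z_at_mean = z_score(100, mu, sigma)   # the mean itself maps to z = 0
z_above = z_score(110, mu, sigma)     # one standard deviation above the mean
z_below = z_score(85, mu, sigma)      # one and a half standard deviations below

print(z_at_mean, z_above, z_below)
```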

  47. Standard Normal Distribution A common scale from -3 to +3 is used. Entire area under the curve is unity. The probability of an event P(z1 < Z < z2) is a definite integral of f(z). However, standard normal tables or Excel functions can be used to find the desired probabilities.

  48. Standard Normal Distribution

  49. Standard Normal Distribution Now find P(Z > 1.96): .5000 - .4750 = .0250
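
As an alternative to a printed table, the areas around z = 1.96 can be computed with `NormalDist` from Python's standard library (Python 3.8 or later). The table value .4750 is the area between the mean and z = 1.96, so the area above 1.96 is .5000 - .4750 = .0250 and the area below it is .5000 + .4750 = .9750. The sketch below reproduces those figures; the variable names are illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)   # N(0,1)
z = 1.96

area_below = standard_normal.cdf(z)           # P(Z < 1.96)
area_mean_to_z = area_below - 0.5             # P(0 < Z < 1.96), the table value
area_above = 1 - area_below                   # P(Z > 1.96), the upper tail
area_between = standard_normal.cdf(z) - standard_normal.cdf(-z)  # P(-1.96 < Z < 1.96)

print(round(area_below, 4), round(area_mean_to_z, 4),
      round(area_above, 4), round(area_between, 4))
```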
