
Statistical Analysis Graphical Techniques


johnna

Presentation Transcript


  1. PROBABILITY AND STATISTICS FOR SCIENTISTS AND ENGINEERS
     Statistical Analysis Graphical Techniques

  2. Graphical Techniques
     • Time Series Graph or Run Chart
     • Box Plot
     • Histogram and Relative Frequency Histogram
     • Probability Plotting

  3. Time Series Graph or Run Chart
     • A plot of the data set x1, x2, …, xn in the order in which the data were obtained
     • Used to detect trends or patterns in the data over time
     • Provides a basis for treating a set of data as a random sample

  4. Box Plot
     A pictorial summary used to describe the most prominent statistical features of the data set x1, x2, …, xn, including its:
     • Center or location
     • Spread or variability
     • Extent and nature of any deviation from symmetry
     • ‘Outliers’, if any

  5. Box Plot - Continued
     Shows only certain statistics rather than all the data, namely:
     - mean
     - median
     - quartiles
     - smallest and greatest values in the sample
     A box plot gives an immediate visual impression of the:
     • Center
     • Spread
     • Overall range of the data

  6. Box Plot - Example
     Given the following random sample of size 25:
     38, 10, 60, 90, 88, 96, 1, 41, 86, 14, 25, 5, 16, 22, 29, 34, 55, 36, 37, 36, 91, 47, 43, 30, 98
     Arranged in order from least to greatest:
     1, 5, 10, 14, 16, 22, 25, 29, 30, 34, 36, 36, 37, 38, 41, 43, 47, 55, 60, 86, 88, 90, 91, 96, 98

  7. Box Plot – Example Continued
     First, find the median, the value exactly in the middle of the ordered set of numbers. With 25 values this is the 13th value, so the median is 37.
     Next, consider only the values to the left of the median:
     1, 5, 10, 14, 16, 22, 25, 29, 30, 34, 36, 36
     The median of these 12 values is the average of the 6th and 7th values, (22 + 25)/2 = 23.5, which is the lower quartile.

  8. Box Plot - Example Continued
     Now consider the values to the right of the median:
     38, 41, 43, 47, 55, 60, 86, 88, 90, 91, 96, 98
     The median of this set is (60 + 86)/2 = 73, which is the upper quartile.
     The interquartile range (IQR) is the difference between the upper and lower quartiles: 73 - 23.5 = 49.5.
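The arithmetic in this example can be checked with a short Python sketch. It assumes the median-of-halves rule used in the slides (the median itself is left out of both halves when n is odd); other quartile conventions exist and can give slightly different values. The function name `box_plot_summary` is illustrative only.

```python
from statistics import mean, median

# Random sample of size 25 from the slides.
sample = [38, 10, 60, 90, 88, 96, 1, 41, 86, 14, 25, 5, 16,
          22, 29, 34, 55, 36, 37, 36, 91, 47, 43, 30, 98]

def box_plot_summary(data):
    """Box plot statistics using the median-of-halves rule from the slides."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]           # values to the left of the median
    upper = ordered[half + n % 2:]   # values to the right (median skipped when n is odd)
    q1, q3 = median(lower), median(upper)
    return {
        "lower_extreme": ordered[0],
        "lower_quartile": q1,
        "median": median(ordered),
        "mean": mean(ordered),
        "upper_quartile": q3,
        "upper_extreme": ordered[-1],
        "iqr": q3 - q1,
    }

summary = box_plot_summary(sample)
for name, value in summary.items():
    print(f"{name}: {value}")
```

The printed quartiles, median and interquartile range should agree with the values worked out by hand above.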

  9. Box Plot - Example Continued
     [Figure: box plot on an axis from 0 to 100, labeled with the lower extreme, lower quartile, median, mean, upper quartile, and upper extreme.]
     Lower quartile = 23.5
     Mean = 45.1
     Median = 37
     Upper quartile = 73
     Interquartile range = 49.5

  10. Histogram Definition
      A graph of the observed frequencies in the data set x1, x2, …, xn versus data magnitude, used to visually indicate its statistical properties, including:
      - shape
      - location or central tendency
      - scatter or variability

  11. Guidelines for Constructing Histograms – Discrete Data
      If the data x1, x2, …, xn are from a discrete random variable with possible values y1, y2, …, yk, count the number of occurrences of each value of y and associate the frequency fi with yi, for i = 1, …, k. Note that $\sum_{i=1}^{k} f_i = n$.
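A minimal Python sketch of this counting step follows. The data list is hypothetical (the table for the slides' cars-per-household example is not available here), and `frequency_table` is an illustrative name. It also checks that the frequencies add up to the sample size n.

```python
from collections import Counter

# Hypothetical sample: number of cars in each of 20 households (NOT the slides' data).
cars = [1, 2, 0, 1, 3, 2, 1, 1, 2, 0, 4, 2, 1, 2, 3, 1, 2, 1, 0, 2]

def frequency_table(data):
    """Return {y_i: (f_i, f_i / n)} for discrete data, sorted by value."""
    n = len(data)
    counts = Counter(data)
    return {y: (f, f / n) for y, f in sorted(counts.items())}

table = frequency_table(cars)
for y, (f, rel) in table.items():
    print(f"y = {y}: frequency = {f}, relative frequency = {rel:.2f}")

total = sum(f for f, _ in table.values())
print("sum of frequencies:", total, "sample size:", len(cars))
```

Plotting the frequencies (or the relative frequencies) against the values of y gives the frequency (or relative frequency) histogram.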

  12. Histograms - Example
      Construct a frequency and relative frequency histogram for the “number of cars per household” data.

  13. Histograms – Example Continued
      [Figure: frequency histogram. x-axis: number of cars per household (0 to 5); y-axis: frequency (0 to 25).]

  14. Histograms – Example Continued
      [Figure: relative frequency histogram. x-axis: number of cars per household (0 to 5); y-axis: relative frequency (0 to 0.5).]

  15. Guidelines for Constructing Histograms – Continuous Data
      If the data x1, x2, …, xn are from a continuous random variable:
      • Select the number of intervals or cells, r, to be a number between 3 and 20. As an initial value use $r = \sqrt{n}$, where n is the number of observations.
      • Establish r intervals of equal width, starting just below the smallest value of x.
      • Count the number of values of x within each interval to obtain the frequency associated with each interval.
      • Construct the graph by plotting the frequency fi against interval i, for i = 1, 2, …, r.
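These guidelines can be sketched in Python as follows. The data are hypothetical, the function name `histogram_counts` is illustrative, and for simplicity the first interval starts exactly at the smallest observation rather than just below it.

```python
import math

def histogram_counts(data, r=None):
    """Count observations in r equal-width intervals.

    If r is not given, use round(sqrt(n)), kept within the suggested range 3 to 20.
    """
    n = len(data)
    if r is None:
        r = min(20, max(3, round(math.sqrt(n))))
    lo, hi = min(data), max(data)
    if hi == lo:
        raise ValueError("all observations are equal; no interval width can be set")
    width = (hi - lo) / r
    counts = [0] * r
    for x in data:
        # The largest observation is placed in the last interval.
        i = min(int((x - lo) / width), r - 1)
        counts[i] += 1
    edges = [lo + j * width for j in range(r + 1)]
    return edges, counts

# Hypothetical measurements (NOT the battery data from the slides).
values = [1.6, 1.9, 2.2, 2.5, 2.6, 3.0, 3.1, 3.1,
          3.2, 3.3, 3.4, 3.5, 3.7, 3.8, 4.1, 4.7]

edges, counts = histogram_counts(values)
n = len(values)
for j, f in enumerate(counts):
    print(f"[{edges[j]:.3f}, {edges[j + 1]:.3f}): f = {f}, relative f = {f / n:.3f}")
```

In practice the interval width is usually rounded to a convenient value, as in the battery example, where a class width of 0.5 is used.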

  16. Histogram – Continuous Data - Example
      To illustrate the construction of a relative frequency distribution, consider the following data, which represent the lives of 40 car batteries of a given type, recorded to the nearest tenth of a year. The batteries were guaranteed to last 3 years.

  17. Histogram – Continuous Data – Example Continued
      For this example, using the guidelines for constructing a histogram, the number of classes selected is 7, with a class width of 0.5. The frequency and relative frequency distributions for the data are shown in the following table.

  18. Histogram and Relative Frequency
      The following diagram is a relative frequency histogram of the battery lives, with an estimate of the probability density function superimposed.

  19. Probability Plotting Concept
      Data from a random sample are plotted on special graph paper designed for a particular distribution:
      - Normal
      - Lognormal
      - Weibull
      - Exponential
      The goodness of fit of each probability distribution to the sample data is judged by how closely a straight line on the probability paper for that distribution fits the plotted data.

  20. Benard’s Approximation of F(x)
      $\hat{F}(x_i) = \dfrac{i - 0.3}{n + 0.4}$, where n is the sample size and i is the sample order number.
      Benard’s approximation is used to estimate median ranks. Median ranks represent the 50% confidence level (“best estimate”) for the true value of F(x), based on the sample size and the order number (first, second, etc.) of the data. Tables of median ranks can be found in many statistics books.
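As a small illustration, Benard's approximation, (i - 0.3)/(n + 0.4), can be written as a one-line Python function. The function name and the sample size n = 5 below are illustrative assumptions, not taken from the slides.

```python
def benard_median_rank(i, n):
    """Benard's approximation to the median rank of the i-th smallest of n observations."""
    if not 1 <= i <= n:
        raise ValueError("order number i must satisfy 1 <= i <= n")
    return (i - 0.3) / (n + 0.4)

n = 5  # hypothetical sample size
ranks = [benard_median_rank(i, n) for i in range(1, n + 1)]
for i, rank in enumerate(ranks, start=1):
    print(f"i = {i}: estimated F = {rank:.4f}")
```

The estimates increase with the order number and are symmetric about 0.5, so the first and last values add up to 1.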

  21. Probability Plotting Procedure
      Step 1: Obtain special graph paper, known as probability paper, designed for the distribution under examination. Weibull, Lognormal and Normal paper are available at: http://www.weibull.com/GPaper/index.htm
      Step 2: Rank the values from the random sample, x1, x2, …, xn, from smallest to largest in magnitude, i.e., y1 < y2 < ... < yn, where y1 = min(x1, x2, …, xn), etc.

  22. Probability Plotting General Procedure - Continued
      Step 3: Plot the ordered xi’s on the selected probability paper versus their estimated cumulative probabilities $\hat{F}(x_i)$, i.e., plot the ordered pairs (x1, $\hat{F}(x_1)$), (x2, $\hat{F}(x_2)$), …, (xn, $\hat{F}(x_n)$).
      Step 4: If a straight line appears to fit the data, draw a line on the graph ‘by eye’.
      Step 5: Assess the goodness of the straight-line fit to the data.

  23. Probability Plotting - Example
      The points on Weibull probability paper fall very close to the “best fit” straight line, indicating that the Weibull distribution provides a good fit to the data. Continue the process for the normal, lognormal and exponential distributions. Then select the probability distribution that provides the best fit to the sample data as the likely population for this sample.

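  24. Weibull Probability Paper
      If the cumulative probability distribution function is
      $F(t) = 1 - \exp\left[-\left(\dfrac{t}{\eta}\right)^{\beta}\right]$, with shape parameter $\beta$ and scale parameter $\eta$,
      we now need to linearize this function into the form y = ax + b.

  25. Weibull Probability Plotting Paper
      Rearranging and taking natural logarithms twice gives
      $\ln\ln\left[\dfrac{1}{1 - F(t)}\right] = \beta \ln(t) - \beta \ln(\eta)$,
      which is a linear equation with a slope of $\beta$ and an intercept of $-\beta \ln(\eta)$. Now the x- and y-axes of the Weibull probability plotting paper can be constructed. The x-axis is simply logarithmic, since x = ln(t), and the y-axis is scaled according to $y = \ln\ln\left[\dfrac{1}{1 - F(t)}\right]$.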
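The linearization can be checked numerically. The sketch below assumes the two-parameter Weibull form F(t) = 1 - exp[-(t/eta)^beta], with shape beta and scale eta; these symbol names and the parameter values are assumptions chosen for illustration. Points taken from the CDF are mapped to x = ln(t), y = ln ln[1/(1 - F(t))] and should then lie on a line with slope beta and intercept -beta ln(eta).

```python
import math

def weibull_cdf(t, beta, eta):
    """Two-parameter Weibull CDF with shape beta and scale eta (assumed form)."""
    return 1.0 - math.exp(-((t / eta) ** beta))

def to_weibull_paper(t, F):
    """Map (t, F) to the linearized coordinates x = ln(t), y = ln(ln(1 / (1 - F)))."""
    return math.log(t), math.log(math.log(1.0 / (1.0 - F)))

beta, eta = 2.0, 50.0  # hypothetical parameter values
points = [to_weibull_paper(t, weibull_cdf(t, beta, eta)) for t in (10, 20, 40, 80)]

# Slope between consecutive points, and the intercept implied by each point.
slopes = [(y2 - y1) / (x2 - x1) for (x1, y1), (x2, y2) in zip(points, points[1:])]
intercepts = [y - beta * x for x, y in points]

print("slopes:", [round(s, 6) for s in slopes])
print("intercepts:", [round(b, 6) for b in intercepts])
print("expected intercept -beta*ln(eta):", round(-beta * math.log(eta), 6))
```

This is why the vertical axis of Weibull paper is spaced according to ln ln[1/(1 - F)] while the horizontal axis is logarithmic.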

  26. Weibull Probability Plotting Paper
      [Figure: Weibull probability plotting paper. x-axis: t; y-axis: cumulative probability F(t) (in %).]

  27. Probability Plotting - Example
      To illustrate the process, let 30, 10, 80, 50, 20, and 40 be a random sample of size n = 6. Arrange the sample data in numerical order: 10, 20, 30, 40, 50, 80

  28. Probability Plotting – Example Continued
      Based on Benard’s approximation, we can now calculate $\hat{F}(x)$ for each observed value of x. For example, for x2 = 20 (i = 2, n = 6),
      $\hat{F}(x_2) = \dfrac{2 - 0.3}{6 + 0.4} = \dfrac{1.7}{6.4} \approx 0.266$

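  29. Probability Plotting – Example Continued
      In summary, the probability plotting points, computed from Benard’s approximation (i - 0.3)/(n + 0.4) with n = 6, are:

      i    xi    $\hat{F}(x_i)$
      1    10    0.109
      2    20    0.266
      3    30    0.422
      4    40    0.578
      5    50    0.734
      6    80    0.891

      Now that we have y-coordinate values to go with the x-coordinate sample values, we can plot the points on probability paper for each probability distribution of interest.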

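  30. Probability Plotting - Example
      The process is illustrated using Weibull probability paper. Plot the points (xi, $\hat{F}(x_i)$) for i = 1, 2, …, 6 as follows:
      [Figure: the six points plotted on Weibull probability paper. y-axis: $\hat{F}(x)$ (in %).]

  31. Probability Plotting – Example Continued
      Determine a “best fit” straight line to the plotted points as follows:
      [Figure: the plotted points with a fitted straight line on Weibull probability paper. y-axis: $\hat{F}(x)$ (in %).]
      The line represents the estimated relationship between x and F(x).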
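The slides draw the line by eye on probability paper. As a rough numerical stand-in, the Python sketch below fits the line by ordinary least squares instead, which is a different technique from the graphical one. It assumes the two-parameter Weibull form F(t) = 1 - exp[-(t/eta)^beta], so that the slope of the line estimates the shape beta and the intercept equals -beta ln(eta); the function name `weibull_line_fit` is illustrative.

```python
import math

sample = [30, 10, 80, 50, 20, 40]  # random sample of size 6 from the slides

def weibull_line_fit(data):
    """Least-squares line through the Weibull plotting points.

    Returns (beta, eta, points), where points are (x_i, estimated F(x_i)) pairs.
    """
    ordered = sorted(data)
    n = len(ordered)
    # Benard's approximation for the median rank of each ordered value.
    points = [(t, (i - 0.3) / (n + 0.4)) for i, t in enumerate(ordered, start=1)]
    # Linearized coordinates: u = ln(t), v = ln(-ln(1 - F)).
    us = [math.log(t) for t, _ in points]
    vs = [math.log(-math.log(1.0 - F)) for _, F in points]
    u_bar, v_bar = sum(us) / n, sum(vs) / n
    slope = (sum((u - u_bar) * (v - v_bar) for u, v in zip(us, vs))
             / sum((u - u_bar) ** 2 for u in us))
    intercept = v_bar - slope * u_bar
    beta = slope                        # shape parameter = slope of the line
    eta = math.exp(-intercept / slope)  # scale parameter, from intercept = -beta * ln(eta)
    return beta, eta, points

beta, eta, points = weibull_line_fit(sample)
for t, F in points:
    print(f"x = {t:3d}   estimated F(x) = {100 * F:5.1f}%")
print(f"estimated shape beta = {beta:.2f}, estimated scale eta = {eta:.1f}")
```

A line drawn by eye will generally give somewhat different parameter estimates; the least-squares fit is only one convenient way to make the step reproducible.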

  32. Probability Paper - Normal

  33. Probability Paper - Lognormal

  34. Probability Paper - Exponential

  35. Probability Plotting – Other Examples

  36. Probability Plotting – Other Examples Continued

  37. Example - 40 Specimens
      40 specimens are cut from a plate for tensile tests. The tensile tests were made, resulting in tensile strength, x, as follows:
      Perform a statistical analysis of the tensile strength data.

  38. 40 Specimens - Time Series Plot
      [Figure: time series plot. x-axis: order in which the data were obtained; y-axis: x.]
      By visual inspection of the time series plot, there seems to be no trend. Therefore, the sample appears to be a random sample.

  39. 40 Specimens - Descriptive Statistics

  40. 40 Specimens - Histogram
      Using the histogram feature of MS Excel, the following data were calculated:
      [Figure: Histogram of Tensile Strength.]

  41. 40 Specimens – Box Plot
      [Figure: box plot on an axis from 40 to 65, labeled with the lower extreme, lower quartile, median, mean, upper quartile, and upper extreme.]
      lower quartile = 49.45
      mean = 52.6
      median = 53.03
      upper quartile = 55.3
      interquartile range = 5.86

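  42. 40 Specimens – Normal Probability Plot (by using Minitab)
      [Figure: normal probability plot showing $\hat{F}(X)$ and F(X).]
      Source: http://www.minitab.com/en-US/default.aspx

  43. 40 Specimens – Lognormal Probability Plot
      [Figure: lognormal probability plot showing $\hat{F}(X)$ and F(X).]

  44. 40 Specimens – Weibull Probability Plot
      [Figure: Weibull probability plot showing $\hat{F}(X)$ and F(X).]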

  45. 40 Specimens
      The tensile strength probability distribution can be estimated by $\hat{F}(x)$ and $\hat{f}(x)$.

  46. Summary Conclusions

  47. Summary Conclusions Continued
      • Symmetry is indicated by the sample data.
      • The kurtosis of 2.5 is relatively close to that of the normal distribution, which is 3.
      • The normal probability plot provides a good fit to the sample data.
      Hence, it is concluded that the data represent a random sample from a normal distribution.

  48. Plot the Normal Probability Using Matlab
      • 1. Open Matlab
      • 2. Type: x=[
      • 3. Copy your sample data into the bracket (you can copy the data straight from Excel or type it)
      • 4. Close the bracket ] and hit Enter
      • 5. Type h=normplot(x)
      • 6. Hit Enter
      • 7. A new window with the plot will appear (figure 1)

Sara Collins
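For readers without Matlab, the coordinates of a normal probability plot can also be computed with the Python standard library. This is a rough sketch, not a reproduction of `normplot`: it uses Benard's approximation for the plotting positions, which may differ from the positions Matlab uses, and the function name `normal_plot_points` is illustrative. Each ordered observation is paired with the standard normal quantile of its estimated cumulative probability; if the data are roughly normal, the pairs fall close to a straight line.

```python
from statistics import NormalDist

def normal_plot_points(data):
    """Pair each ordered observation with the standard normal quantile of its median rank."""
    ordered = sorted(data)
    n = len(ordered)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return [
        (x, std_normal.inv_cdf((i - 0.3) / (n + 0.4)))  # Benard's approximation
        for i, x in enumerate(ordered, start=1)
    ]

x = [30, 10, 80, 50, 20, 40]  # the size-6 sample used in the earlier slides
points = normal_plot_points(x)
for value, z in points:
    print(f"x = {value:3d}   normal quantile z = {z:+.3f}")
```

Plotting the observations against the z values (with any plotting tool) gives the same picture as plotting on normal probability paper.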
