Analyzing Sample Distributions through USSCSI: Key Characteristics and Visual Methods
This discussion outlines how to analyze sample distributions effectively using the USSCSI method. Key areas of focus include initial visual interpretation, symmetry, center, spread, and identifying unusual features such as gaps and outliers. Various graphical representations, including boxplots, histograms, and dot plots, are recommended for displaying data characteristics. Each type of plot has its own suitability and limitations, particularly concerning data shape and outlier identification. The discussion concludes with an assignment to explore standard deviation.
Presentation Transcript
Discussion of a sample - using USSCSI
When describing a sample distribution, use these headings:
- Initial Visual Interpretation
- Shape (Symmetry, Skew, Tail)
- Centre
- Spread
- Shift/Overlap
- Unusual Features (gaps, outliers)
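As a rough illustration of the Centre, Spread and Shift/Overlap headings, the sketch below summarises two samples and checks whether their middle 50% overlap. The data and the helper names (`summarise`, `boxes_overlap`) are made up for this example, and using the median and interquartile range is an assumption; the slides do not prescribe particular statistics.

```python
from statistics import quantiles

def summarise(sample):
    """Return the median (centre) and IQR (spread) along with the quartiles."""
    q1, med, q3 = quantiles(sample, n=4, method="inclusive")
    return {"q1": q1, "median": med, "q3": q3, "iqr": q3 - q1}

def boxes_overlap(a, b):
    """True when the middle 50% of the two samples share any values."""
    return a["q1"] <= b["q3"] and b["q1"] <= a["q3"]

# Hypothetical data for two groups (not from the slides).
group_a = [12, 14, 15, 15, 16, 18, 19, 21, 22]
group_b = [17, 19, 20, 22, 23, 23, 25, 26, 28]

a, b = summarise(group_a), summarise(group_b)
shift = b["median"] - a["median"]  # how far group B's centre sits above group A's
print("Centre:", a["median"], b["median"])
print("Spread (IQR):", a["iqr"], b["iqr"])
print("Shift:", shift, "Overlap of middle 50%:", boxes_overlap(a, b))
```

Here group B's centre is shifted 7 units above group A's, and the two middle-50% boxes do not overlap.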
When describing dot plots, discuss symmetry, skew, outliers, gaps and clusters
• Also consider possible reasons or causes of these features (in context)
• Symmetry - Is the data symmetrical either side of the median, or is one side more spread out?
• Outliers - Decide whether unusual points really are outliers. Will they affect conclusions drawn from the data?
• Gaps - A small gap is of no concern, but a large gap may reflect separate groups or underlying variables
• Clusters - A bunching of data in one region
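The symmetry and gap questions above can be checked numerically as well as by eye. This is a minimal sketch with invented data; `side_spreads` and `largest_gap` are hypothetical helper names, and deciding whether a gap counts as "large" is still a judgement made in context.

```python
from statistics import median

def side_spreads(sample):
    """Distance from the median down to the minimum and up to the maximum."""
    m = median(sample)
    return m - min(sample), max(sample) - m

def largest_gap(sample):
    """Return (gap size, lower value, upper value) for the widest space between neighbours."""
    ordered = sorted(sample)
    return max((hi - lo, lo, hi) for lo, hi in zip(ordered, ordered[1:]))

# Hypothetical heights in cm (not from the slides).
heights = [150, 152, 153, 155, 156, 157, 158, 172, 174, 175]

left, right = side_spreads(heights)
print("Spread below median:", left, "above median:", right)
print("Largest gap:", largest_gap(heights))
```

For this sample the upper side is far more spread out than the lower side, and the 14 cm gap between 158 and 172 suggests two groups, which is the kind of feature worth explaining in context.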
Graphs for displaying distributions
Boxplot
Suitable for:
• All types of numerical data
• Displaying key features of distributions
• Showing skew
• Displaying potential outliers (plotted as points beyond the whiskers)
Caution – lacks detail and does not show the shape of the distribution well.
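To see what a boxplot actually draws, the sketch below computes the values behind it. It assumes the common 1.5 × IQR rule for deciding which points fall beyond the whiskers; the slides do not state which rule is used, and software packages differ slightly in how they compute quartiles. The data and the function name `boxplot_summary` are invented for illustration.

```python
from statistics import quantiles

def boxplot_summary(sample, k=1.5):
    """Quartiles, whisker ends and potential outliers, assuming the k * IQR fence rule."""
    q1, med, q3 = quantiles(sample, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - k * iqr, q3 + k * iqr
    # Whiskers reach the most extreme values that are still inside the fences.
    inside = [x for x in sample if low_fence <= x <= high_fence]
    outliers = sorted(x for x in sample if x < low_fence or x > high_fence)
    return {
        "whisker_low": min(inside),
        "q1": q1,
        "median": med,
        "q3": q3,
        "whisker_high": max(inside),
        "outliers": outliers,
    }

# Hypothetical travel times in minutes (not from the slides).
times = [21, 23, 24, 25, 25, 26, 27, 28, 30, 48]
summary = boxplot_summary(times)
print(summary)
```

With this data the value 48 is flagged as a potential outlier and the upper whisker stops at 30. The summary holds only a handful of numbers, which is why a boxplot hides the detailed shape of the distribution.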
Histogram
Suitable for:
• Continuous data
• Grouped data
Caution:
• If bars are too narrow, low frequencies hide the shape
• If bars are too wide, the shape is oversimplified
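The effect of bar width can be shown by grouping the same data three ways. This is a small sketch using made-up test marks; `bin_counts` is a hypothetical helper that counts how many values fall in each bar.

```python
from collections import Counter

def bin_counts(sample, width, start=0):
    """Count values per bar, where each bar covers [left edge, left edge + width)."""
    counts = Counter((x - start) // width for x in sample)
    return {start + k * width: counts[k] for k in sorted(counts)}

# Hypothetical test marks (not from the slides).
marks = [41, 43, 47, 52, 55, 56, 58, 61, 63, 64, 66, 68, 72, 75, 83]

print("width 2: ", bin_counts(marks, 2))   # every bar has height 1, so no shape is visible
print("width 10:", bin_counts(marks, 10))  # rises to a peak in the 60s, then falls
print("width 50:", bin_counts(marks, 50))  # two bars only, shape oversimplified
```

Each key in the result is the left edge of a bar. Only the middle choice of width shows the rise and fall of the distribution.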
Dot Plots
Suitable for:
• Discrete data
• Continuous data that has been rounded, so that values repeat
• Displaying individual counts clearly
Caution - Not useful when the frequencies of individual values are low, or when data is continuous and unrounded. In this case, grouping the data in a histogram is better.
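A dot plot is simple enough to draw as text, which makes it clear why it suits discrete data with repeated values. The sketch below assumes whole-number data; the sample and the function name `dot_plot` are invented for illustration.

```python
from collections import Counter

def dot_plot(sample):
    """Build a sideways text dot plot: one row per whole-number value, one 'o' per observation."""
    counts = Counter(sample)
    rows = []
    for value in range(min(sample), max(sample) + 1):
        rows.append(f"{value:>3} | " + "o" * counts[value])
    return "\n".join(rows)

# Hypothetical number of siblings per student (not from the slides).
siblings = [0, 1, 1, 2, 2, 2, 3, 3, 5]
print(dot_plot(siblings))
```

Every observation appears as its own dot, so individual counts are easy to read, and the empty row at 4 shows up as a gap. If nearly every value occurred only once, each row would hold a single dot and the plot would reveal little, which is the situation where a histogram works better.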
Go to Page 141 in Texts
• Complete Questions 1, 5, 6 and 7.
• Homework – what is standard deviation?