
# Multivariate Statistics

Multivariate Statistics. Harry R. Erwin, PhD, School of Computing and Technology, University of Sunderland. Resources: Everitt, BS, and G Dunn (2001) Applied Multivariate Data Analysis, London: Arnold.


### Presentation Transcript

1. Multivariate Statistics
   - Harry R. Erwin, PhD
   - School of Computing and Technology, University of Sunderland

2. Resources
   - Everitt, BS, and G Dunn (2001) Applied Multivariate Data Analysis, London: Arnold.
   - Everitt, BS (2005) An R and S-PLUS® Companion to Multivariate Analysis, London: Springer.

3. Introduction
   - Most statistical data sets are multivariate.
   - Sometimes it’s useful to study a variable in isolation, but usually you need to examine all the variables to understand the data.
   - The next few lectures are the core of this module.
   - We will examine the description, exploration, and analysis of multivariate data.

4. Multivariate Data
   - The natural form of multivariate data is a table or data frame.
   - Kinds of data
     - Unordered categorical variables (nominal data)
     - Ordinal data (numbered but not measured)
     - Interval data (measured data)
     - Ratio data (numerical with a defined ‘zero’)
   - Missing values (common)

5. Handling Missing Data
   - Ignore it.
     - Often biased.
   - Fill in plausible values.
     - Known as imputation
     - Advanced topic
   - Be aware this is a problem area.
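The two options on this slide can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the table, the column meanings, and the helper names `complete_cases` and `impute_column_means` are hypothetical, and column-mean imputation stands in for "plausible values" as the simplest possible choice, not as the method the lecture recommends.

```python
from statistics import mean

# Hypothetical table: each row is (height_cm, weight_kg); None marks a missing value.
rows = [
    (170.0, 65.0),
    (160.0, None),
    (180.0, 80.0),
    (None, 70.0),
    (175.0, 75.0),
]

def complete_cases(table):
    """'Ignore it': keep only the rows that have no missing values."""
    return [row for row in table if all(v is not None for v in row)]

def impute_column_means(table):
    """'Fill in plausible values': replace each missing value with the
    mean of the observed values in the same column."""
    columns = list(zip(*table))
    col_means = [mean(v for v in col if v is not None) for col in columns]
    return [
        tuple(col_means[j] if v is None else v for j, v in enumerate(row))
        for row in table
    ]

kept = complete_cases(rows)          # three complete rows survive
filled = impute_column_means(rows)   # all five rows, gaps filled with column means
```

Dropping rows discards two of the five observations here, while mean imputation keeps every row but pulls the filled-in values toward the centre of each column, which tends to understate the variance. Both effects are reasons the slide flags this as a problem area.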

6. Summary Statistics
   - Means
     - Generated by `mean`
   - Variances
     - Generated by `var`
   - Covariances
     - Also generated by `var`
   - Correlation coefficients
     - Generated by `cor`
   - Distances
     - Generated by `dist`
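The function names on this slide (`mean`, `var`, `cor`, `dist`) match R's built-in functions. To show what each quantity is, here is a from-scratch sketch in plain Python. It assumes sample statistics with divisor n - 1 and Euclidean distances between rows; the data values and the helper names are hypothetical.

```python
from math import sqrt

# Hypothetical data: five observations (rows) on three variables (columns).
data = [
    [1.0, 2.0, 10.0],
    [2.0, 4.0, 8.0],
    [3.0, 6.0, 6.0],
    [4.0, 8.0, 4.0],
    [5.0, 10.0, 2.0],
]

def column_means(rows):
    """Mean of each variable (column)."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def covariance_matrix(rows):
    """Sample covariances (divisor n - 1); the diagonal holds the variances."""
    n = len(rows)
    means = column_means(rows)
    p = len(means)
    return [
        [
            sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
            for j in range(p)
        ]
        for i in range(p)
    ]

def correlation_matrix(rows):
    """Covariances rescaled by the two standard deviations."""
    cov = covariance_matrix(rows)
    p = len(cov)
    return [
        [cov[i][j] / sqrt(cov[i][i] * cov[j][j]) for j in range(p)]
        for i in range(p)
    ]

def distance_matrix(rows):
    """Euclidean distance between every pair of observations (rows)."""
    return [
        [sqrt(sum((a - b) ** 2 for a, b in zip(r, s))) for s in rows]
        for r in rows
    ]

means = column_means(data)
cov = covariance_matrix(data)
cor = correlation_matrix(data)
dists = distance_matrix(data)
```

Variances and covariances come out of the same matrix, which is why the slide lists a single function for both. Note that means, covariances, and correlations describe the variables (columns), whereas distances compare the observations (rows).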

7. Aims
   - Data exploration (data mining)
     - Looking for non-random patterns and structures
     - Visual and graphical displays
   - Confirmatory analysis (later in the module)
     - Statistical testing

8. Looking at Multivariate Data
   - Scatterplots (demonstration)
   - “The convex hull of bivariate data” (demonstration)
   - Chi-plot (demonstration)
   - Bivariate boxplot (demonstration)
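The demonstrations themselves are not part of the transcript. As a rough illustration of the convex hull item, the sketch below computes the hull of a set of bivariate points with Andrew's monotone chain algorithm, then separates the hull points from the interior ones. One common use of this split is to drop the hull points before recomputing a correlation, so that extreme observations have less influence; that this is what the demonstration covered is an assumption. The point set and function names are hypothetical.

```python
def cross(o, a, b):
    """Z-component of the cross product of vectors o->a and o->b.
    Positive means a counter-clockwise turn at a."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Convex hull in counter-clockwise order (Andrew's monotone chain).
    Collinear points on the boundary are not included."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other chain.
    return lower[:-1] + upper[:-1]

# Hypothetical bivariate data: four extreme points and three interior points.
points = [(0, 0), (4, 0), (4, 4), (0, 4), (2, 2), (1, 3), (3, 1)]

hull = convex_hull(points)
interior = [p for p in points if p not in hull]
```

Plotting `hull` as a closed polygon over the scatterplot shows the outer boundary of the data, and `interior` is the trimmed data set that remains once the boundary points are set aside.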

9. More Multivariate Graphics
   - Bivariate densities (demonstration)
   - Other variables in a scatterplot (demonstration)
   - Scatterplot matrix (demonstration of `pairs`)
   - 3-D plots (demonstration)
   - Conditioning plots and trellis graphics (demonstration)
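A bivariate density plot needs a density estimate to draw. The sketch below shows one standard way to get it, a kernel density estimate with a product of two Gaussian kernels. The choice of estimator, the bandwidths `hx` and `hy`, the sample values, and the function names are all assumptions made for illustration; the transcript does not say which estimator the demonstration used.

```python
from math import exp, pi

def gaussian_kernel(u):
    """Standard normal density evaluated at u."""
    return exp(-0.5 * u * u) / (2.0 * pi) ** 0.5

def bivariate_density(points, x, y, hx, hy):
    """Product-kernel density estimate at (x, y):
    the average of K((x - xi)/hx) * K((y - yi)/hy) / (hx * hy) over the data."""
    n = len(points)
    total = sum(
        gaussian_kernel((x - xi) / hx) * gaussian_kernel((y - yi) / hy)
        for xi, yi in points
    )
    return total / (n * hx * hy)

# Hypothetical sample: a tight cluster near (1, 1) plus one distant point.
sample = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (3.0, 3.0)]

near = bivariate_density(sample, 1.0, 1.0, hx=0.5, hy=0.5)  # inside the cluster
far = bivariate_density(sample, 5.0, 5.0, hx=0.5, hy=0.5)   # away from all data
```

Evaluating `bivariate_density` over a grid of (x, y) values gives the surface that a contour or perspective plot displays. Larger bandwidths produce a smoother surface and smaller ones a spikier surface.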

10. Summary
    - Most statistical data are multivariate.
    - Most multivariate data have structure.
    - Detecting that structure is what data mining is all about.
    - Most data mining involves data visualisation and graphing, nothing more.
    - Most of your conclusions from data mining will be obvious, once you see them!
    - And you really don’t need to learn very much statistics to be good at multivariate data analysis.
