Statistics

Statistics: The Science of Data. Descriptive statistics summarises and displays information from a dataset. Inferential statistics uses sample data to make decisions or predictions about a larger population. Population: the entire collection of individuals or objects about which information is desired. Sample: a part (subset) of the population, selected in some prescribed manner.

Collecting data involves selecting a sampling method, designing an experiment or questionnaire/survey, and deciding who collects the data, what they know about the aims of the research, and where, when and how the data are collected.

Confounding factors! Claim: women don’t speed as much as men in their cars. What possible confounding factors can you think of? Are women statistically more likely to drive smaller cars? Do more men on the road simply mean more speeders?

Graphs Central Tendency

The Range: the smallest score subtracted from the largest. Example: number of friends of 11 Facebook users: 22, 40, 53, 57, 93, 98, 103, 108, 116, 121, 252. Range = 252 – 22 = 230. The range is heavily influenced by outliers.
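The range calculation above can be sketched in a few lines of Python. The function name `data_range` is an illustrative choice, not something from the slides. The second call shows how much a single outlier (252) affects the result.

```python
# Number of friends of the 11 Facebook users from the slide.
friends = [22, 40, 53, 57, 93, 98, 103, 108, 116, 121, 252]


def data_range(values):
    """Return the largest value minus the smallest value."""
    return max(values) - min(values)


print(data_range(friends))       # 252 - 22 = 230
print(data_range(friends[:-1]))  # without the outlier 252: 121 - 22 = 99
```

Dropping one value more than halves the range, which is why the range on its own is a fragile summary of spread.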

Quartiles: the three values that split the sorted data into four equal parts. Second quartile = median. Lower quartile = median of the lower half of the data. Upper quartile = median of the upper half of the data.
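A minimal sketch of the "median of each half" definition, applied to the Facebook friends data used for the range example. The slides do not say what to do with the middle value when the number of observations is odd; this sketch assumes it is left out of both halves. Other conventions exist and can give slightly different quartiles.

```python
from statistics import median

friends = [22, 40, 53, 57, 93, 98, 103, 108, 116, 121, 252]


def quartiles(values):
    """Return (lower quartile, median, upper quartile).

    Assumption: with an odd number of values, the middle value
    is excluded from both the lower and the upper half.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    upper = data[half + 1:] if n % 2 else data[half:]
    return median(lower), median(data), median(upper)


print(quartiles(friends))  # (53, 98, 116)
```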


Think of the variance as the average distance from the mean, except that we actually measure the average squared distance from the mean, for technical reasons. Call any data point in the set $x_i$ and the mean of the set $\bar{x}$. The squared distance of the point from the mean is $(x_i - \bar{x})^2$.

An example of an online probability calculator: http://davidmlane.com/hyperstat/z_table.html
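The kind of lookup a z-table calculator performs can also be sketched locally. This is an illustration using Python's standard-library `statistics.NormalDist`, not a description of how the linked calculator is implemented. The helper names `area_below` and `area_between` are hypothetical.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
std_normal = NormalDist()


def area_below(z):
    """Probability that a standard normal value falls below z."""
    return std_normal.cdf(z)


def area_between(z_low, z_high):
    """Probability that a standard normal value falls between two z-scores."""
    return std_normal.cdf(z_high) - std_normal.cdf(z_low)


print(round(area_below(0.0), 3))            # 0.5
print(round(area_between(-1.96, 1.96), 3))  # 0.95
```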

A Bimodal Distribution

One way to think about variability is in terms of the spread of data. Are all the values close to the mean or average, or are they more spread out across a wide range? We could look at the spread through percentiles, e.g. the average score of the top 10%. Or we could look at how far, on average, the cases differ from the average score. This is called the standard deviation.

However, for technical reasons, we use n-1 instead of n in the denominator (the bottom part of the fraction), which gives $s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$ as the definition of the sample variance.

To convert back into the same units as the dataset, we take the square root of the variance. The result is the standard deviation.
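As a rough sketch, the steps described in these slides (squared distances from the mean, division by n-1, then a square root) can be written out directly. The function names are illustrative. The final check compares against Python's built-in `statistics.variance`, which also divides by n-1.

```python
import math
import statistics

friends = [22, 40, 53, 57, 93, 98, 103, 108, 116, 121, 252]


def sample_variance(values):
    """Sum of squared distances from the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    squared_distances = [(x - mean) ** 2 for x in values]
    return sum(squared_distances) / (n - 1)


def sample_sd(values):
    """Square root of the sample variance, in the units of the data."""
    return math.sqrt(sample_variance(values))


# Small hand-checkable case: mean 3, squared distances 4 + 1 + 0 + 1 + 4 = 10.
print(sample_variance([1, 2, 3, 4, 5]))  # 10 / 4 = 2.5

# The Facebook data, cross-checked against the standard library.
print(sample_sd(friends))
print(math.isclose(sample_variance(friends), statistics.variance(friends)))  # True
```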

Types of Data Analysis

Theory: a hypothesized general principle or set of principles that explains known findings about a topic and from which new hypotheses can be generated. Hypothesis: a prediction from a theory, e.g. the proportion of people turning up for a Big Brother audition who have narcissistic personality disorder will be higher than the general level (1%) in the population. Falsification: the act of disproving a theory or hypothesis.

Cause and Effect (Hume, 1748): cause and effect must occur close together in time (contiguity); the cause must occur before the effect does; the effect should never occur without the presence of the cause. Confounding variables, the ‘Tertium Quid’: a variable (that we may or may not have measured) other than the predictor variables that potentially affects an outcome variable, e.g. the relationship between breast implants and suicide is confounded by self-esteem. Ruling out confounds (Mill, 1865): an effect should be present when the cause is present, and when the cause is absent the effect should be absent also. Control conditions: the cause is absent.

Between-group/between-subject/independent design: different entities take part in each experimental condition. Repeated-measures (within-subject) design: the same entities take part in all experimental conditions. This is economical, but subject to practice effects and fatigue.

Systematic variation: differences in performance created by a specific experimental manipulation. Unsystematic variation: differences in performance created by unknown factors, such as age, gender, IQ, time of day, measurement error etc. Randomization minimizes unsystematic variation.
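One common way to randomize is to assign participants to conditions by shuffling. The sketch below is a minimal illustration under that assumption. The function name `randomly_assign`, the participant labels and the seed are all hypothetical.

```python
import random


def randomly_assign(participants, n_groups=2, seed=None):
    """Shuffle participants and deal them into n_groups conditions."""
    rng = random.Random(seed)  # a seed makes the assignment reproducible
    shuffled = list(participants)
    rng.shuffle(shuffled)
    # Deal the shuffled list out in turn, like cards, so group sizes
    # differ by at most one.
    return [shuffled[i::n_groups] for i in range(n_groups)]


participants = [f"P{i}" for i in range(1, 11)]  # hypothetical participant IDs
control, treatment = randomly_assign(participants, n_groups=2, seed=42)
print(control)
print(treatment)
```

Because chance alone decides who ends up in which condition, factors such as age or IQ should not differ systematically between the groups.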
