# Data Analysis

Data Analysis. Basic Problem. There is a population whose properties we are interested in and wish to quantify statistically: mean, standard deviation, distribution, etc. The Question – Given a sample, what was the random system that generated its statistics? Central Limit Theorem.

### Presentation Transcript

1. Data Analysis

2. Basic Problem • There is a population whose properties we are interested in and wish to quantify statistically: mean, standard deviation, distribution, etc. • The Question – Given a sample, what was the random system that generated its statistics?

3. Central Limit Theorem • If one takes random samples of size $n$ from a population of mean $\mu$ and standard deviation $\sigma$, then as $n$ gets large, the distribution of the sample mean $\bar{x}$ approaches the normal distribution with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$ • $\sigma$ is generally unknown and is often replaced by the sample standard deviation $s$, resulting in $s/\sqrt{n}$, which is termed the Standard Error of the sample.
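
The statement above can be checked numerically. The following is a minimal sketch, not part of the original slides: it assumes a uniform(0, 1) population (chosen only for illustration) and uses the hypothetical helper names `standard_error` and `simulate_sample_means`. It compares the spread of many sample means with the predicted value $\sigma/\sqrt{n}$.

```python
import math
import random
import statistics


def standard_error(sample):
    """Standard error of the mean: s / sqrt(n), with s the sample standard deviation."""
    return statistics.stdev(sample) / math.sqrt(len(sample))


def simulate_sample_means(n, trials, seed=0):
    """Draw `trials` samples of size n from a uniform(0, 1) population; return their means."""
    rng = random.Random(seed)
    return [statistics.fmean(rng.random() for _ in range(n)) for _ in range(trials)]


means = simulate_sample_means(n=50, trials=2000)

# The population standard deviation of uniform(0, 1) is sqrt(1/12),
# so the sample mean should have standard deviation sqrt(1/12) / sqrt(n).
predicted = math.sqrt(1 / 12) / math.sqrt(50)
observed = statistics.stdev(means)
print(f"predicted sd of sample mean: {predicted:.4f}, observed: {observed:.4f}")
```

With a real data set, `standard_error(sample)` gives the $s/\sqrt{n}$ estimate directly, since $\sigma$ is not available.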

4. Example

5. Normal distribution

6. Critical Values for Confidence Levels
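
Critical values for the normal distribution can be computed rather than read from a table. A small sketch, assuming two-sided confidence levels and using only the Python standard library (`z_critical` is an illustrative name, not from the slides):

```python
from statistics import NormalDist


def z_critical(confidence):
    """Two-sided critical value z such that P(-z <= Z <= z) = confidence."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z = {z_critical(level):.3f}")
# 90% confidence: z = 1.645
# 95% confidence: z = 1.960
# 99% confidence: z = 2.576
```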

7. Student’s t-distribution

8. Critical Values for Confidence Levels: t-distribution

9. Confidence Interval for Mean (small sample size, t-distribution)
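
A sketch of this interval in standard notation, which may differ from the symbols on the original slide: $\bar{x}$ is the sample mean, $s$ the sample standard deviation, $n$ the sample size, and $t_{\alpha/2,\,n-1}$ the two-sided critical value of the t-distribution with $n-1$ degrees of freedom. It can be written in either of two equivalent forms:

$$
\bar{x} - t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}} \;\le\; \mu \;\le\; \bar{x} + t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}}
\qquad\text{or}\qquad
\mu = \bar{x} \pm t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}}
$$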

10. Comparing Population Means • Unequal Variance • Pooled Variance
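
In standard notation (assumed here: sample means $\bar{x}_1, \bar{x}_2$, sample standard deviations $s_1, s_2$, sample sizes $n_1, n_2$), the two cases are usually distinguished by how the standard error of the difference is computed:

$$
\text{Unequal variance:}\quad SE = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}
$$

$$
\text{Pooled variance:}\quad s_p^2 = \frac{(n_1-1)\,s_1^2 + (n_2-1)\,s_2^2}{n_1+n_2-2},
\qquad SE = s_p\sqrt{\frac{1}{n_1}+\frac{1}{n_2}}
$$

In both cases the interval for the difference takes the form $(\bar{x}_1 - \bar{x}_2) \pm t \cdot SE$, with $n_1+n_2-2$ degrees of freedom for the pooled case and the Welch-Satterthwaite approximation for the unequal-variance case.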

11. Hypothesis Testing (t-test) • Null Hypothesis – differences in two samples occurred purely by chance • t statistic = (estimated difference)/SE • Test returns a “p” value: the probability of observing a difference at least as large as the one measured if the null hypothesis were true (a small p value is evidence against the null hypothesis) • Samples may be either independent or paired
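
As a hedged illustration of "t statistic = (estimated difference)/SE" for independent samples, the sketch below computes the statistic and its degrees of freedom using only the standard library. The function name `t_statistic` and the sample values are hypothetical. Converting the result to a p value requires a t-table or a statistics library such as SciPy (`scipy.stats.ttest_ind`, with `equal_var=False` for the unequal-variance case).

```python
import math
import statistics


def t_statistic(a, b, pooled=False):
    """Return (t, degrees of freedom) for two independent samples a and b."""
    n1, n2 = len(a), len(b)
    v1, v2 = statistics.variance(a), statistics.variance(b)  # sample variances
    diff = statistics.fmean(a) - statistics.fmean(b)
    if pooled:
        # Pooled variance: assumes both populations share one variance.
        sp2 = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
        se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
        df = n1 + n2 - 2
    else:
        # Unequal variance (Welch) with Welch-Satterthwaite degrees of freedom.
        se = math.sqrt(v1 / n1 + v2 / n2)
        df = (v1 / n1 + v2 / n2) ** 2 / (
            (v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1)
        )
    return diff / se, df


# Hypothetical data, for illustration only.
site_a = [1, 2, 3, 4, 5]
site_b = [2, 4, 6, 8, 10]
print(t_statistic(site_a, site_b))               # unequal variance
print(t_statistic(site_a, site_b, pooled=True))  # pooled variance
```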

12. Tails • One-tailed test – hypothesis is that one sample differs from the other in a specified direction: less than, greater than, taller than, etc. • Two-tailed test – hypothesis is that one sample is different (either higher or lower) than the other

13. Paired Test • Samples are not independent • Generally a more sensitive test for differences, since other sources of variability are controlled by the pairing • Analysis is performed on the differences of the paired values • Equivalent to a confidence interval for the mean of the paired differences
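
The paired analysis can be sketched as a one-sample t statistic on the differences. The example below is illustrative only: `paired_t_statistic` is a hypothetical helper and the concentrations are made-up numbers. For a p value, a library routine such as SciPy's `scipy.stats.ttest_rel` could be used instead.

```python
import math
import statistics


def paired_t_statistic(first, second):
    """Return (t, degrees of freedom) for paired samples, using the pairwise differences."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    diffs = [x - y for x, y in zip(first, second)]
    n = len(diffs)
    mean_diff = statistics.fmean(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)  # standard error of the mean difference
    return mean_diff / se, n - 1


# Hypothetical paired measurements (e.g. influent and effluent from the same storm events).
influent = [10, 12, 9, 11]
effluent = [8, 9, 7, 10]
t, df = paired_t_statistic(influent, effluent)
print(f"t = {t:.3f}, df = {df}")
```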

14. Paired Samples – New Site

15. TSS Concentrations vs. Time

16. BMP Performance Comparison • Commonly expressed as a % reduction in concentration or load • Percent reduction is highly dependent on influent concentration • A concentration-based reduction potentially ignores reduction in volume (and hence load) • May lead to very large differences in pollutant reduction estimates • Preferable to compare discharge (effluent) concentrations
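
A short sketch of why percent reduction depends so strongly on influent concentration. The numbers are hypothetical and the function name `percent_reduction` is illustrative: the same effluent concentration yields very different percent reductions depending on what entered the BMP.

```python
def percent_reduction(influent, effluent):
    """Percent reduction in concentration between influent and effluent."""
    if influent <= 0:
        raise ValueError("influent concentration must be positive")
    return 100.0 * (influent - effluent) / influent


# Hypothetical case: two sites discharge the same effluent concentration (20 mg/L),
# but receive different influent concentrations.
for influent in (50.0, 200.0):
    print(f"influent {influent:.0f} mg/L -> {percent_reduction(influent, 20.0):.0f}% reduction")
# influent 50 mg/L -> 60% reduction
# influent 200 mg/L -> 90% reduction
```

Both hypothetical sites discharge identical water quality, which is the motivation for comparing effluent concentrations directly.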

17. Effect of TSS Influent Concentration

18. Sand Filter - TSS

19. Comparison of Effluent Quality

20. Exercise • Calculate average concentrations for each constituent for the two watersheds • Determine whether any concentrations are significantly different and report the p value for the null hypothesis • Calculate average effluent concentrations for the two BMPs, determine whether they are different from the influent concentrations, and report the p values • Compare effluent concentrations for the two BMPs and determine whether one BMP is better than the other for a particular constituent.
