
BASIC STATISTICAL CONCEPTS

The ocean is not “stationary”. “Stationary”: statistical properties remain constant in time. Data collected have signal and noise. Both signal and noise are assumed to have random behavior. Most basic descriptive parameter: sample mean.

haile


Presentation Transcript


  1. BASIC STATISTICAL CONCEPTS. The ocean is not “stationary”. “Stationary”: statistical properties remain constant in time. Data collected have signal and noise. Both signal and noise are assumed to have random behavior.

  2. Most basic descriptive parameter: Sample Mean, $\bar{y} = \frac{1}{N}\sum_{i=1}^{N} y_i$, taken over the duration of a time series (“time average”) or over an ensemble of measurements (“ensemble mean”). It is an unbiased estimate of the population mean μ. The population mean, μ, can be regarded as the expected outcome E(y) of an event y. If the measurement is executed many times, the average outcome would approach μ, i.e., it would be E(y) (e.g., the weight printed on a bag of chips).

  3. Sample Mean: locates the center of mass of the data distribution such that $\sum_{i=1}^{N} (y_i - \bar{y}) = 0$. Weighted Sample Mean: $\bar{y} = \sum_{i} f_i y_i$, where $f_i$ is the relative frequency of occurrence of the ith value.
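The two properties on this slide can be checked numerically. The following is a minimal sketch, assuming a small hypothetical data set (not from the presentation); the function names are illustrative only.

```python
from collections import Counter


def sample_mean(values):
    """Arithmetic mean: sum of the values divided by their count N."""
    return sum(values) / len(values)


def weighted_sample_mean(values):
    """Mean as a sum of distinct values weighted by relative frequency f_i."""
    n = len(values)
    counts = Counter(values)  # number of occurrences of each distinct value
    return sum((count / n) * value for value, count in counts.items())


# Hypothetical measurements, chosen only for illustration.
data = [2.0, 3.0, 3.0, 4.0, 8.0]

mean = sample_mean(data)
residual_sum = sum(y - mean for y in data)  # deviations about the mean
weighted = weighted_sample_mean(data)

print(mean, residual_sum, weighted)
```

The deviations about the mean sum to zero (up to rounding), and the frequency-weighted form gives the same mean as the plain average.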

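  4. Variance: describes spread about the mean, or sample variability. Sample variance: $s'^2 = \frac{1}{N}\sum_{i=1}^{N} (y_i - \bar{y})^2$. Sample standard deviation: $s' = \sqrt{s'^2}$, the typical difference from the mean. Computationally more efficient form (only one pass through the data): $s'^2 = \frac{1}{N}\left[\sum_{i=1}^{N} y_i^2 - \frac{1}{N}\left(\sum_{i=1}^{N} y_i\right)^2\right]$. Population variance (unbiased): $s^2 = \frac{1}{N-1}\sum_{i=1}^{N} (y_i - \bar{y})^2$. N needs to be > 1 to define variance and standard deviation. Only for N < 30 are s’ and s significantly different.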
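As a rough illustration of the one-pass versus two-pass computation, here is a minimal sketch assuming the usual 1/N and 1/(N-1) definitions of variance; the data set and function names are hypothetical.

```python
import math


def variance_two_pass(values, unbiased=True):
    """First pass computes the mean, second pass sums squared deviations."""
    n = len(values)
    if n < 2:
        raise ValueError("N needs to be > 1 to define the variance")
    mean = sum(values) / n
    sum_sq_dev = sum((y - mean) ** 2 for y in values)
    return sum_sq_dev / (n - 1) if unbiased else sum_sq_dev / n


def variance_one_pass(values, unbiased=True):
    """Single loop accumulating the sum and the sum of squares."""
    n = 0
    total = 0.0
    total_sq = 0.0
    for y in values:
        n += 1
        total += y
        total_sq += y * y
    if n < 2:
        raise ValueError("N needs to be > 1 to define the variance")
    sum_sq_dev = total_sq - total * total / n
    return sum_sq_dev / (n - 1) if unbiased else sum_sq_dev / n


# Hypothetical measurements, chosen only for illustration.
data = [2.0, 3.0, 3.0, 4.0, 8.0]

var_n = variance_two_pass(data, unbiased=False)    # divides by N
var_n_minus_1 = variance_two_pass(data)            # divides by N - 1
var_one_pass = variance_one_pass(data)             # same value, one loop
std_dev = math.sqrt(var_n_minus_1)

print(var_n, var_n_minus_1, var_one_pass, std_dev)
```

The one-pass form is convenient for streaming data, but it can lose precision when the mean is large compared with the spread, because it subtracts two large, nearly equal numbers.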

  5. The sample variance has one degree of freedom (d.o.f.) fewer than the population variance, because we estimate the population variance with the sample variance, which already uses the sample mean (one less independent measure). d.o.f.: ν = number of independent pieces of data being used to make a calculation. ν is a measure of how certain we are that our sample is representative of the entire population. The larger ν, the more certain we are that we have sampled the entire population.

  6. Other values of importance (example record, N = 1601). Range: from -0.61 to 0.66 (1.27). Median: equal number of values above and below (= -0.007). Mode: value occurring most often.

  7. Mode = -0.3. Two modes: bimodal.
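Range, median, and mode are all available from Python's standard library. The sketch below assumes a small hypothetical record (not the N = 1601 record from the slides) that happens to be bimodal.

```python
import statistics

# Hypothetical record, chosen only for illustration; two values tie as modes.
data = [-0.6, -0.3, -0.3, 0.1, 0.4, 0.4, 0.6]

data_range = max(data) - min(data)    # largest minus smallest value
median = statistics.median(data)      # equal number of values above and below
modes = statistics.multimode(data)    # all values occurring most often

print(data_range, median, modes)
```

`statistics.multimode` (Python 3.8 or later) returns every most-frequent value, so a bimodal record yields a list with two entries.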

  8. Probability: provides procedures to infer the population distribution from the sample distribution and to determine how good the inference is. The probability of a particular event is the ratio of the number of occurrences of that event to the total number of occurrences of all possible events, e.g., P(a die showing ‘6’) = 1/6, with 0 ≤ P(x) ≤ 1. The probability of a continuous variable is defined by a PROBABILITY DENSITY FUNCTION (PDF).

  9. Probability is measured by the area underneath the PDF.

  10. Probability Density Function: Gauss, or Normal, or Bell. [Figure: normal PDF; 68.3% of the area lies within ±1σ of μ, 95.4% within ±2σ, and 99.7% within ±3σ.]

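  11. Probability Density Function: Gauss, or Normal, or Bell, expressed with the standardized normal variable z = (x - μ)/σ. [Figure: standard normal PDF; 68.3% of the area lies within z = ±1, 95.4% within ±2, and 99.7% within ±3.]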
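The 68.3%, 95.4%, and 99.7% figures are areas under the normal PDF, which can be verified with the error function. This is a minimal sketch; the function name `normal_coverage` is illustrative only.

```python
import math


def normal_coverage(k):
    """Probability that a normal variable lies within k standard deviations
    of its mean, i.e. the area under the PDF between -k and +k in z."""
    return math.erf(k / math.sqrt(2.0))


for k in (1, 2, 3):
    print(f"within ±{k} standard deviations: {100 * normal_coverage(k):.1f}%")
```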

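  12. Probability Density Function: Gamma. [Figure: gamma PDF curves with one parameter fixed at 1 and the other set to 1, 2, 3, and 4.]

  13. Probability Density Function: Gamma. [Figure: gamma PDF curves with one parameter fixed at 1 and the other set to 1, 2, 3, and 4.]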

  14. Probability Density Function: Chi Square, a special case of the gamma PDF (scale parameter 2, shape parameter ν/2). [Figure: χ² PDF curves for ν = 2, 4, 6, and 8, plotted against χ².]

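  15. CONFIDENCE INTERVALS. Confidence interval for μ with σ known. For N > 30 (large enough sample) the 100(1 - α)% confidence interval is: $\bar{y} - z_{\alpha/2}\,\sigma/\sqrt{N} < \mu < \bar{y} + z_{\alpha/2}\,\sigma/\sqrt{N}$, where z is the standardized normal variable. [Figure: standard normal PDF with central area 1 - α and an area of α/2 in each tail.]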

  16. (1 - /2) = 0.975 http://statistics.laerd.com/statistical-guides/normal-distribution-calculations.php

  17. The 100(1 - α)% C.I. is: $\bar{y} - z_{\alpha/2}\,\sigma/\sqrt{N} < \mu < \bar{y} + z_{\alpha/2}\,\sigma/\sqrt{N}$. If α = 0.05, $z_{\alpha/2}$ = 1.96. Suppose we have a CT sensor at the outlet of a spring into the ocean. We obtain a burst sample of 50 measurements, once per second, with a sample mean of 26.5 °C and a standard deviation of 1.2 °C for the burst. What is the range of possible values, at the 95% confidence level, for the population mean?
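The CT sensor question can be worked through numerically. The sketch below assumes, as is common for N > 30, that the burst standard deviation stands in for σ; the function name is illustrative only.

```python
import math
from statistics import NormalDist


def z_confidence_interval(mean, stdev, n, alpha=0.05):
    """100(1 - alpha)% interval for the population mean, normal approximation.

    Assumption: stdev is used in place of the population sigma (large N).
    """
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # about 1.96 for alpha = 0.05
    half_width = z * stdev / math.sqrt(n)
    return mean - half_width, mean + half_width


# Values from the slide: 50 measurements, mean 26.5 °C, stdev 1.2 °C.
lower, upper = z_confidence_interval(26.5, 1.2, 50)
print(f"{lower:.2f} °C < mu < {upper:.2f} °C")
```

With these inputs the interval is roughly 26.17 °C to 26.83 °C.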

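  18. CONFIDENCE INTERVALS. Confidence interval for μ with σ unknown. For N < 30 (small samples) the 100(1 - α)% confidence interval is: $\bar{y} - t_{\alpha/2,\nu}\,s/\sqrt{N} < \mu < \bar{y} + t_{\alpha/2,\nu}\,s/\sqrt{N}$, where t follows Student’s t-distribution with ν = (N-1) degrees of freedom. [Figure: t-distribution PDF with central area 1 - α and an area of α/2 in each tail.]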

  19. /2 = 0.025 d.o.f.= 19

  20. The 100(1 - α)% C.I. is: $\bar{y} - t_{\alpha/2,\nu}\,s/\sqrt{N} < \mu < \bar{y} + t_{\alpha/2,\nu}\,s/\sqrt{N}$. Suppose we do 20 CTD profiles at one station in St Augustine Inlet. We obtain a mean at the surface of 16.5 °C and a standard deviation of 0.7 °C. What is the range of possible values, at the 95% confidence level, for the population mean? If α = 0.05, $t_{0.025,19}$ = 2.093.
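The CTD example can be evaluated the same way. This is a minimal sketch that takes the critical value t = 2.093 from the slide as an input rather than computing it, since the Python standard library has no t-distribution; the function name is illustrative only.

```python
import math


def t_confidence_interval(mean, stdev, n, t_critical):
    """Interval for the population mean using a tabulated Student's t value
    for nu = n - 1 degrees of freedom (supplied by the caller)."""
    half_width = t_critical * stdev / math.sqrt(n)
    return mean - half_width, mean + half_width


# Values from the slide: 20 profiles, mean 16.5 °C, stdev 0.7 °C,
# t(0.025, 19) = 2.093.
lower, upper = t_confidence_interval(16.5, 0.7, 20, t_critical=2.093)
print(f"{lower:.2f} °C < mu < {upper:.2f} °C")
```

With these inputs the interval is roughly 16.17 °C to 16.83 °C.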

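  21. CONFIDENCE INTERVALS. Confidence interval for σ², used to determine the reliability of spectral peaks. We need to know the C.I. for σ² on the basis of s²: $\frac{\nu s^2}{\chi^2_{\alpha/2,\nu}} < \sigma^2 < \frac{\nu s^2}{\chi^2_{1-\alpha/2,\nu}}$, with ν = (N-1) degrees of freedom. [Figure: χ² PDF with central area 1 - α and an area of α/2 in each tail.]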

  22. The 100(1 - α)% C.I. is: $\frac{\nu s^2}{\chi^2_{\alpha/2,\nu}} < \sigma^2 < \frac{\nu s^2}{\chi^2_{1-\alpha/2,\nu}}$. Suppose that we have ν = 10 spectral estimates of a tidal record. The background variance near a distinct spectral peak is 0.3 m². What is the 95% C.I. for the variance? How large would the peak have to be to stand out, statistically, from the background level? α/2 = 0.025; 1 - α/2 = 0.975. Look at the Chi square table.

  23. The background variance lies in this range (about 0.15 m² to 0.92 m² for ν = 10 and s² = 0.3 m²). The spectral peak has to be greater than 0.92 m² to distinguish it from background levels. [Chi Square Table]
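The variance interval for the spectral example can be reproduced as follows. This is a minimal sketch; the two chi-square critical values for ν = 10 (20.48 and 3.25) are not given in the transcript and are assumed here from a standard chi-square table. The function name is illustrative only.

```python
def variance_confidence_interval(variance, dof, chi2_upper, chi2_lower):
    """Interval for the population variance from tabulated chi-square values.

    chi2_upper: critical value with area alpha/2 to its right.
    chi2_lower: critical value with area 1 - alpha/2 to its right.
    """
    return dof * variance / chi2_upper, dof * variance / chi2_lower


# Slide values: nu = 10, background variance 0.3 m^2.
# Assumed table values for nu = 10: 20.48 (alpha/2 = 0.025), 3.25 (0.975).
lower, upper = variance_confidence_interval(0.3, 10, chi2_upper=20.48, chi2_lower=3.25)
print(f"{lower:.2f} m^2 < sigma^2 < {upper:.2f} m^2")
```

The upper bound, about 0.92 m², is the level a spectral peak must exceed to stand out from the background.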
