BIOL 2608 Biometrics

BIOL 2608 Biometrics: Descriptive Statistics and Measurement Theory. Biologists routinely take measurements in lab or field work by assigning numbers or groups (classes) to what they observe, and then apply mathematical operations to the data, e.g. predicting fish mass from length through an established regression.

tom

Presentation Transcript


  1. BIOL 2608 Biometrics Descriptive Statistics

  2. Measurement Theory • Biologists routinely take measurements in lab or field work by assigning numbers or groups (classes) • Mathematical operations are then applied to the data • e.g. predicting fish mass from length through an established regression • Different levels of measurement: • nominal, ordinal, interval and ratio

  3. [Figure: three example measurement scales, with tick labels 1, 2, 3; 0, 1, 10, 100, 1000; and 0 to 4.0 in steps of 0.5]

  4. Nominal • From the Latin word for "name" • Classificatory level: numbers or other symbols are used to classify an object or a character into one of a set of categories • Genetic phenotypes (e.g. eye colour) • Taxonomic categories (e.g. molluscs or insects) • Gross relative scale (e.g. long vs. short) • The weakest level of measurement because the counts (or frequencies) obtained convey limited information • Descriptive statistics: mode (the most frequent class)

  5. Ordinal • From the Latin word for "order" • Provides more information than nominal measurements because ordinal measurements can be ranked • very slow to very fast (shorter/ darker/ more active etc.) • exam results (A, B, C, D) • cell size (1 to 5); colours of egg yolk • Statistics: mode, median (the middle value, used as a measure of "central tendency"), inter-quartile range (a measure of dispersion) Picture source: http://www.goldeneggs.com.au/farm_to_family/haugh.html

  6. Interval Scale • A scale is involved, and the distance between any 2 numbers on the scale is a known or measurable quantity • Time, temperature, salinity, length, height, weight, pH, concentrations • For temperature in Celsius and Fahrenheit: • Difference between 20 °C (68 °F) and 25 °C (77 °F) = difference between 5 °C (41 °F) and 10 °C (50 °F) = 5 °C or 9 °F • But 40 °C (104 °F) is not twice as hot as 20 °C (68 °F), i.e. the zero point is arbitrary • Kelvin (K) has a physically meaningful zero and a constant interval size, and is thus referred to as a ratio scale
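
The temperature example can be checked numerically. The following is a minimal Python sketch, not part of the original slides; the helper names `c_to_f` and `c_to_k` are illustrative. It shows that equal intervals stay equal on both Celsius and Fahrenheit, while the ratio of two temperatures changes with the choice of zero point.

```python
def c_to_f(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32


def c_to_k(celsius):
    """Convert degrees Celsius to kelvin."""
    return celsius + 273.15


# Interval property: a 5 degree C step is a 9 degree F step wherever it starts.
diff_f_warm = c_to_f(25) - c_to_f(20)
diff_f_cool = c_to_f(10) - c_to_f(5)

# Ratios depend on the zero point, so Celsius and Fahrenheit disagree.
ratio_c = 40 / 20
ratio_f = c_to_f(40) / c_to_f(20)
# Kelvin has a true zero, so this ratio is physically meaningful.
ratio_k = c_to_k(40) / c_to_k(20)

print(f"Fahrenheit differences: {diff_f_warm} and {diff_f_cool}")
print(f"40/20 ratio in C: {ratio_c:.3f}, in F: {ratio_f:.3f}, in K: {ratio_k:.3f}")
```

Only the kelvin ratio says how much "hotter" 40 °C is than 20 °C in a physical sense, and it is far from 2.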

  7. Ratio Scale • Ratio scale: measurement scales having a constant interval size and a true zero point • Can be subjected to arithmetic procedures such as ratios and percentages • e.g. weight: 100 kg increased to 110 kg is a 10% increase • e.g. temperature in kelvin (but not in °C or °F) • For both interval and ratio scales, the descriptive statistics are the mean (a measure of location) and the standard deviation (a measure of dispersion)

  8. Discontinuous Measurements • Whole-number scales • Counts of objects, attributes or characters • e.g. no. of eggs laid per hen • e.g. no. of leaves on a plant Continuous Measurements • Any value can be taken from a continuous scale • It is rarely, if ever, possible to measure a continuous variable exactly; the last digit of a recorded value implies the precision achieved • e.g. 14.7 g implies the true mass lies between 14.65 and 14.75 g

  9. Measurements of Location: Mean • Mean = sum of values / n, i.e. $\bar{x} = \sum x_i / n$ • e.g. length of 8 fish larvae at day 3 after hatching: 0.6, 0.7, 1.2, 1.5, 1.7, 2.0, 2.2, 2.5 mm • mean length = (0.6 + 0.7 + 1.2 + 1.5 + 1.7 + 2.0 + 2.2 + 2.5)/8 = 1.55 mm • [Figure: the mean marked on a length axis from 0 to 4.0 mm]

  10. Median, Percentiles and Quartiles • Order (rank position) of the median = (n+1)/2 • If n is odd, this is a whole number and the median is the value at that position • If n is even, it falls halfway between two positions and the median is the average of the two adjacent values • e.g. 0.6, 0.7, 1.2, 1.5, 1.7, 2.0, 2.2, 2.5 mm, with orders 1 to 8 • order = (8+1)/2 = 4.5 • Median = 50th percentile = (1.5 + 1.7)/2 = 1.6 mm • order for Q1 = 25th percentile = (8+1)/4 = 2.25 • then Q1 = 0.7 + (1.2 - 0.7)/4 = 0.825 mm • [Figure: mean and median marked on a length axis from 0 to 4.0 mm]
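
The rank-and-interpolate procedure used for the median and Q1 can be sketched in Python as follows. This is an illustration under the assumption that every percentile is located at rank p(n+1) and interpolated linearly between neighbouring values, as in the slide's worked example; other software may use different quantile conventions. The function names are hypothetical.

```python
def value_at_rank(ordered, rank):
    """Return the value at a 1-based, possibly fractional, rank by linear interpolation."""
    lower = int(rank)           # whole part of the rank
    fraction = rank - lower     # how far to move towards the next value
    if lower < 1:
        return ordered[0]
    if lower >= len(ordered):
        return ordered[-1]
    low_value = ordered[lower - 1]
    high_value = ordered[lower]
    return low_value + fraction * (high_value - low_value)


def quantile(values, p):
    """Quantile at proportion p (0 < p < 1), located at rank p * (n + 1)."""
    ordered = sorted(values)
    return value_at_rank(ordered, p * (len(ordered) + 1))


lengths_mm = [0.6, 0.7, 1.2, 1.5, 1.7, 2.0, 2.2, 2.5]
median = quantile(lengths_mm, 0.50)   # rank 4.5, between 1.5 and 1.7
q1 = quantile(lengths_mm, 0.25)       # rank 2.25, a quarter of the way from 0.7 to 1.2

print(f"median = {median:.3f} mm, Q1 = {q1:.3f} mm")
```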

  11. • The median is often reported together with the mean • The mean is used much more frequently • However, the median is a better measure of central tendency for data with a skewed distribution or outliers • [Figure: two length axes from 0 to 4.0 mm, each with the mean and median marked]

  12. Other measures of central tendency • Range midpoint (midrange) = (max value + min value)/2 • not a good estimate of the mean and seldom used • Geometric mean = $\sqrt[n]{x_1 x_2 x_3 \cdots x_n}$ = $10^{\text{mean of } \log_{10}(x_i)}$ • Only for positive ratio-scale data • If the data are not all equal, geometric mean < arithmetic mean • Used in averaging ratios where it is desired to give each ratio equal weight
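
As a rough check on these definitions, the sketch below computes the midrange and the geometric mean (both as an nth root and via base-10 logarithms) in Python. It reuses the fish larvae lengths from the earlier slides purely as illustrative input; the variable names are assumptions, not part of the original deck.

```python
import math

lengths_mm = [0.6, 0.7, 1.2, 1.5, 1.7, 2.0, 2.2, 2.5]
n = len(lengths_mm)

arithmetic_mean = sum(lengths_mm) / n

# Midrange: halfway between the smallest and largest value.
midrange = (max(lengths_mm) + min(lengths_mm)) / 2

# Geometric mean as the nth root of the product of the values.
gm_root = math.prod(lengths_mm) ** (1 / n)

# Same quantity via logarithms: 10 raised to the mean of log10(x_i).
gm_log = 10 ** (sum(math.log10(x) for x in lengths_mm) / n)

print(f"arithmetic mean = {arithmetic_mean:.3f} mm")
print(f"midrange        = {midrange:.3f} mm")
print(f"geometric mean  = {gm_root:.3f} mm (root), {gm_log:.3f} mm (logs)")
```

The logarithmic form is usually the safer one for long data sets, because the raw product can overflow or underflow.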

  13. Measurements of dispersion • Range • e.g. length of 8 fish larvae at day 3 after hatching: 0.6, 0.7, 1.2, 1.5, 1.7, 2.0, 2.2, 2.5 mm • Range = 2.5 - 0.6 = 1.9 mm (or say from 0.6 to 2.5 mm) • Percentiles and quartiles

  14. Population Standard Deviation ($\sigma$) • Averaged measurement of deviation from the mean, $x_i - \bar{x}$ • e.g. five rainfall measurements, whose mean is 7 mm
      Rainfall $x_i$ (mm)   $x_i - \bar{x}$    $(x_i - \bar{x})^2$
      12                    12 - 7 = 5         25
      0                     0 - 7 = -7         49
      2                     2 - 7 = -5         25
      5                     5 - 7 = -2         4
      16                    16 - 7 = 9         81
      Sum                   0                  184
      • Population variance: $\sigma^2 = \sum (x_i - \bar{x})^2 / n = 184/5 = 36.8$ • Population SD: $\sigma = \sqrt{\sum (x_i - \bar{x})^2 / n} = \sqrt{36.8} = 6.1$
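
The rainfall calculation can be reproduced step by step. The sketch below is illustrative Python, not part of the slides; it computes the deviations, squares them, and cross-checks the result against `statistics.pstdev` from the standard library.

```python
import math
import statistics

rainfall_mm = [12, 0, 2, 5, 16]
n = len(rainfall_mm)

mean = sum(rainfall_mm) / n                         # 35 / 5
deviations = [x - mean for x in rainfall_mm]        # x_i minus the mean
squared_deviations = [d ** 2 for d in deviations]

# Population variance divides the sum of squared deviations by n.
population_variance = sum(squared_deviations) / n
population_sd = math.sqrt(population_variance)

print(f"sum of squared deviations = {sum(squared_deviations)}")
print(f"population variance = {population_variance:.1f}")
print(f"population SD = {population_sd:.1f} mm")
print(f"statistics.pstdev = {statistics.pstdev(rainfall_mm):.1f} mm")
```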

  15. Sample SD (s) • $s = \sqrt{\sum (x_i - \bar{x})^2 / (n - 1)}$ • $s = \sqrt{[\sum x_i^2 - (\sum x_i)^2 / n] / (n - 1)}$ • Two modifications: • dividing $\sum (x_i - \bar{x})^2$ by (n - 1) rather than n gives a better, less biased estimate of $\sigma$ (however, as n increases, the difference between s and $\sigma$ declines rapidly) • the sum of squared deviations can be calculated as $\sum x_i^2 - (\sum x_i)^2 / n$

  16. Sample SD (s) • e.g. five rainfall measurements, whose mean is 7 mm
      Rainfall $x_i$ (mm)   $x_i^2$
      12                    144
      0                     0
      2                     4
      5                     25
      16                    256
      Sum                   $\sum x_i = 35$, $\sum x_i^2 = 429$, $(\sum x_i)^2 = 1225$
      • $s^2 = [\sum x_i^2 - (\sum x_i)^2 / n] / (n - 1) = [429 - (1225/5)] / (5 - 1) = 46.0$ • $s = \sqrt{46.0} = 6.782$
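
To see that the shortcut formula and the definition give the same sample variance, here is a small Python sketch on the same rainfall data. It is an illustration only; the variable names are assumptions, and `statistics.stdev` is used as an independent check because it also divides by n - 1.

```python
import math
import statistics

rainfall_mm = [12, 0, 2, 5, 16]
n = len(rainfall_mm)

sum_x = sum(rainfall_mm)                            # sum of x_i
sum_x_squared = sum(x * x for x in rainfall_mm)     # sum of x_i squared

# Shortcut (computational) formula for the sample variance.
variance_shortcut = (sum_x_squared - sum_x ** 2 / n) / (n - 1)

# Definition: squared deviations from the mean, divided by n - 1.
mean = sum_x / n
variance_definition = sum((x - mean) ** 2 for x in rainfall_mm) / (n - 1)

sample_sd = math.sqrt(variance_shortcut)

print(f"sum x = {sum_x}, sum x^2 = {sum_x_squared}")
print(f"s^2 = {variance_shortcut:.1f} (shortcut), {variance_definition:.1f} (definition)")
print(f"s = {sample_sd:.3f} mm, statistics.stdev = {statistics.stdev(rainfall_mm):.3f} mm")
```

The shortcut form suits hand calculators, but it can lose precision when the values are large relative to their spread, so the deviation-based form is generally preferable in software.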

  17. Practical (5 min.): • Calculate the median, mean and standard deviation for the life expectancy (days) of two species of insects in captivity • Species A: 34, 36, 37, 39, 40, 41, 42, 43, 79 days • Species B: 34, 36, 37, 39, 40, 41, 42, 43, 44, 45 days • Use your calculator with the equations as well as the calculator's statistical functions
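
Hand calculations for the practical can be checked with a short script. The sketch below is one possible approach using Python's standard `statistics` module; it assumes the sample standard deviation (dividing by n - 1) is the one wanted, and the `summarize` helper is a hypothetical name. Comparing the two species also shows how the single long-lived individual in Species A pulls the mean away from the median.

```python
import statistics

species_a = [34, 36, 37, 39, 40, 41, 42, 43, 79]
species_b = [34, 36, 37, 39, 40, 41, 42, 43, 44, 45]


def summarize(days):
    """Return the median, mean and sample SD (n - 1 divisor) of a list of values."""
    return {
        "median": statistics.median(days),
        "mean": statistics.mean(days),
        "sd": statistics.stdev(days),
    }


summary_a = summarize(species_a)
summary_b = summarize(species_b)

for name, summary in (("A", summary_a), ("B", summary_b)):
    print(
        f"Species {name}: median = {summary['median']}, "
        f"mean = {summary['mean']:.2f}, s = {summary['sd']:.2f} days"
    )
```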

  18. Important Note • Measurements are taken at 4 levels: nominal, ordinal, interval and ratio • Nominal level measurements classify the characteristics of a sample • Ordinal level measurements allow observations to be ranked in order from lowest to highest • Interval level measurements are placed on a scale of values, where the scale represents an accurately measurable quantity • Ratio level measurements are interval level measurements with a true zero point, e.g. the Kelvin temperature scale

  19. Important Note • Measures of location provide a central value for observations and include the median and mean • Measures of dispersion describe the range of observations and include the absolute range, the interquartile range and the standard deviation(s) • The mean and the standard deviation are the most important descriptive statistics
