
PCB 3043L - General Ecology


trudy

Presentation Transcript


  1. PCB 3043L - General Ecology Data Analysis

  2. OUTLINE • Organizing an ecological study • Basic sampling terminology • Statistical analysis of data • Why use statistics? • Describing data • Measures of central tendency • Measures of spread • Normal distributions • Using Excel • Producing tables • Producing graphs • Analyzing data • Statistical tests • T-Tests • ANOVA • Regression

  3. Organizing an ecological study • What is the aim of the study? • What is the main question being asked? • What are your hypotheses? • Collect data • Summarize data in tables • Present data graphically • Statistically test your hypotheses • Analyze the statistical results • Present a conclusion to the proposed question

  4. Basic sampling terminology • Variables • Populations • Samples • Parameters • Statistics

  5. What is a variable? • Variable: any defined characteristic that varies from one biological entity to another. • Examples: plant height, bird weight, human eye color, no. of tree species • If an individual is selected randomly from a population, it may display a particular height, weight, etc. • If several individuals are selected, their characteristics may be very similar or very different.

  6. Types of variables • Nominal data (e.g. plant height: tall, short) • Ratio data (e.g. plant height: 16.3 cm, 2.3 cm) • Ordinal data (e.g. plant height: A = 1 to 10 cm, B = 10 to 20 cm, C = 20 to 30 cm)

  7. What is a population? • Population: the entire collection of measurements of a variable of interest. • Example: if we are interested in the heights of pine trees in Everglades National Park (plant height is our variable), then our population would consist of all the pine trees in Everglades National Park.

  8. What is a sample? • Sample: smaller groups or subsets of the population which are measured and used to estimate the distribution of the variable within the true population • Example: the heights of 100 pine trees in Everglades National Park may be used to estimate the heights of trees within the entire population (which actually consists of thousands of trees)

  9. What is a parameter? • Parameter: any calculated measure used to describe or characterize a population • Example: the average height of pine trees in Everglades National Park

  10. What is a statistic? • Statistic: an estimate of any population parameter • Example: the average height of a sample of 100 pine trees in Everglades National Park

  11. Why use statistics? • It is not always possible to obtain measures and calculate parameters of variables for the entire population of interest • Statistics allow us to estimate these values for the entire population based on multiple, random samples of the variable of interest • The larger the number of samples, the closer the estimated measure tends to be to the true population measure • Statistics also allow us to efficiently compare populations to determine differences among them • Statistics allow us to determine relationships between variables
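
The link between sample size and the quality of an estimate can be illustrated with a small simulation. This is only a sketch: the population of pine heights (5,000 trees, mean 15 m, standard deviation 3 m) is invented for illustration and does not come from the course data, and `sample_mean` is a hypothetical helper name.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: heights (m) of 5,000 pine trees, invented for illustration.
population = [random.gauss(15.0, 3.0) for _ in range(5000)]
parameter = statistics.mean(population)  # the population mean is a parameter


def sample_mean(n):
    """Mean height of n trees drawn at random without replacement (a statistic)."""
    return statistics.mean(random.sample(population, n))


# Larger samples tend to give estimates closer to the parameter;
# sampling the whole population reproduces the parameter itself.
for n in (10, 100, 1000, 5000):
    estimate = sample_mean(n)
    print(f"n={n:5d}  estimate={estimate:.2f}  error={abs(estimate - parameter):.3f}")
```

Any single small sample can land close to the parameter by chance, so the trend shows most clearly when the loop is repeated with several different seeds.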

  12. Statistical analysis of data Heights of pine trees at 2 sites in Everglades National Park • Measures of central tendency • Measures of dispersion and variability

  13. Measures of central tendency • Where is the center of the distribution? • mean ($\bar{x}$ or $\mu$): arithmetic mean, $\bar{x} = \frac{\sum x_i}{n}$ • median: the value in the middle of the ordered data set • mode: the most commonly occurring value • Example data set: 1, 2, 2, 2, 3, 5, 6, 7, 8, 9, 10 • Mean = (1 + 2 + 2 + 2 + 3 + 5 + 6 + 7 + 8 + 9 + 10)/11 = 55/11 = 5 • Median of 1, 2, 2, 2, 3, 5, 6, 7, 8, 9, 10 = 5 (the middle value); median of 1, 2, 2, 2, 3, 5, 6, 7, 8, 9, 10, 11 = (5 + 6)/2 = 5.5 (the average of the two middle values) • Mode of 1, 2, 2, 2, 3, 5, 6, 7, 8, 9, 10 = 2
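
As a cross-check on the worked example, the three measures can be computed with Python's standard `statistics` module. This is a minimal sketch using the slide's example data set; the variable names are illustrative only.

```python
import statistics

# Example data set from the slide (11 values).
data = [1, 2, 2, 2, 3, 5, 6, 7, 8, 9, 10]

mean = statistics.mean(data)      # sum of the values divided by how many there are
median = statistics.median(data)  # middle value of the ordered data
mode = statistics.mode(data)      # most frequently occurring value

# With an even number of values, the median is the average of the two middle values.
median_even = statistics.median(data + [11])

print(f"mean={mean}, median={median}, mode={mode}, median (12 values)={median_even}")
```

Note that the divisor for the mean is the number of values (11 here), not the largest value in the set.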

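  14. Measures of dispersion and variability • How widely is the data distributed? • range: largest value minus smallest value • variance ($s^2$ or $\sigma^2$): for a sample, $s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$ • standard deviation ($s$ or $\sigma$): $s = \sqrt{s^2}$ • [Figure: two distributions, one with large spread and one with small spread]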

  15. Measures of dispersion and variability • Example data set: 0, 1, 3, 3, 5, 5, 5, 7, 7, 9, 10 • Variance = 9.8 • Standard deviation = 3.13 • Range = 10 • Example data set: 0, 10, 30, 30, 50, 50, 50, 70, 70, 90, 100 • Variance = 980 • Standard deviation = 31.30 • Range = 100
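
The spread of the two example data sets can be checked with a short script. This sketch assumes the sample formulas (dividing by n - 1), which is what `statistics.variance` and `statistics.stdev` use; `describe_spread` is a hypothetical helper name.

```python
import statistics


def describe_spread(values):
    """Return (sample variance, sample standard deviation, range)."""
    variance = statistics.variance(values)  # sum of squared deviations / (n - 1)
    std_dev = statistics.stdev(values)      # square root of the sample variance
    value_range = max(values) - min(values)
    return variance, std_dev, value_range


set_a = [0, 1, 3, 3, 5, 5, 5, 7, 7, 9, 10]
set_b = [10 * x for x in set_a]  # same shape, ten times the scale

for name, values in (("A", set_a), ("B", set_b)):
    variance, std_dev, value_range = describe_spread(values)
    print(f"Set {name}: variance={variance:.2f}, sd={std_dev:.2f}, range={value_range}")
```

Multiplying every value by 10 multiplies the range and the standard deviation by 10 and the variance by 100, because the standard deviation is always the square root of the variance.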

  16. Normal distribution of data • A data set in which most values are around the mean, with fewer observations towards the extremes of the range of values • The distribution is symmetrical about the mean

  17. Proportions of a Normal Distribution • A normal population of 1000 body weights • μ = 70kg σ = 10kg • 500 weights are > 70kg • 500 weights are < 70 kg

  18. Proportions of a Normal Distribution • How many bears have a weight > 80kg? • μ = 70kg σ = 10kg X = 80kg • We use an equation to tell us how many standard deviations from the mean the X value is located: $Z = \frac{X - \mu}{\sigma} = \frac{80 - 70}{10} = 1$ • We then use a special table to tell us what proportion of a normal distribution lies beyond this Z value • This proportion is equal to the probability of drawing at random a measurement (X) greater than 80kg

  19. Z table • Look for Z value on table (1.0) • Find associated P value (0.1587) • P value states there is a 15.87% (0.1587 × 100) chance that a bear selected at random from the population of 1000 bears measured will have a weight greater than 80kg, so about 159 of the 1000 bears are expected to weigh more than 80kg
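
The table lookup can also be reproduced numerically. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later) in place of a printed Z table; for Z = 1.0 the area of a standard normal distribution beyond Z is about 0.1587. The variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma, x = 70.0, 10.0, 80.0  # values from the slides (kg)

z = (x - mu) / sigma                   # number of standard deviations above the mean
p_greater = 1.0 - NormalDist().cdf(z)  # proportion of the distribution beyond z
expected_count = p_greater * 1000      # expected number out of 1000 bears

print(f"Z = {z:.2f}, P(X > {x:g} kg) = {p_greater:.4f}, about {expected_count:.0f} of 1000 bears")
```

Because exactly half of a normal distribution lies above the mean, any X above the mean must give a proportion below 0.5, which is a quick sanity check on a table lookup.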

  20. Probability distribution tables • There are multiple probability tables for different types of statistical tests, e.g. Z-table, t-table, χ²-table • Each allows you to associate a “critical value” with a “P value” • This P value is used to determine the significance of statistical results
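
For the Z table specifically, the two-way association between a critical value and a P value can be sketched as below. This covers only the standard normal distribution; t and χ² tables need other tools that are not shown here. The helper names `upper_tail_p` and `critical_z` are hypothetical.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1


def upper_tail_p(z):
    """P value: proportion of the standard normal distribution beyond z."""
    return 1.0 - standard_normal.cdf(z)


def critical_z(p):
    """Critical value: the z beyond which a proportion p of the distribution lies."""
    return standard_normal.inv_cdf(1.0 - p)


print(f"P beyond Z = 1.96: {upper_tail_p(1.96):.4f}")
print(f"Critical Z for P = 0.05 (one tail): {critical_z(0.05):.3f}")
```

The two functions are inverses of each other, which is the same relationship a printed table encodes when it is read in either direction.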

  21. Using Excel • Program used to organize data • Produce tables • Perform calculations • Make graphs • Perform statistical tests

  22. Organizing data in tables • Allows you to arrange data in a format that is best for analysis • The following are the steps you would use:

  23. Performing calculations • Allows you to perform several calculations • Sum, Average, Variance, Standard deviation • Basic subtraction, addition, multiplication • More complex formulas

  24. Making graphs • Bar charts • Scatter plots
