## Chapter 8: Data Management

This chapter starts on page 366.

**Chapter 8: Get Ready**
• Before starting Chapter 8, we need to review these concepts:
  • Displaying data
  • Box-and-whisker plots
  • Measures of central tendency
  • Interpolating and extrapolating

**8.1: Scatter plots**
• Statistics Canada collects and organizes data to help Canadians better understand their country: its population, resources, economy, society and culture.

**Scatter plots**
• A scatter plot is a graph of ordered pairs of numeric data.
• A scatter plot is used to see relationships between 2 variables or quantities.

**The line of best fit**
• The line of best fit is the line that passes through or near as many points as possible on a scatter plot.

**An outlier**
• An outlier is a data point that does not fit the pattern of the other data.
• An outlier seems to be very different from most of the data in the scatter plot.

**Interpolating values**
• Interpolating data values from a graph means estimating values between two known pieces of data.

**Extrapolating data**
• Extrapolating data values from a graph means predicting values beyond the collected data.

**The independent variable**
• In a relation, the independent variable is the variable that determines the value of the dependent variable.
• For example, in a distance-time relation (speed), time is the independent variable because the distance travelled depends on the time elapsed.
• Usually, the independent variable is x.

**The dependent variable**
• In a relation, the dependent variable is the variable whose value is determined by the independent variable.
• For example, in a distance-time relation (speed), distance is the dependent variable because its value depends on the time.
• Usually, the dependent variable is y.

**There are 2 types of data:**
• Continuous data
• Discrete data

**Continuous data**
• Continuous data is a set of data where a variable can be any real number.
• When the data points are joined together as a line, this represents continuous data.
• Examples of continuous data are speed and temperature.

**Discrete data**
• Discrete data is a set of data where a variable can take only separate, countable values, typically whole numbers.
• When the data points are not joined together as a line, this represents discrete data.
• Examples of discrete data are the number of pages in a book or the number of students in a class.

**Correlations #1**
• To better understand and organize a data set, Statistics Canada creates scatter plots to determine whether there is a correlation between 2 variables.

**Correlations #2**
• A correlation is a measure of how closely the points on a scatter plot fit a line (i.e. the degree to which 2 quantities show a linear relationship).

**The correlation between 2 variables can be:**
• Strong
• Weak
• Positive
• Negative
• Non-existent

**Strong correlation**
• If most of the points are closely grouped around the line, then the correlation is strong.

**Weak correlation**
• If the points are spread out but show a general trend, then the correlation is weak.

**Positive correlation**
• A positive correlation means that the relationship between the variables is positive.
• As the independent variable increases, the dependent variable increases.
• The slope of a line showing positive correlation is positive (the line goes up as you move from left to right).

**Negative correlation**
• A negative correlation means that the relationship between the variables is negative.
• As the independent variable increases, the dependent variable decreases.
• The slope of a line showing negative correlation is negative (the line goes down as you move from left to right).

**Non-existent correlation**
• If the points are spread out and show no general trend, then the correlation is non-existent.

**A relationship**
• A relationship is a pattern between 2 sets of numbers.

**In Data Management, there are two types of Math relationships:**
• A linear relationship (it forms a straight line)
• A non-linear relationship (it does not form a straight line)

**8.2: Assess data and make predictions**
• To assess and analyze data, it is useful to display your data set as a scatter plot.
• Then, draw the line of best fit for the data by inspection (by eye).

**The goodness of fit of a line**
• After drawing the line of best fit, it is necessary to judge its goodness of fit.
• A correlation grid is a guide that indicates the goodness of fit of a line.

**8.3: Display data**

**Here are the 6 types of data displays: (Grade 9)**
• A scatter plot
• A histogram
• A circle graph
• A stem-and-leaf plot
• A box-and-whisker plot
• A bar graph

**A bar graph**
• A bar graph is a diagram that displays data visually with vertical or horizontal bars.
• Bar graphs are used to compare categories.

**A circle graph**
• A circle graph is a graph in which a circle representing the whole data set is divided into sections.
• Circle graphs are used to compare categories to each other and each category to the whole data set.

**A stem-and-leaf plot**
• A stem-and-leaf plot is a way of organizing numerical data by representing part of each number as a stem and the other part as a leaf.
• Stem-and-leaf plots organize data based on place value.

**A histogram**
• A histogram is a connected bar graph that shows data organized into intervals.
• Histograms organize data in intervals.

**A box-and-whisker plot**
• A box-and-whisker plot is a diagram that shows the median and range of a numeric data set.

**The use of box-and-whisker plots**
• A box-and-whisker plot shows how data is dispersed or spread around the median of a data set.

**The vocabulary of box-and-whisker plots**
• The box of the graph represents the middle 50% of the data, from the lower quartile to the upper quartile.
• The least and greatest data values are called the minimum and maximum, or the lower extreme and the upper extreme.
• The lower quartile is the median value of the lower half of the data.
• The upper quartile is the median value of the upper half of the data.

**How to choose the most appropriate way to display your data set**
• The most appropriate choice of data display depends on the type of data and the information you wish to convey.

**Hints for choosing the best way to display your data #1**
• Line graphs and scatter plots can be used to analyze trends.
• Histograms, box-and-whisker plots and stem-and-leaf plots can be used to analyze the range and spread of the data, check where data is clustered and find the measures of central tendency.

**Hints for choosing the best way to display your data #2**
• Bar graphs and circle graphs are used to compare categories.

**Measures of central tendency**
• A measure of central tendency is a value that represents the centre of a set of data.

**There are 3 types of measure of central tendency:**
• The mean
• The median
• The mode

**The mean**
• The mean is the sum of a set of values divided by the number of values in the set.
• Advantage of the mean: Information is given about the sum of the values.
• Disadvantage of the mean: Influenced by extreme data values.

**The median**
• The median is the middle value when a set of data is arranged in order from least to greatest.
• Advantage of the median: Not greatly influenced by extreme data values.
• Disadvantage of the median: No information is given about the sum of the values.

**The mode**
• The mode is the most common value in a set of data.
• Advantage of the mode: Easy to locate in frequency tables, graphs, bar graphs or histograms.
• Disadvantage of the mode: May change greatly with new data values.
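
The three measures of central tendency described above can be checked with a short calculation. The sketch below uses Python's standard `statistics` module. The list `scores` and the helper `central_tendency` are hypothetical, invented for illustration and not taken from the chapter. It also adds one extreme value to show why the mean is described as influenced by extreme data values while the median is not greatly influenced.

```python
from statistics import mean, median, multimode

# Hypothetical test scores, invented for illustration only.
scores = [62, 70, 70, 75, 83]


def central_tendency(values):
    """Return the mean, median and mode(s) of a list of numbers."""
    return {
        "mean": mean(values),        # sum of the values / number of values
        "median": median(values),    # middle value of the sorted data
        "modes": multimode(values),  # most common value(s); there may be several
    }


# Measures for the original data set.
summary = central_tendency(scores)

# Same data with one extreme value (an outlier) added.
with_extreme = central_tendency(scores + [300])

print("original:    ", summary)
print("with extreme:", with_extreme)
```

With these assumed values, adding the single extreme score moves the mean from 72 to 110, while the median moves only from 70 to 72.5 and the mode stays at 70.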