Trend analysis
Trend analysis Diane Stockton
Introduction • Why do we want to look at trends over time? • To see how things have changed • What is the information used for? • Needs assessment • Programme planning • Programme evaluation • Policy development • To set targets for improving services or outcomes • To monitor progress against targets • To make predictions about the future
Analysis of time trends can focus on: • The overall pattern of change in an indicator over time • Comparing one time period to another time period (interrupted time series data) • Comparing one geographic area to another • Comparing one population to another • Making future projections
Important issues • Sample size • Presence of extreme observations • Availability of numerator and denominator data • Confounding
Statistical procedures • Linear regression • Log-linear regression • Logistic regression • Poisson regression • Time series (seasonal or otherwise) • Specialised models
What is a linear trend? • “Linear” means that the function we are looking for is a straight line • A change of 1 unit in x (e.g. one year) results in a change of a constant amount (m) in y
When should we fit a regression line? • When looking back over the past to see how an observed indicator has changed over time • When the plotted values appear to be increasing or decreasing: • a) over the whole period • b) in a linear way, i.e. by the same value each year (e.g. a rate reducing by 100 per year) • If we want to calculate the rate of change, or assess whether the increase or decrease is significant • Although this is not a recommended method of forecasting future events, it may be used as a rough indicator of likely future changes
Is the change linear? [Figure: two example plots, one labelled “Yes” and one labelled “No”]
Fitting a linear regression line - Example y = 100 – 2x
Fitting a linear regression line • The ‘least squares’ method finds the straight line which fits the points most closely • Specifically, it finds the line which minimises the sum of the squared vertical distances between the points and the line • Excel’s LINEST function calculates the intercept and the gradient • The chi-squared test assesses the significance of the trend
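As a rough illustration of the least squares calculation, here is a minimal Python sketch. The function name `fit_line` and the data are hypothetical; the data are constructed to lie exactly on the example line y = 100 – 2x, so the fit should recover an intercept of 100 and a gradient of -2. It does not test the significance of the trend.

```python
def fit_line(xs, ys):
    """Return (intercept, gradient) of the least-squares straight line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and sum of cross-products of x and y
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    gradient = sxy / sxx
    intercept = mean_y - gradient * mean_x
    return intercept, gradient


# Hypothetical data: years since baseline, with a rate falling by 2 per year
years = [0, 1, 2, 3, 4, 5]
rates = [100 - 2 * x for x in years]

intercept, gradient = fit_line(years, rates)
print(f"intercept = {intercept:.1f}, gradient = {gradient:.1f}")
```

With real data the points will not lie exactly on the line, and the gradient is then the estimated average change per year.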
Fitting a linear regression line • BUT… • The data we most commonly deal with in public health are not usually linear
What about non-linear trends? • In public health we are most commonly dealing with counts, rates or proportions, so we routinely transform the data so that the trend is linear on the transformed scale and the fitted values stay within their valid range (e.g. no less than zero) • We can use a log-transformation for counts or rates, fitting an exponential curve which assumes a constant rate of change (e.g. a constant percentage change each year), rather than a constant numerical increase or decrease • We can use a logit-transformation for proportions (or percentages), which constrains the fitted values to be between 0 and 1 (or 0% and 100%) • Excel’s LOGEST function calculates the intercept and gradient of the fitted exponential curve
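The log-transformation approach can be sketched as follows: take logs of the rates, fit a straight line to the logged values, and convert the gradient back into an annual percentage change. The function names and the data are hypothetical; the data are constructed to fall by exactly 5% per year.

```python
import math


def fit_line(xs, ys):
    """Return (intercept, gradient) of the least-squares straight line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    gradient = sxy / sxx
    return mean_y - gradient * mean_x, gradient


def annual_percent_change(years, rates):
    """Fit a line to log(rate) and express the gradient as % change per year."""
    log_rates = [math.log(r) for r in rates]  # rates must be greater than zero
    _, gradient = fit_line(years, log_rates)
    # A gradient g on the log scale means the rate is multiplied by exp(g) each year
    return (math.exp(gradient) - 1) * 100


# Hypothetical data: a rate of 200 in 2000, falling by 5% each year
years = list(range(2000, 2010))
rates = [200 * 0.95 ** (year - 2000) for year in years]

apc = annual_percent_change(years, rates)
print(f"Estimated annual change: {apc:.1f}%")
```

The same pattern applies to proportions if `math.log(r)` is replaced by the logit, `math.log(p / (1 - p))`, although the gradient is then a change in log-odds and not a percentage change.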
When you need more complicated models • Fractional polynomials – fit a curvilinear line through a set of data points • Restricted cubic splines – similar to fractional polynomials, but you can choose the number of “knots”, and the line is forced to be linear in the tails
Assessing change in trend • Segmented regression – fits a linear regression line before and after the change point, and tests for a step change and/or a change in trend • Joinpoint regression – lets the model find where the trend changes; each change is called a joinpoint. Free software for joinpoint regression is available from http://surveillance.cancer.gov/joinpoint/download.html
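A simplified sketch of the segmented regression idea, assuming the change point is known in advance: fit separate least-squares lines before and after it, then compare them. The function names and data are hypothetical, and the sketch only estimates the step change and the change in trend; it does not test whether either is statistically significant, which a full segmented regression would do.

```python
def fit_line(xs, ys):
    """Return (intercept, gradient) of the least-squares straight line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    gradient = sxy / sxx
    return mean_y - gradient * mean_x, gradient


def segmented_fit(xs, ys, change_point):
    """Fit separate lines before and from the change point and compare them."""
    before = [(x, y) for x, y in zip(xs, ys) if x < change_point]
    after = [(x, y) for x, y in zip(xs, ys) if x >= change_point]
    b_int, b_grad = fit_line(*zip(*before))
    a_int, a_grad = fit_line(*zip(*after))
    # Step change: gap between the two fitted lines at the change point
    step = (a_int + a_grad * change_point) - (b_int + b_grad * change_point)
    return {
        "gradient_before": b_grad,
        "gradient_after": a_grad,
        "trend_change": a_grad - b_grad,
        "step_change": step,
    }


# Hypothetical data: falling by 1 per year, then a drop of 5 at year 5
# followed by a steeper fall of 3 per year
xs = list(range(10))
ys = [50 - x for x in range(5)] + [40 - 3 * (x - 5) for x in range(5, 10)]

result = segmented_fit(xs, ys, change_point=5)
print(result)
```

Each segment needs at least two distinct time points for a line to be fitted, and in practice several more for the estimates to be stable.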
Other considerations • Other independent variables to be included in regression models • Achieving stability by combining data (e.g. years or geographical areas) • Assessing goodness of fit
Projections / forecasting • How do we set and monitor progress against targets? • Plot historic data on a graph • We need to forecast or predict future data • In some circumstances we can extrapolate from a regression line or curve • There are much better forecasting methods • However… • Forecasts are usually wrong! • Accuracy erodes as we go further into the future • A good forecast is more than just a number • Includes an accuracy range
Population projections • Population projections for Scotland are available from GROS (http://www.gro-scotland.gov.uk/statistics/theme/population/projections/index.html), by NHS Board and Council area • The projections look forward 25 years, providing an estimate of the number of males and females in each five-year age group, assuming the continuation of established trends
Prevalence ratio method • If we are trying to predict the total number of people with a particular need, this will depend partly on the rate of incidence or prevalence of the issue in question, and partly on the changing size and shape of the population • If we already have a population projection we can estimate future numbers by multiplying the extrapolated rate (or the current rate, if assuming no change) by the population projection
Prevalence ratio method example • An RNIB study found that 20% of people aged 75 or over were registered blind or partially sighted • If we assume that this rate remains fixed, projections can be obtained by multiplying the population projection for the 75+ age-group by 20%, e.g.
                                                                        2010    2015    2020    2025
75+ population projection in area X                                    7,200   8,700  10,100  10,700
Rate of visual impairment in 75+ age-group (RNIB)                        20%     20%     20%     20%
Projected number with visual impairment in 75+ age-group in area X     1,440   1,740   2,020   2,140
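The calculation in this example is a single multiplication per year. A minimal Python sketch, using the 75+ population projections for area X and the fixed 20% rate from the example (the variable names are illustrative):

```python
# 75+ population projections for area X, from the example above
population_75_plus = {2010: 7200, 2015: 8700, 2020: 10100, 2025: 10700}

# Assumption: the RNIB rate of 20% stays fixed over the whole period
rate = 0.20

# Projected number = rate x projected population, rounded to whole people
projected = {
    year: round(population * rate)
    for year, population in population_75_plus.items()
}

for year, number in projected.items():
    print(f"{year}: {number:,}")
```

To use an extrapolated rate in place of a fixed one, `rate` could be replaced by a dictionary of year-specific rates.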
Prevalence ratio method - comments • Has modest data requirements: even if you only have a single estimate of the rate, you can make a simple projection on the assumption that the rate is fixed • Extrapolating trends is OK for a short time into the future as long as the historic data are stable • But: • The regression line is fitted across the whole of the historic data, and gives equal weight to all points: e.g. the value for last year is given the same weight as one from 20 years ago – it doesn’t give the best estimate of ‘current trends’ • We cannot give realistic confidence intervals for future values (‘prediction intervals’ or ‘forecast intervals’)
Age-period-cohort modelling • Used widely in cancer epidemiology where cohort effects are important • Estimates the underlying age-, period- and cohort-specific trends and models them into the future using a Poisson regression model • Programmes available: • “R” open source (free) software – NORDPRED programme available (http://www.kreftregisteret.no/en/Research/Projects/Nordpred/Nordpred-software/) • Stata – programmes available from http://www.encr.com.fr/stata-macros.htm
Other forecasting methods • There is a range of methods which are intended for forecasting, e.g. moving average (ARIMA) methods, autocorrelation methods, Box-Jenkins methods • These methods take into account fluctuations from year to year, trends (i.e. gradual changes over time) and seasonal variations • They tend to give greater weight to more recent values • They give confidence intervals for forecasts, which tend to get wider as we move further into the future • Examples include Holt’s method (which includes a trend component) and Holt-Winters (which adds a seasonal component)
Holt’s method • Holt’s exponential smoothing (aka double exponential smoothing) is a moving average method • Several statistical packages will do this: • ForecastPro – expensive but very easy to use • Stata – requires code • R – open source (free) software which requires code • Excel – complicated to do • If you use Stata, R or Excel, you need to put some effort into optimising the parameters, which requires some expertise and time
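To show what the method does, here is a minimal Python sketch of Holt’s double exponential smoothing. The function name, the starting values, the smoothing parameters (`alpha` for the level, `beta` for the trend) and the data are all illustrative assumptions. The parameters are not optimised, and the sketch does not produce prediction intervals, which a statistical package would.

```python
def holt_forecast(series, alpha, beta, horizon):
    """Forecast `horizon` steps ahead using Holt's linear trend method."""
    # Simple starting values: first observation, and first observed change
    level = series[0]
    trend = series[1] - series[0]

    for value in series[1:]:
        previous_level = level
        # Update the level: blend the new value with the previous forecast
        level = alpha * value + (1 - alpha) * (previous_level + trend)
        # Update the trend: blend the latest change in level with the old trend
        trend = beta * (level - previous_level) + (1 - beta) * trend

    # Forecasts continue the final level along the final trend
    return [level + step * trend for step in range(1, horizon + 1)]


# Hypothetical annual counts rising by exactly 2 per year
history = [10, 12, 14, 16, 18]

forecasts = holt_forecast(history, alpha=0.8, beta=0.2, horizon=3)
print([round(f, 1) for f in forecasts])
```

Because recent values enter the level and trend with more weight than older ones, the forecasts follow the current trend more closely than a regression line fitted across the whole history.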
Which method to use • APC models and Holt’s method appear to give similar results • Holt’s method has the advantage of producing prediction intervals
Further information diane.stockton@nhs.net 0131 275 6817