AAEC 4302 ADVANCED STATISTICAL METHODS IN AGRICULTURAL RESEARCH

Chapter 11: Sampling Theory in Regression Analysis

Statistical Inference
• Recall that the parameter estimates obtained by applying the OLS formulas are generally not equal to the true (population) model parameters.
• Therefore, the estimated value of a parameter could be positive while its true population value is zero or even negative.

Statistical Inference
• In applied research, it is important to know how close the estimated value of a given parameter is to its true, unknown population value.

The Normal Regression Model
• The basic model of simple linear regression states that, for a given set of values of X, the corresponding values of Y are determined by: Yi = β0 + β1Xi + ui
• Two parts enter the determination of Yi: a systematic portion, β0 + β1Xi, and a random portion, the disturbance ui
• ui is a random variable with a normal probability distribution, with E(ui) = 0 and σ(ui) = σu

The Normal Regression Model
• The statistical inference procedures explained in this chapter are only appropriate if:
• ui has a normal distribution with mean zero, E[ui] = 0, and
• ui has constant variance, σ²(ui) = σ²u

The Normal Regression Model
• Since Yi and ui differ only by a constant (β0 + β1Xi, for a given Xi), the normality of ui implies that the dependent variable also follows a normal probability distribution, with a mean that changes with Xi and a standard deviation of σ(Yi) = σu

The Normal Regression Model
• Example: Yi = 7 + 12Xi + ui, where ui is normally distributed with E(ui) = 0 and σ(ui) = 5
• If Xi = 5, what can you say about Yi?
• Yi is normally distributed with:
• E[Yi] = 7 + 12(5) = 67
• σ(Yi) = σ(ui) = 5
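
The example above can be checked numerically. The sketch below assumes the example model Yi = 7 + 12Xi + ui and uses `statistics.NormalDist` from the Python standard library; the variable names are illustrative and do not come from the slides.

```python
from statistics import NormalDist

# Values taken from the example; the names are illustrative.
beta0, beta1 = 7.0, 12.0   # population intercept and slope
sigma_u = 5.0              # standard deviation of the disturbance u_i
x_i = 5.0

# Y_i differs from u_i only by the constant beta0 + beta1 * x_i,
# so it is normal with a shifted mean and the same standard deviation.
mean_y = beta0 + beta1 * x_i
dist_y = NormalDist(mu=mean_y, sigma=sigma_u)

# Probability that Y_i lies within one standard deviation of its mean (62 to 72).
p_within_one_sd = dist_y.cdf(mean_y + sigma_u) - dist_y.cdf(mean_y - sigma_u)

print(f"E[Y_i] = {mean_y}, sd(Y_i) = {dist_y.stdev}")
print(f"P(62 <= Y_i <= 72) = {p_within_one_sd:.3f}")
```

Changing `x_i` shifts the mean of Yi but leaves its standard deviation at 5, which is the point of the slide.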

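The Normal Regression Model
[Figure: normal density P(Yi) for Yi ~ N[67, (5)²], centered at the mean E[Yi] = β0 + β1Xi = 67, with σ(Yi) = σ = 5; the points 62 and 72 mark one standard deviation below and above the mean.]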

The Normal Regression Model
[Figure: normal density P(ui) for ui ~ N[0, (5)²], centered at E[ui] = 0.]

The Normal Regression Model
• Two assumptions about the relations among the disturbances for different observations:
• The ui are independent: the value of the disturbance for one observation in no way affects the value that occurs for another.
• σ(ui) = σu, the same for all observations

The Normal Regression Model
• The two previous assumptions imply that the disturbance values for the different observations can be thought of as different values drawn from the same random variable U, which has a normal probability distribution with mean equal to 0 and standard deviation equal to σu
• ui ~ N[0, σ²u]

The Normal Regression Model
[Figure: population regression line E[Yi] = β0 + β1Xi with two sample observations: at X = 5, Y1 = 69 against E[Y1] = 67; at X = 7, Y2 = 89 against E[Y2] = 91.]

Sampling Distribution of OLS Formulas
• If several samples of the same size are taken, and the OLS formulas (i.e. estimators) are used to calculate the corresponding values (estimates) for a given model parameter βj, the estimates will likely all differ from each other.

Sampling Distribution of OLS Formulas
• Those estimated parameter values can be thought of as draws from the probability distribution of the OLS estimator for βj
• Under the previously stated assumptions about the probability distribution of the error term, the OLS estimator for βj is also normally distributed, with mean βj and variance denoted by σ²(β̂j).
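
A small simulation can illustrate the idea of a sampling distribution. This is a sketch under stated assumptions: the population values (β0 = 7, β1 = 12, σu = 5) come from the chapter's example, while the X values, the number of replications, and the helper `ols_slope` are hypothetical choices made for illustration. The comparison value uses the standard simple-regression result that the standard deviation of the slope estimator is σu divided by the square root of Σ(Xi − X̄)².

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

beta0, beta1, sigma_u = 7.0, 12.0, 5.0            # example population values
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]     # hypothetical fixed X values


def ols_slope(xs, ys):
    """OLS slope estimate for a simple linear regression of ys on xs."""
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


# Draw many samples of the same size and estimate the slope in each one.
slopes = []
for _ in range(5000):
    ys = [beta0 + beta1 * x + random.gauss(0.0, sigma_u) for x in xs]
    slopes.append(ols_slope(xs, ys))

x_bar = sum(xs) / len(xs)
total_variation = sum((x - x_bar) ** 2 for x in xs)
theoretical_sd = sigma_u / total_variation ** 0.5

print(f"mean of slope estimates:  {statistics.mean(slopes):.3f} (true slope {beta1})")
print(f"sd of slope estimates:    {statistics.stdev(slopes):.3f}")
print(f"theoretical sd of slope:  {theoretical_sd:.3f}")
```

The individual estimates differ from sample to sample, but their average should sit close to the true slope and their spread close to the theoretical value.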

Sampling Distribution of OLS Formulas
• In the simple linear regression model:

$$\hat{\beta}_0 \sim N\left(\beta_0,\ \frac{\sigma_u^2 \sum X_i^2}{n \sum (X_i - \bar{X})^2}\right)$$

$$\hat{\beta}_1 \sim N\left(\beta_1,\ \frac{\sigma_u^2}{\sum (X_i - \bar{X})^2}\right)$$

where ~ means "distributed", N means normal, the first element in parentheses is the mean or expected value of the estimator, and the second element is the formula for calculating the variance of the estimator.
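
As a rough illustration, the variance formulas for the simple linear regression model can be evaluated directly. The sketch assumes the standard results Var(β̂0) = σ²u ΣXi² / (n Σ(Xi − X̄)²) and Var(β̂1) = σ²u / Σ(Xi − X̄)²; the function name `ols_sampling_variances` and the X values are hypothetical.

```python
def ols_sampling_variances(xs, sigma_u):
    """Return (Var(b0_hat), Var(b1_hat)) for a simple linear regression.

    Assumes a known disturbance standard deviation sigma_u.
    """
    n = len(xs)
    x_bar = sum(xs) / n
    total_variation = sum((x - x_bar) ** 2 for x in xs)   # sum of (X_i - X_bar)^2
    sum_x_sq = sum(x * x for x in xs)                      # sum of X_i^2
    var_b0 = sigma_u ** 2 * sum_x_sq / (n * total_variation)
    var_b1 = sigma_u ** 2 / total_variation
    return var_b0, var_b1


xs = [3.0, 4.0, 5.0, 6.0, 7.0]   # hypothetical X values
var_b0, var_b1 = ols_sampling_variances(xs, sigma_u=5.0)

print(f"Var(b0_hat) = {var_b0:.3f}")
print(f"Var(b1_hat) = {var_b1:.3f}")
```

Both variances depend only on σu and on the X values, not on the Y values actually observed.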

Sampling Distribution of OLS Formulas
• The formulas for calculating the variances are different in the case of the multiple regression model
• The square roots of the variances of the estimators are usually called the standard errors of the estimators

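Calculating the S.E. of the Estimators
• To calculate the standard errors of the estimators, σ²u has to be replaced by an estimate of the error term variance, specifically SER²; therefore:

$$\mathrm{S.E.}(\hat{\beta}_0) = \sqrt{\frac{\mathrm{SER}^2 \sum X_i^2}{n \sum (X_i - \bar{X})^2}}, \qquad \mathrm{S.E.}(\hat{\beta}_1) = \sqrt{\frac{\mathrm{SER}^2}{\sum (X_i - \bar{X})^2}}$$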
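
The substitution of SER² for the unknown error variance can be sketched as follows. The data are hypothetical, and the code assumes the usual simple-regression definition SER² = (sum of squared residuals) / (n − 2), which this section does not restate.

```python
import math

# Hypothetical sample; these numbers are for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [19.0, 32.0, 41.0, 57.0, 66.0]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# OLS estimates of the slope and intercept.
total_variation = sum((x - x_bar) ** 2 for x in xs)
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / total_variation
b0 = y_bar - b1 * x_bar

# Assumption: SER^2 is the sum of squared residuals divided by n - 2.
residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
ser_sq = sum(e ** 2 for e in residuals) / (n - 2)

# Standard errors: the variance formulas with SER^2 in place of sigma_u^2.
se_b1 = math.sqrt(ser_sq / total_variation)
se_b0 = math.sqrt(ser_sq * sum(x ** 2 for x in xs) / (n * total_variation))

print(f"b0 = {b0:.3f}, S.E.(b0) = {se_b0:.3f}")
print(f"b1 = {b1:.3f}, S.E.(b1) = {se_b1:.3f}")
```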

Calculating the S.E. of the Estimators
• The standard error of the estimator β̂1 is the standard deviation of the sampling distribution of β̂1.
• The expression Σ(Xi − X̄)², which appears in the formula for the standard error of β̂1, is known as the total variation in X.

Calculating the S.E. of the Estimators
Example:
• σu = 5, β0 = 7 and β1 = 12
• Assume the total variation in X equals 9

Calculating the S.E. of the Estimators
What is the chance that β̂1 is between 11 and 13?
α = P(11 ≤ β̂1 ≤ 13) = 1 − 2P(β̂1 ≥ 13) = 1 − 2P(Z ≥ Zk), where Zk = (13 − β1)/σ(β̂1) = (13 − 12)/(5/√9) = 0.6
α = 1 − 2P(Z ≥ 0.6) = 1 − (2)(0.274) = 0.452
Thus, the probability α is about 45 percent.
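
The probability calculation can be reproduced without a Z table. This sketch uses the example values (σu = 5, β1 = 12, total variation in X equal to 9) and `statistics.NormalDist` from the Python standard library; the variable names are illustrative.

```python
import math
from statistics import NormalDist

beta1 = 12.0            # true slope in the example
sigma_u = 5.0           # standard deviation of the disturbance
total_variation = 9.0   # sum of (X_i - X_bar)^2 in the example

# Standard deviation of the slope estimator: sigma_u / sqrt(total variation).
sd_b1 = sigma_u / math.sqrt(total_variation)

# Standardize the upper bound of the interval [11, 13].
z_k = (13.0 - beta1) / sd_b1

# The interval is symmetric around beta1, so both tails are equal.
p_upper_tail = 1.0 - NormalDist().cdf(z_k)
alpha = 1.0 - 2.0 * p_upper_tail

print(f"sd(b1_hat) = {sd_b1:.3f}, Z_k = {z_k:.2f}")
print(f"P(11 <= b1_hat <= 13) = {alpha:.3f}")
```

The result agrees with the slide's 0.452 up to the rounding of the table value 0.274.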

Calculating the S.E. of the Estimators
• For a set of data for which the total variation in X is equal to 25:
• Standard error for this case: σ(β̂1) = 5/√25 = 1
• Probability for this case: α = P(11 ≤ β̂1 ≤ 13) = 0.68
• When the standard error is smaller, there is a greater probability that β̂1 will take on a value in a given interval centered on the true value of β1
• The smaller the standard error, the more precise β̂1 is as an estimator of β1
• The greater the total variation in X, the smaller the standard error
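
The link between total variation in X, the standard error, and precision can be tabulated for several cases. The sketch below keeps σu = 5 and the interval of ±1 around β1 from the example; the helper `prob_within` and the extra case of a total variation of 100 are hypothetical additions for illustration.

```python
import math
from statistics import NormalDist


def prob_within(half_width, sigma_u, total_variation):
    """Return (sd of b1_hat, P(|b1_hat - beta1| <= half_width)).

    Assumes b1_hat is normal with mean beta1 and standard deviation
    sigma_u / sqrt(total_variation).
    """
    sd_b1 = sigma_u / math.sqrt(total_variation)
    z = half_width / sd_b1
    return sd_b1, 2.0 * NormalDist().cdf(z) - 1.0


results = {}
for tv in (9.0, 25.0, 100.0):   # 9 and 25 are from the slides; 100 is hypothetical
    results[tv] = prob_within(half_width=1.0, sigma_u=5.0, total_variation=tv)
    sd_b1, prob = results[tv]
    print(f"total variation = {tv:5.0f}  sd(b1_hat) = {sd_b1:.3f}  P = {prob:.3f}")
```

As the total variation in X grows, the standard deviation of the estimator falls and the probability of landing within one unit of the true slope rises.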
