Maximum-Likelihood estimation

Consider as usual a random sample x = x1, … , xn from a distribution with p.d.f. f(x; θ) (and c.d.f. F(x; θ)).

The maximum likelihood point estimator of θ is the value of θ that maximizes the likelihood L(θ; x), or equivalently maximizes the log-likelihood l(θ; x) = ln L(θ; x).

Useful notation: With a k-dimensional parameter, θ = (θ1, … , θk).
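To make the equivalence concrete, here is a minimal Python sketch (not from the original slides) that evaluates both L(θ; x) and l(θ; x) on a grid for an assumed exponential sample; the distribution, sample size and true parameter value are illustration choices only.

```python
import numpy as np

# Assumed example: random sample from an exponential distribution with
# p.d.f. f(x; theta) = theta * exp(-theta * x).
rng = np.random.default_rng(1)
x = rng.exponential(scale=1 / 1.5, size=50)    # true theta = 1.5 (assumed)

theta_grid = np.linspace(0.1, 5.0, 2000)

# Likelihood and log-likelihood evaluated on the grid
lik = np.array([np.prod(t * np.exp(-t * x)) for t in theta_grid])
loglik = np.array([len(x) * np.log(t) - t * np.sum(x) for t in theta_grid])

# Both are maximized at the same theta, which for this model also
# equals the closed-form MLE 1 / mean(x).
print(theta_grid[np.argmax(lik)], theta_grid[np.argmax(loglik)], 1 / x.mean())
```

Because the logarithm is monotone, both curves peak at the same value of θ.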
Complete sample case: If all sample values are explicitly known, then

L(θ; x) = ∏_{i=1}^n f(xi; θ)

Censored data case: If some (say nc) of the sample values are censored, e.g. we only know that xi < k1 or xi > k2, then

L(θ; x) = ∏_{i: xi observed} f(xi; θ) · ∏_{i: xi < k1} Pr(Xi < k1; θ) · ∏_{i: xi > k2} Pr(Xi > k2; θ)

where k1 and k2 are the (known) censoring limits.
When the sample comes from a continuous distribution, the censored data case can be written

L(θ; x) = ∏_{uncensored} f(xi; θ) · [F(k1; θ)]^{n1} · [1 − F(k2; θ)]^{n2}

where n1 and n2 are the numbers of values censored below k1 and above k2 (n1 + n2 = nc).

When the distribution is discrete the use of F is also possible: If k1 and k2 are values that can be attained by the random variables, then we may write the censored contributions as Pr(Xi ≤ k1; θ) = F(k1; θ) and Pr(Xi > k2; θ) = 1 − F(k2; θ), where F is the c.d.f. of the distribution.
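As a hedged illustration of the censored-data likelihood, the sketch below maximizes the log-likelihood of a normal sample that is right-censored at a limit k2; the choice of distribution, the censoring limit and the parameterization via log σ are assumptions made for this example, not part of the original slides.

```python
import numpy as np
from scipy import optimize, stats

# Assumed example: normal sample right-censored at k2, i.e. for some
# observations we only know that x_i > k2.
rng = np.random.default_rng(2)
full = rng.normal(loc=10.0, scale=2.0, size=200)
k2 = 12.0
observed = full[full <= k2]          # explicitly known values
n_cens = np.sum(full > k2)           # number of right-censored values

def neg_log_lik(par):
    mu, log_sigma = par
    sigma = np.exp(log_sigma)        # keep sigma > 0
    # f-contributions from observed values, (1 - F(k2)) from censored ones
    ll = np.sum(stats.norm.logpdf(observed, mu, sigma))
    ll += n_cens * stats.norm.logsf(k2, mu, sigma)
    return -ll

res = optimize.minimize(neg_log_lik, x0=[observed.mean(), 0.0],
                        method="Nelder-Mead")
mu_hat, sigma_hat = res.x[0], np.exp(res.x[1])
print(mu_hat, sigma_hat)
```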
Example: The solution must be found numerically.
For the exponential family of distributions: Use the canonical form (natural parameterization)

f(x; θ) = c(θ) h(x) exp{ Σ_{j=1}^k ηj(θ) tj(x) }

Let tj = Σ_{i=1}^n tj(xi), j = 1, … , k (the observed values of the sufficient statistics).

Then the maximum likelihood estimators (MLEs) of η1, … , ηk are found by solving the system of equations

Eη[ Σ_{i=1}^n tj(Xi) ] = tj ,  j = 1, … , k

i.e. by equating the expected and the observed values of the sufficient statistics.
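A small sketch of this idea for an assumed gamma sample (shape a and rate b are the parameterization chosen for the illustration): the MLE equations equate the expected and observed values of the sufficient statistics Σxi and Σln xi, and are solved numerically.

```python
import numpy as np
from scipy import optimize, special

# Assumed example: Gamma(a, rate b) is a two-parameter exponential family
# with sufficient statistics sum(x) and sum(ln x).  The MLE equations are
#   E[X]    = a / b          = mean(x)
#   E[ln X] = psi(a) - ln b  = mean(ln x)
rng = np.random.default_rng(3)
x = rng.gamma(shape=2.5, scale=1 / 0.8, size=500)   # true a = 2.5, b = 0.8

mean_x, mean_lnx = x.mean(), np.log(x).mean()

def equations(par):
    a, b = par
    return [a / b - mean_x,
            special.digamma(a) - np.log(b) - mean_lnx]

# Moment-based starting values keep the root-finder in the positive region
a0 = mean_x ** 2 / x.var()
b0 = mean_x / x.var()
a_hat, b_hat = optimize.fsolve(equations, x0=[a0, b0])
print(a_hat, b_hat)
```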
Computational aspects

• When the MLEs can be found by solving the likelihood equations ∂l(θ; x)/∂θj = 0, j = 1, … , k, numerical routines for solving the generic equation g(θ) = 0 can be used:
• Newton–Raphson method
• Fisher's method of scoring (makes use of the fact that under regularity conditions the expected Hessian of the log-likelihood equals minus the Fisher information, E[∂²l/∂θ∂θᵀ] = −E[(∂l/∂θ)(∂l/∂θ)ᵀ], so the expected information can replace the observed Hessian in the iterations)
• This is the multidimensional analogue of Lemma 2.1 (see page 17)
• Both iterations are sketched below.
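Below is a minimal sketch of both iterations for an assumed Cauchy location problem (chosen because its score equation has no closed-form root); the sample, starting value and number of iterations are illustration choices.

```python
import numpy as np

# Assumed example: location parameter of a Cauchy distribution.
# Newton-Raphson uses the observed second derivative l''(theta);
# Fisher scoring replaces it by minus the expected information, I(theta) = n/2.
rng = np.random.default_rng(4)
x = rng.standard_cauchy(100) + 3.0        # true location = 3

def score(theta):                          # l'(theta)
    u = x - theta
    return np.sum(2 * u / (1 + u ** 2))

def hessian(theta):                        # l''(theta)
    u = x - theta
    return np.sum(2 * (u ** 2 - 1) / (1 + u ** 2) ** 2)

theta_nr = theta_fs = np.median(x)         # robust starting value
for _ in range(25):
    theta_nr = theta_nr - score(theta_nr) / hessian(theta_nr)   # Newton-Raphson
    theta_fs = theta_fs + score(theta_fs) / (len(x) / 2)        # Fisher scoring
print(theta_nr, theta_fs)
```

In practice one would add step-halving or similar safeguards; this sketch only shows the two update rules side by side.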
When the MLEs cannot be found the above way, other numerical routines must be used:
• Simplex method
• EM-algorithm (a small sketch follows below)
• For a description of the numerical routines, see the textbook.
• Maximum likelihood estimation comes into natural use not for handling the standard case, i.e. a complete random sample from a distribution within the exponential family, but for finding estimators in more non-standard and complex situations.
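The following sketch illustrates the EM idea on right-censored exponential data (an assumed setting that connects to the censoring slides above); the rate, censoring limit and sample size are illustration choices.

```python
import numpy as np

# Assumed example: EM algorithm for exponential data right-censored at c
# (for censored points we only know x_i > c).
# E-step: by the lack-of-memory property, E[X | X > c] = c + 1/lambda.
# M-step: complete-data MLE  lambda = n / sum(imputed x).
rng = np.random.default_rng(5)
true_rate, c = 0.5, 3.0
x = rng.exponential(scale=1 / true_rate, size=300)
observed = x[x <= c]
n_cens = np.sum(x > c)
n = len(x)

lam = 1.0                                  # starting value
for _ in range(200):
    expected_cens_sum = n_cens * (c + 1 / lam)            # E-step
    lam = n / (observed.sum() + expected_cens_sum)        # M-step

# The EM fixed point agrees with the direct censored-data MLE,
# which happens to be available in closed form for this model.
print(lam, len(observed) / (observed.sum() + n_cens * c))
```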
Properties of MLEs

Invariance: If θ̂ is the MLE of θ, then for any function g the MLE of g(θ) is g(θ̂).

Consistency: Under some weak regularity conditions all MLEs are consistent.

Efficiency: Under the usual regularity conditions, √n (θ̂ − θ) converges in distribution to N(0, I1(θ)⁻¹) (asymptotically efficient and normally distributed).
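A quick Monte Carlo check of the asymptotic normality claim, using the exponential-rate MLE as an assumed example (the parameter value, sample size and number of replications are illustration choices):

```python
import numpy as np

# Assumed example: for Exp(theta) the MLE is 1/mean(x) and I_1(theta) = 1/theta^2,
# so sqrt(n)*(theta_hat - theta) should be approximately N(0, theta^2).
rng = np.random.default_rng(6)
theta, n, reps = 2.0, 400, 5000
mles = np.array([1 / rng.exponential(scale=1 / theta, size=n).mean()
                 for _ in range(reps)])

z = np.sqrt(n) * (mles - theta)
print(z.mean(), z.std())   # should be close to 0 and to theta = 2.0
```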
Sufficiency: If T(x) is a sufficient statistic for θ, then the MLE of θ is a function of T(x).

Example:
i.e. the two MLEs are asymptotically uncorrelated (and, by the asymptotic normal distribution, asymptotically independent)
Modifications and extensions

Ancillarity and conditional sufficiency: A statistic A(x) is ancillary for θ if its distribution does not depend on θ; inference on θ can then be made conditional on the observed value of A(x), and a statistic that is sufficient for θ in this conditional model is said to be conditionally sufficient given A(x).
Profile likelihood: Partition θ = (θ1, θ2) and define the profile likelihood of θ1 as LP(θ1; x) = max over θ2 of L(θ1, θ2; x). This concept has its main use in cases where θ1 contains the parameters of "interest" and θ2 contains nuisance parameters. The same ML point estimator for θ1 is obtained by maximizing the profile likelihood as by maximizing the full likelihood function.
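A minimal sketch with the normal mean as the parameter of interest and σ² as the nuisance parameter (an assumed example): the inner maximization has a closed form, and the maximizer of the resulting profile log-likelihood coincides with the full MLE x̄.

```python
import numpy as np

# Assumed example: profile log-likelihood for the mean mu of a normal sample,
# with sigma^2 treated as a nuisance parameter.  For fixed mu the inner
# maximization gives sigma^2(mu) = mean((x - mu)^2).
rng = np.random.default_rng(7)
x = rng.normal(loc=5.0, scale=2.0, size=100)
n = len(x)

def profile_loglik(mu):
    sigma2_hat = np.mean((x - mu) ** 2)        # max over the nuisance parameter
    return -0.5 * n * (np.log(2 * np.pi * sigma2_hat) + 1)

mu_grid = np.linspace(3, 7, 4001)
lp = np.array([profile_loglik(m) for m in mu_grid])

# The maximizer of the profile likelihood coincides with the full MLE (x-bar)
print(mu_grid[np.argmax(lp)], x.mean())
```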
Marginal and conditional likelihood: A marginal likelihood for θ1 is based on the marginal distribution of a statistic whose distribution depends on θ1 only, and a conditional likelihood on the conditional distribution of the data given a statistic that removes the dependence on θ2. Again, these concepts have their main use in cases where θ1 contains the parameters of "interest" and θ2 contains nuisance parameters.
Penalized likelihood: MLEs can be derived subject to some criterion of smoothness. In particular this is applicable when the parameter is no longer a single value (one- or multidimensional), but a function such as an unknown density function or a regression curve. The penalized log-likelihood function is written

lP(θ; x) = l(θ; x) − λ·J(θ)

where J(θ) is a roughness (penalty) functional and λ ≥ 0 controls the degree of smoothing.
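As an assumed illustration of a function-valued parameter, the sketch below estimates a log-mean curve for Poisson counts by maximizing a penalized log-likelihood with a second-difference roughness penalty; the mean function, the penalty weight λ and the optimizer are illustration choices.

```python
import numpy as np
from scipy import optimize

# Assumed example: Poisson counts y_i with mean exp(f(z_i)); the unknown curve
# is represented by its values f = (f_1, ..., f_n) and roughness is penalized
# through second differences:
#   l_p(f) = sum(y_i f_i - exp(f_i)) - lam * ||D f||^2
rng = np.random.default_rng(8)
n = 100
z = np.linspace(0, 1, n)
y = rng.poisson(np.exp(1 + np.sin(2 * np.pi * z)))

D = np.diff(np.eye(n), n=2, axis=0)        # second-difference (roughness) matrix
lam = 10.0                                 # smoothing parameter (assumed)

def neg_pen_loglik(f):
    return -(np.sum(y * f - np.exp(f)) - lam * np.sum((D @ f) ** 2))

def grad(f):
    return -(y - np.exp(f) - 2 * lam * (D.T @ (D @ f)))

res = optimize.minimize(neg_pen_loglik, x0=np.log(y + 0.5),
                        jac=grad, method="L-BFGS-B")
f_hat = res.x                              # estimated log-mean curve
print(res.success, f_hat[:5])
```

Larger λ gives a smoother (flatter) estimated curve; λ = 0 reproduces the unpenalized MLE f_i = ln y_i where y_i > 0.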
Method of moments

The method of moments point estimator of θ = (θ1, … , θk) is obtained by solving for θ1, … , θk the system of equations

(1/n) Σ_{i=1}^n xi^j = Eθ[X^j] = μj(θ1, … , θk) ,  j = 1, … , k
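A small sketch for an assumed gamma sample (shape a, scale s): matching the first two moments gives closed-form method-of-moments estimators.

```python
import numpy as np

# Assumed example: method of moments for Gamma(shape a, scale s).
# Matching the first two moments:
#   E[X]   = a * s    = mean(x)
#   Var(X) = a * s^2  = var(x)
rng = np.random.default_rng(9)
x = rng.gamma(shape=3.0, scale=2.0, size=1000)

m, v = x.mean(), x.var()
a_mm = m ** 2 / v
s_mm = v / m
print(a_mm, s_mm)    # should be near the true values 3.0 and 2.0
```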
Method of Least Squares (LS)

First principles: Assume a sample x where the random variable Xi can be written

Xi = μi(θ) + εi ,  E(εi) = 0

The least-squares estimator of θ is the value of θ that minimizes the sum of squared deviations, i.e.

θ̂LS = argminθ Σ_{i=1}^n (xi − μi(θ))²
A more general approach: Assume the sample can be written (x, z) = ((x1, z1), … , (xn, zn)), where xi represents the random variable of interest (endogenous variable) and zi represents either an auxiliary random variable (exogenous) or a given constant for sample point i, with

Xi = g(zi; θ) + εi ,  E(εi) = 0

The least squares estimator of θ is then

θ̂LS = argminθ Σ_{i=1}^n (xi − g(zi; θ))²
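A minimal sketch with an assumed nonlinear mean function g(z; θ) = θ1·(1 − exp(−θ2·z)); the data-generating values and the use of scipy's least_squares routine are illustration choices.

```python
import numpy as np
from scipy import optimize

# Assumed example: general least squares with an exogenous variable z,
# x_i = g(z_i; theta) + eps_i, with g(z; theta) = theta_1 * (1 - exp(-theta_2 * z)).
rng = np.random.default_rng(10)
z = np.linspace(0, 10, 80)
theta_true = (4.0, 0.7)
x = theta_true[0] * (1 - np.exp(-theta_true[1] * z)) + rng.normal(0, 0.3, size=z.size)

def residuals(theta):
    return x - theta[0] * (1 - np.exp(-theta[1] * z))

res = optimize.least_squares(residuals, x0=[1.0, 1.0])
print(res.x)   # least-squares estimate of (theta_1, theta_2)
```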
Special cases:

The ordinary linear regression model: xi = β0 + β1·zi + εi with Var(εi) = σ² constant; the LS estimator is the usual OLS estimator.

The heteroscedastic regression model: Var(εi) = σi² varies between observations; the LS criterion is then weighted, minimizing Σ (xi − β0 − β1·zi)² / σi² (weighted least squares). A sketch of both follows below.
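A small sketch of both special cases (the coefficients, the error-variance pattern and the design are assumed for illustration): ordinary LS solves the normal equations, and the heteroscedastic model leads to weighted normal equations.

```python
import numpy as np

# Assumed example: ordinary vs. heteroscedastic (weighted) linear regression.
rng = np.random.default_rng(11)
n = 200
z = rng.uniform(0, 10, n)
sigma_i = 0.5 + 0.3 * z                       # error s.d. grows with z (heteroscedastic)
x = 1.0 + 2.0 * z + rng.normal(0, sigma_i)    # true beta0 = 1, beta1 = 2

Z = np.column_stack([np.ones(n), z])          # design matrix

# Ordinary least squares: minimize sum (x_i - z_i' beta)^2
beta_ols = np.linalg.solve(Z.T @ Z, Z.T @ x)

# Weighted least squares: minimize sum (x_i - z_i' beta)^2 / sigma_i^2
W = 1 / sigma_i ** 2
beta_wls = np.linalg.solve(Z.T @ (W[:, None] * Z), Z.T @ (W * x))

print(beta_ols, beta_wls)
```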
The first-order auto-regressive model: Xi = β·Xi−1 + εi

The conditional least-squares estimator of β (given x1) is

β̂ = Σ_{i=2}^n xi·xi−1 / Σ_{i=2}^n xi−1²
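A minimal sketch for an assumed AR(1) series: conditioning on the first observation, the conditional LS estimator has the closed form given above.

```python
import numpy as np

# Assumed example: conditional least squares for the AR(1) model
# X_i = beta * X_{i-1} + eps_i, minimizing sum_{i>=2} (x_i - beta * x_{i-1})^2.
rng = np.random.default_rng(12)
n, beta_true = 500, 0.6
x = np.zeros(n)
for i in range(1, n):
    x[i] = beta_true * x[i - 1] + rng.normal()

beta_hat = np.sum(x[1:] * x[:-1]) / np.sum(x[:-1] ** 2)
print(beta_hat)   # should be close to 0.6
```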