DATA ANALYSIS



  1. DATA ANALYSIS Module Code: CA660 Lecture Block 6: Alternative estimation methods and their implementation

  2. MAXIMUM LIKELIHOOD ESTIMATION • Recall general points: estimation, and the definition of the Likelihood function for a vector of parameters θ and a set of values x. • Find the most likely value of θ, i.e. maximise the Likelihood fn. • Also defined the Log-likelihood (Support fn. S(θ)) and its derivative, the Score, together with the Information content per observation, which for a single-parameter likelihood is given by $I(\theta) = E\left[\left(\frac{\partial S(\theta)}{\partial\theta}\right)^2\right] = -E\left[\frac{\partial^2 S(\theta)}{\partial\theta^2}\right]$ • Why MLE? (Need to know the underlying distribution.) • Properties: consistency; sufficiency; asymptotic efficiency (linked to variance); unique maximum; invariance and, hence, most convenient parameterisation; usually MVUE; amenable to conventional optimisation methods.
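
As a concrete illustration (not from the slides), a minimal sketch of these quantities for a Binomial proportion p, where the support, Score and Information all have closed forms; the counts x = 7, n = 10 are invented for the example:

```python
import numpy as np

# Binomial(n, p) with x observed successes: support (log-likelihood),
# Score (its first derivative) and Information.
def support(p, x, n):
    return x * np.log(p) + (n - x) * np.log(1 - p)   # constant term dropped

def score(p, x, n):
    return x / p - (n - x) / (1 - p)                 # dS/dp

def information(p, n):
    return n / (p * (1 - p))                         # -E[d2S/dp2]

x, n = 7, 10                                         # invented counts
p_hat = x / n                                        # MLE: the root of score(p) = 0
print(p_hat, score(p_hat, x, n))                     # Score vanishes at the MLE
print(1 / information(p_hat, n))                     # large-sample Var(p_hat)
```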

  3. VARIANCE, BIAS & CONFIDENCE • Variance of an Estimator - usual form $\mathrm{Var}(\hat\theta) = E[(\hat\theta - E[\hat\theta])^2]$, or $\mathrm{Var}(\hat\theta)/k$ for the mean of k independent estimates. • For a large sample, the variance of an MLE can be approximated by the inverse of the Information, $\mathrm{Var}(\hat\theta) \approx 1/I(\hat\theta)$; it can also be estimated empirically, using re-sampling techniques (sketch below). • Variance of a linear function of several estimates - a common need in genomics analysis (e.g. heritability) and in risk analysis: $\mathrm{Var}\big(\sum_i a_i\hat\theta_i\big) = \sum_i a_i^2\,\mathrm{Var}(\hat\theta_i) + 2\sum_{i<j} a_i a_j\,\mathrm{Cov}(\hat\theta_i,\hat\theta_j)$ • Recall the Bias of the Estimator, $B(\hat\theta) = E[\hat\theta] - \theta$; then the Mean Square Error is defined to be $\mathrm{MSE}(\hat\theta) = E[(\hat\theta - \theta)^2]$, which expands to $\mathrm{Var}(\hat\theta) + [B(\hat\theta)]^2$ • So we have the basis for C.I.s and tests of hypothesis.
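
The slide mentions estimating the MLE's variance empirically by re-sampling; below is a hedged sketch comparing the large-sample approximation $1/I(\hat\theta)$ with a nonparametric bootstrap, on an invented exponential sample (for the exponential mean θ, $I(\theta) = n/\theta^2$, a standard result):

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.exponential(scale=2.0, size=200)   # invented sample; MLE of the mean = x-bar

# Large-sample approximation: Var(theta_hat) ~ 1 / I(theta_hat) = theta_hat^2 / n.
theta_hat = data.mean()
var_asymptotic = theta_hat**2 / len(data)

# Empirical alternative: nonparametric bootstrap re-sampling.
boot = [rng.choice(data, size=len(data), replace=True).mean() for _ in range(2000)]
var_bootstrap = np.var(boot, ddof=1)

print(var_asymptotic, var_bootstrap)          # the two estimates should agree closely
```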

  4. COMMONLY-USED METHODS of obtaining MLE • Analytical - solving $\partial S(\theta)/\partial\theta = 0$ (or the corresponding system of equations) when simple solutions exist • Grid search or likelihood profile approach • Newton-Raphson iteration methods • EM (expectation and maximisation) algorithm • N.B. Work with the Log-likelihood because: it is maximised at the same θ value as the Likelihood; it is easier to compute; and there is a close relationship between the statistical properties of the MLE and the Log-likelihood.

  5. MLE Methods in outline • Analytical: recall the Binomial example earlier. • Example: for the Normal, the MLEs of the mean and variance (taking derivatives w.r.t. mean and variance separately) are $\hat\mu = \bar{x}$ and $\hat\sigma^2 = \frac{1}{N}\sum_i (x_i - \bar{x})^2$, i.e. the sample mean and the actual variance (divisor N) - unbiased if the mean is known, biased if not; see the sketch below. • Invariance: one-to-one relationships are preserved. • Used when the MLE has a simple solution.
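
A quick numerical check of the bias point, on an invented Normal sample: the MLE of the variance divides by N and is biased downward when the mean is also estimated, while the usual divisor N − 1 removes the bias:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(loc=5.0, scale=2.0, size=50)   # invented sample

mu_hat = x.mean()                  # MLE of the mean = sample mean
var_mle = np.var(x, ddof=0)        # MLE of the variance: divisor N (biased)
var_unbiased = np.var(x, ddof=1)   # sample variance: divisor N - 1 (unbiased)

print(mu_hat, var_mle, var_unbiased)
```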

  6. MLE Methods in outline contd. • Grid Search - computational. • Plot the likelihood or log-likelihood vs the parameter. Various features: • Relative Likelihood = Likelihood / Max. Likelihood (ML point set = 1). • The peak of the R.L. can be identified visually or sought algorithmically (sketch below), e.g. plotting the likelihood over the parameter-space range can give 2 peaks, symmetrical around 0.5 (the likelihood profile for e.g. the well-known mixed linkage analysis problem, or for the similar example of populations following known proportion splits). • If we now constrain θ = R.F. between genes to θ ≤ 0.5, the MLE solution is unique (resolving the possible mixed linkage phase).
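
A minimal grid-search sketch in the spirit of this slide, for a single proportion (the counts are invented): the relative likelihood is scaled so the peak equals 1, and the grid is constrained to (0, 0.5) as for a recombination fraction:

```python
import numpy as np

x, n = 22, 100                            # invented counts, e.g. recombinants out of n
grid = np.linspace(0.001, 0.499, 499)     # parameter space constrained to (0, 0.5)
loglik = x * np.log(grid) + (n - x) * np.log(1 - grid)
rel_lik = np.exp(loglik - loglik.max())   # Relative Likelihood: peak set = 1

theta_hat = grid[np.argmax(rel_lik)]
print(theta_hat)                          # ~ x / n = 0.22
```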

  7. MLE Methods in outline contd. • Graphic/numerical implementation - take an initial estimate of θ. The direction of search is determined by evaluating the likelihood to both sides of θ; the search takes the direction giving an increase, because we are looking for a maximum. Initial search increments are large, e.g. 0.1; then, when the likelihood change starts to decrease or become negative, stop and refine the increment (see the sketch below). Issues: • Multiple peaks - can miss the global maximum; computationally intensive; see e.g. http://statgen.iop.kcl.ac.uk/bgim/mle/sslike_1.html • Multiple parameters - grid search; interpretation of likelihood profiles can be difficult, e.g. http://blogs.sas.com/content/iml/2011/10/12/maximum-likelihood-estimation-in-sasiml/
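
A sketch of the coarse-to-fine search this slide describes, under the stated assumptions (start at θ = 0.25, initial increment 0.1, halve the increment when neither direction improves); the log-likelihood reused here is the invented binomial one from the previous sketch, and a real implementation would restart from several initial values to guard against multiple peaks:

```python
import math

def refine_search(loglik, theta=0.25, step=0.1, tol=1e-6, lo=1e-6, hi=0.5 - 1e-6):
    # Evaluate the likelihood to both sides of theta; move in the direction of
    # increase; when neither side improves, refine (halve) the increment.
    while step > tol:
        for cand in (theta + step, theta - step):
            if lo <= cand <= hi and loglik(cand) > loglik(theta):
                theta = cand
                break
        else:
            step /= 2
    return theta

ll = lambda t: 22 * math.log(t) + 78 * math.log(1 - t)   # invented counts again
print(refine_search(ll))                                  # converges to ~0.22
```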

  8. Example in outline • Data e.g. used to show a linkage relationship (non-independence) between e.g. a marker and a given disease gene, or (e.g.) between sex and purchase of computer games. • Escapes = individuals who are susceptible, but show no disease phenotype under experimental conditions (express interest but have no purchase record). So define β and r as the proportion of escapes and the R.F. respectively. • 1 − β is then the penetrance for the disease trait, or of purchasing, i.e. P{individual with susceptible genotype has disease phenotype}, P{individual of given sex and interested who actually buys}. • Purpose of expt. - typically to estimate the R.F. between marker and gene, or the proportion of a sex that purchases. • Use: Support function = Log-Likelihood. Often quite complex; for the above example it is a function S(β, r) of both parameters.

  9. Example contd. • Set the 1st derivatives (Scores) w.r.t. β and w.r.t. r to zero. • The expected value of the Score (w.r.t. β) is zero (see analogies in classical sampling/hypothesis testing); similarly for r. Here, however, there is no simple analytical solution, so we cannot solve directly for either parameter. • Using a grid search, the likelihood reaches its maximum at the estimates (β̂, r̂); see the sketch below. • In general, this type of experiment tests H0: independence between the factors (marker and gene; sex and purchase), and H0: no escapes. • Uses Likelihood Ratio Test statistics (M.L.E. χ² equivalent).
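
The slides' own two-parameter likelihood did not survive transcription, so the sketch below substitutes an invented two-category model, in which the probability of showing the phenotype is the penetrance (1 − β) times a linkage-dependent factor (1 − r), purely to illustrate a two-parameter grid search; the counts are also invented:

```python
import numpy as np

counts = np.array([60, 40])          # invented counts: affected, unaffected

def loglik(r, beta):
    # Invented model: P(affected) = (1 - beta) * (1 - r).
    p = (1 - beta) * (1 - r)
    return counts[0] * np.log(p) + counts[1] * np.log(1 - p)

r_grid = np.linspace(0.01, 0.49, 49)
b_grid = np.linspace(0.01, 0.99, 99)
S = np.array([[loglik(r, b) for b in b_grid] for r in r_grid])
i, j = np.unravel_index(np.argmax(S), S.shape)
print(r_grid[i], b_grid[j])          # one point on the maximising ridge
```

Because only the product (1 − β)(1 − r) enters this invented model, the likelihood surface has a flat ridge rather than a single peak - a small illustration of why the earlier slide warns that likelihood profiles can be difficult to interpret.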

  10. MLE Methods in outline contd. • Newton-Raphson Iteration • Have Score U(θ) = 0 from previously. N-R consists of replacing the Score by the linear terms of its Taylor expansion, so if θ″ is a solution and θ′ the 1st guess: $\theta'' = \theta' - \frac{S'(\theta')}{S''(\theta')}$ • Repeat with θ″ replacing θ′. • Each iteration fits a parabola to the Likelihood Fn. • Problems - multiple peaks, zero Information, extreme estimates. • Multiple parameters - need matrix notation, where the Score vector has elements = derivatives of S(θ, β) w.r.t. θ and β respectively; similarly, the Information matrix has terms of the form $I_{jk} = -E\left[\frac{\partial^2 S}{\partial\theta_j\,\partial\theta_k}\right]$ • Estimates are updated using the L.F.'s 2nd and 1st derivatives, $\boldsymbol{\theta}'' = \boldsymbol{\theta}' + \mathbf{I}^{-1}(\boldsymbol{\theta}')\,\mathbf{U}(\boldsymbol{\theta}')$, the inverse Information also giving the variance from the curvature of the Log-L, i.e. S(θ).
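
A sketch of Newton-Raphson for the same invented binomial support function, with the Score and its derivative in closed form; each update is θ″ = θ′ − S′(θ′)/S″(θ′), the linearised-Score step the slide describes:

```python
def newton_raphson(x, n, theta=0.4, max_iter=50, tol=1e-10):
    for _ in range(max_iter):
        score = x / theta - (n - x) / (1 - theta)          # S'(theta)
        hess = -x / theta**2 - (n - x) / (1 - theta)**2    # S''(theta)
        step = score / hess
        theta -= step                                      # theta'' = theta' - S'/S''
        if abs(step) < tol:
            break
    return theta

print(newton_raphson(22, 100))   # ~0.22, agreeing with the grid search
```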

  11. MLE Methods in outline contd. • Expectation-Maximisation Algorithm - iterative; for incomplete data. • (Much genomic, financial and other data fit this situation, e.g. linkage analysis with marker genotypes of F2 progeny. Usually 9 categories are observed for the 2-locus, 2-allele model, but 16 = complete info., while 14 give info. on linkage. Some are hidden, but if the linkage parameter is known, expected frequencies can be predicted and the complete data restored using expectation.) • Steps: (1) Expectation estimates the statistics of the complete data, given the observed incomplete data. (2) Maximisation uses the estimated complete data to give the MLE. • Iterate till convergence (no further change).

  12. E-M contd. • Implementation • Initial guess, θ′, chosen (e.g. θ′ = 0.25, say, for an R.F.). • Taking this as "true", the complete data are estimated via distributional statements, e.g. P(individual is recombinant | observed genotype) for R.F. estimation. • The MLE estimate θ″ is computed; for an R.F. this is the expected sum of recombinants / N, so with observed counts f_i, $\theta'' = \frac{1}{N}\sum_i f_i\,P(\text{recombinant}\mid\text{genotype } i)$ • Convergence: θ″ = θ′, or |θ″ − θ′| below a set tolerance (sketch below).
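
A runnable E-M sketch on the classic multinomial linkage example of Dempster, Laird & Rubin (their counts, not this module's data): the first observed category mixes a cell of probability 1/2 with a cell of probability θ/4, so its split is hidden; the E-step restores the expected split and the M-step computes the complete-data MLE:

```python
y = [125, 18, 20, 34]   # textbook counts (Dempster, Laird & Rubin, 1977)
theta = 0.25            # initial guess theta', as on the slide

for _ in range(200):
    # E-step: expected count in the hidden theta/4 component of y[0]
    x12 = y[0] * (theta / 4) / (1 / 2 + theta / 4)
    # M-step: complete-data MLE of theta
    theta_new = (x12 + y[3]) / (x12 + y[1] + y[2] + y[3])
    if abs(theta_new - theta) < 1e-10:   # convergence: theta'' = theta'
        theta = theta_new
        break
    theta = theta_new

print(theta)   # converges to ~0.6268
```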

  13. LIKELIHOOD: C.I. and H.T. • Likelihood Ratio Tests - c.f. with χ². • Principal advantage of G is Power, as unknown parameters can be involved in the hypothesis test. • Have: the likelihood of θ taking the value θ_A which maximises it, i.e. its MLE, and the likelihood under H0: θ = θ_N (e.g. θ_N = 0.5). • Form of the L.R. test statistic: $\Lambda = L(\theta_N)/L(\theta_A)$ or, conventionally, $G = -2\ln\Lambda = 2[\ln L(\theta_A) - \ln L(\theta_N)]$ - the latter chosen as easier to interpret. • Distribution of G ~ approx. χ² (d.o.f. = difference in dimension of the parameter spaces for L(θ_A), L(θ_N)). • Goodness of Fit: notation as for χ², $G = 2\sum_i O_i \ln(O_i/E_i) \sim \chi^2_{n-1}$. • Independence: notation again as for χ².
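
A sketch of the G test for the invented binomial counts used throughout, testing H0: θ_N = 0.5 against the MLE θ_A = x/n; SciPy is assumed available for the χ² tail probability:

```python
import numpy as np
from scipy import stats

x, n, theta0 = 22, 100, 0.5

def support(theta):
    return x * np.log(theta) + (n - x) * np.log(1 - theta)

G = 2 * (support(x / n) - support(theta0))   # G = 2[ln L(theta_A) - ln L(theta_N)]
p_value = stats.chi2.sf(G, df=1)             # d.o.f. = difference in dimension = 1
print(G, p_value)                            # large G favours rejecting H0
```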

  14. Likelihood C.I.'s - graphical method • Example: consider the likelihood function $L(\theta) = (1-\theta)^a\,\theta^b$, where θ is the unknown parameter and a, b are observed counts. • For 4 data sets observed: A: (a,b) = (8,2); B: (a,b) = (16,4); C: (a,b) = (80,20); D: (a,b) = (400,100). • Likelihood estimates can be plotted vs the possible parameter values, with the MLE at the peak value, e.g. MLE = 0.2 with Lmax = 0.0067 for A, Lmax = 0.000045 for B, etc. • Set A: Log Lmax − Log L = Log(0.0067) − Log(0.00091) = 2 gives the ≈95% C.I., so θ = (0.035, 0.506), corresponding to L = 0.00091, is the ≈95% C.I. for A. • Similarly, manipulating this expression, the likelihood value corresponding to the ≈95% confidence interval is given by L = (7.389)⁻¹ Lmax, i.e. L = e⁻² Lmax. • Note: usually plot the Log-likelihood vs the parameter, rather than the Likelihood. • As the sample size increases, the C.I. becomes narrower and more symmetric.
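
A sketch of the graphical method for data set A, using the likelihood form L(θ) = (1 − θ)^a θ^b that reproduces the slide's Lmax = 0.0067 and MLE = 0.2; the interval is read off where the log-likelihood is within 2 units of its maximum, i.e. L ≥ Lmax/e²:

```python
import numpy as np

a, b = 8, 2                                   # data set A
grid = np.linspace(0.001, 0.999, 9999)
loglik = a * np.log(1 - grid) + b * np.log(grid)
inside = loglik >= loglik.max() - 2           # drop of 2 log-likelihood units

print(grid[np.argmax(loglik)])                # MLE ~ 0.2
print(grid[inside][0], grid[inside][-1])      # ~ (0.035, 0.506): approx. 95% C.I.
```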

  15. Maximum Likelihood Benefits • Strong estimator properties - sufficiency, efficiency, consistency, non-bias etc., as before. • Good Confidence Intervals - coverage probability is realised and the intervals are meaningful. • MLE a good estimator for a C.I.: MSE-consistent; absence of bias - though this does not "stand alone", as minimum variance is also important. • Asymptotically Normal. • Precise for large samples - inferences valid, ranges realistic.
