
Machine Learning – Expectation Maximization. Wilson McKerrow (Fenyo lab postdoc). Contact: Wilson.McKerrow@nyulangone.org



Presentation Transcript


  1. Machine Learning – Expectation Maximization. Wilson McKerrow (Fenyo lab postdoc). Contact: Wilson.McKerrow@nyulangone.org

  2. Maximum likelihood estimation (MLE)

  3. Maximum likelihood estimation (MLE) • From a certain family of distributions (e.g. normal distributions), we want to pick the distribution that best describes the data.

  4. Maximum likelihood estimation (MLE) • From a certain family of distributions (e.g. normal distributions), we want to pick the distribution that best describes the data. • Define an appropriate loss function?

  5. Maximum likelihood estimation (MLE) • From a certain family of distributions (e.g. normal distributions), we want to pick the distribution that best describes the data. • Define an appropriate loss function? • How about picking the distribution that maximizes the probability (likelihood) of the data?

  6. Maximum likelihood estimation (MLE) • Let’s find the normal distribution that best fits this data (i.e. maximizes the likelihood).

  7. Maximum likelihood estimation (MLE) • Likelihood for one value is given by the density: f(x; 𝜇, 𝜎) = (1/√(2π𝜎²)) exp(−(x − 𝜇)²/(2𝜎²))

  8. Maximum likelihood estimation (MLE) • Likelihood for one value is given by the density: f(x; 𝜇, 𝜎) = (1/√(2π𝜎²)) exp(−(x − 𝜇)²/(2𝜎²)) • Likelihood of multiple independent values is the product of their individual likelihoods: L(𝜇, 𝜎) = f(x₁; 𝜇, 𝜎) · f(x₂; 𝜇, 𝜎) ⋯ f(xₙ; 𝜇, 𝜎)

  9. Maximum likelihood estimation (MLE) • Goal: find 𝜇, 𝜎 that maximize L(𝜇, 𝜎) = ∏ᵢ f(xᵢ; 𝜇, 𝜎) • Strategy: take the log, then the derivative, set it to zero, and solve:

  10. Maximum likelihood estimation (MLE) Math on board
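The algebra done on the board can be sketched as follows (the standard normal-MLE derivation):

```latex
\log L(\mu,\sigma) = -\frac{n}{2}\log(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^n (x_i-\mu)^2
```
```latex
\frac{\partial \log L}{\partial \mu} = \frac{1}{\sigma^2}\sum_{i=1}^n (x_i-\mu) = 0
\quad\Rightarrow\quad \hat\mu = \frac{1}{n}\sum_{i=1}^n x_i
```
```latex
\frac{\partial \log L}{\partial \sigma^2} = -\frac{n}{2\sigma^2} + \frac{1}{2\sigma^4}\sum_{i=1}^n (x_i-\mu)^2 = 0
\quad\Rightarrow\quad \hat\sigma^2 = \frac{1}{n}\sum_{i=1}^n (x_i-\hat\mu)^2
```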

  11. Maximum likelihood estimation (MLE) • Solution: 𝜇̂ = (1/n) Σᵢ xᵢ = x̄ and 𝜎̂² = (1/n) Σᵢ (xᵢ − 𝜇̂)²

  12. Maximum likelihood estimation (MLE) Example in R
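The slide's demo is in R and is not reproduced in the transcript; a minimal Python sketch of the same computation, simulating normal data and applying the closed-form MLEs, might look like this (the sample size and the true parameters 68 and 3 are made-up values):

```python
import random
import math

random.seed(0)
# Simulate 10,000 draws from N(mean=68, sd=3)
data = [random.gauss(68.0, 3.0) for _ in range(10000)]

n = len(data)
mu_hat = sum(data) / n                               # MLE of the mean: sample average
var_hat = sum((x - mu_hat) ** 2 for x in data) / n   # MLE of the variance (divides by n, not n-1)

print(mu_hat, var_hat)  # should land near 68 and 9
```

Note that the MLE of the variance divides by n, unlike the unbiased sample variance, which divides by n − 1.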

  13. Mixture models Question: Are professors taller than their students? Mean professor height is 68.7; mean student height is 67.5.

  14. Mixture models Is this difference purely due to the fact that professors skew male? Mean professor height is 68.7; mean student height is 67.5.

  15. Mixture models • Is this difference purely due to the fact that professors skew male? • Ideal: we measure gender to account for its effect • Otherwise: gender is a hidden variable that we have to model

  16. Mixture models • Mixture model: the data is made up of subpopulations, with a different distribution describing each sub-population. • n observations, k subpopulations

  17. Mixture models • Mixture model: the data is made up of subpopulations, with a different distribution describing each sub-population. • n observations, k subpopulations • x₁, …, xₙ are the observed data values

  18. Mixture models • Mixture model: the data is made up of subpopulations, with a different distribution describing each sub-population. • n observations, k subpopulations • x₁, …, xₙ are the observed data values • z₁, …, zₙ are the particular populations that each data point belongs to. (This is unknown) • π is an unknown parameter we need to estimate

  19. Mixture models • Mixture model: the data is made up of subpopulations, with a different distribution describing each sub-population. • n observations, k subpopulations • x₁, …, xₙ are the observed data values • z₁, …, zₙ are the particular populations that each data point belongs to. (This is unknown) • f₁, …, fₖ are probability density functions that define the likelihood of a given data value for each subpopulation • 𝜃₁, 𝜃₂, …, 𝜃ₖ are unknown parameters that we want to estimate.
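Putting these pieces together: writing π_j for the probability that a data point belongs to subpopulation j (notation consistent with the mixing fractions used for the isoform example later in the deck), the likelihood of the observed data is:

```latex
p(x_i \mid \theta) = \sum_{j=1}^{k} \pi_j\, f_j(x_i \mid \theta_j),
\qquad
L(\theta) = \prod_{i=1}^{n} \sum_{j=1}^{k} \pi_j\, f_j(x_i \mid \theta_j)
```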

  20. Mixture models • Professor/student height example. Let’s consider professor height first. • x₁, …, xₙ are the heights of each professor

  21. Mixture models • Professor/student height example. Let’s consider professor height first. • x₁, …, xₙ are the heights of each professor • z₁, …, zₙ are the genders (male/female) of each professor • π is the fraction of professors who are male

  22. Mixture models • Professor/student height example. Let’s consider professor height first. • x₁, …, xₙ are the heights of each professor • z₁, …, zₙ are the genders (male/female) of each professor • π is the fraction of professors who are male • f₁ is the distribution of heights for female professors • f₂ is the distribution of heights for male professors

  23. Mixture models • Professor/student height example. • f₁ is the distribution of heights for female professors • f₂ is the distribution of heights for male professors • If we assume that heights are distributed normally, then: f₁(x) = N(x; 𝜇_F, 𝜎_F²) and f₂(x) = N(x; 𝜇_M, 𝜎_M²)

  24. Expectation maximization How can we estimate parameters for the subpopulation distributions if we don’t know which subpopulation a value belongs to? Use the following steps: (1) Start with a guess for the subpopulation distribution parameters. (2) Calculate the probability that each point is in each subpopulation. (3) Estimate the subpopulation parameters using an average weighted by the probabilities calculated in step (2). (4) Repeat steps (2) and (3) until convergence.

  25. Expectation maximization • The theory • Each EM iteration is guaranteed to yield parameters that do not decrease the likelihood of the data. • EM might converge to a local maximum rather than the global one.

  26. Expectation maximization Professor/student height example 2. Calculate the probability that each point is in each subpopulation. Use Bayes’ rule: P(zᵢ = male | xᵢ) = π f₂(xᵢ) / (π f₂(xᵢ) + (1 − π) f₁(xᵢ))

  27. Expectation maximization Professor/student height example 2. Calculate the probability that each point is in each subpopulation: wᵢ = P(zᵢ = male | xᵢ) = π N(xᵢ; 𝜇_M, 𝜎_M²) / (π N(xᵢ; 𝜇_M, 𝜎_M²) + (1 − π) N(xᵢ; 𝜇_F, 𝜎_F²))

  28. Expectation maximization Professor/student height example 3. Estimate the subpopulation parameters using the average weighted by the probabilities calculated in step (2). Regular MLE: 𝜇̂ = (1/n) Σᵢ xᵢ. Weighted MLE: 𝜇̂_M = Σᵢ wᵢ xᵢ / Σᵢ wᵢ, where wᵢ = P(zᵢ = male | xᵢ) is the probability that professor i is male.

  29. Expectation maximization Professor/student height example 3. Estimate the subpopulation parameters using the average weighted by the probabilities calculated in step (2). Update: π ← (1/n) Σᵢ wᵢ; 𝜇_M ← Σᵢ wᵢ xᵢ / Σᵢ wᵢ; 𝜎_M² ← Σᵢ wᵢ (xᵢ − 𝜇_M)² / Σᵢ wᵢ (and the female parameters analogously, with weights 1 − wᵢ).

  30. Expectation maximization Example in R
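The slides run this example in R; the loop below is a compact Python sketch of the same two-component EM, following steps (1)–(4) above. All numbers (sample size, mixing fraction 0.6, means 70 and 65, standard deviation 3) are made up for illustration:

```python
import random
import math

def norm_pdf(x, mu, var):
    """Normal density with mean mu and variance var."""
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

# Simulated heights: 60% "male" ~ N(70, 3^2), 40% "female" ~ N(65, 3^2).
# In the real problem the gender labels are hidden.
random.seed(1)
data = [random.gauss(70, 3) if random.random() < 0.6 else random.gauss(65, 3)
        for _ in range(4000)]

# Step (1): initial guesses for pi (fraction male) and the two normals
pi, mu_m, mu_f, var_m, var_f = 0.5, 72.0, 63.0, 4.0, 4.0

for _ in range(100):
    # Step (2), E-step: Bayes' rule gives P(male | height) for each point
    w = []
    for x in data:
        pm = pi * norm_pdf(x, mu_m, var_m)
        pf = (1 - pi) * norm_pdf(x, mu_f, var_f)
        w.append(pm / (pm + pf))
    # Step (3), M-step: weighted MLEs using the posteriors as weights
    sw = sum(w)
    pi = sw / len(data)
    mu_m = sum(wi * x for wi, x in zip(w, data)) / sw
    mu_f = sum((1 - wi) * x for wi, x in zip(w, data)) / (len(data) - sw)
    var_m = sum(wi * (x - mu_m) ** 2 for wi, x in zip(w, data)) / sw
    var_f = sum((1 - wi) * (x - mu_f) ** 2 for wi, x in zip(w, data)) / (len(data) - sw)

print(round(pi, 2), round(mu_m, 1), round(mu_f, 1))  # estimates near 0.6, 70, 65
```

A fixed iteration count stands in for a proper convergence check (e.g. stopping when the log-likelihood change falls below a tolerance).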

  31. The exponential family Step (3) requires that we can estimate the subpopulation parameters using some kind of mean. What other distributions have MLEs that meet this requirement? The exponential family: f(x | 𝜃) = h(x) exp(η(𝜃) · T(x) − A(𝜃))

  32. The exponential family Normal distribution: f(x | 𝜇, 𝜎²) = (1/√(2π𝜎²)) exp(−(x − 𝜇)²/(2𝜎²))

  33. The exponential family Math on board
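The board algebra rearranges the normal density into the exponential-family form, identifying each component:

```latex
f(x \mid \mu, \sigma^2)
= \frac{1}{\sqrt{2\pi\sigma^2}} \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)
= \underbrace{\frac{1}{\sqrt{2\pi}}}_{h(x)}
  \exp\!\Big(
    \tfrac{\mu}{\sigma^2}\, x \;-\; \tfrac{1}{2\sigma^2}\, x^2
    \;-\; \underbrace{\big(\tfrac{\mu^2}{2\sigma^2} + \log\sigma\big)}_{A(\theta)}
  \Big)
```

so the sufficient statistics are T(x) = (x, x²) and the natural parameters are η(𝜃) = (𝜇/𝜎², −1/(2𝜎²)). The weighted-mean M-step works precisely because the MLE depends on the data only through the average of T(x).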

  34. The exponential family Other distributions in the exponential family: • Exponential • Gamma • Chi-squared • Beta • Dirichlet • Bernoulli • Categorical • Poisson • Geometric • And more…

  35. Application of EM: Isoform expression Which isoforms of gene X are expressed, and at what level? Four candidate isoforms: Exons 1–2–3–4; Exons 1–3; Exons 1–2–4; Exons 1–4.

  36. Application of EM: Isoform expression Data: Illumina RNA-seq. GeneX.1: Exons 1–2–3–4; GeneX.2: Exons 1–3; GeneX.3: Exons 1–2–4; GeneX.4: Exons 1–4.

  37. Mixture models • Describe this problem as a mixture model • n observations (reads), k subpopulations (isoforms) • x₁, …, xₙ are genomic alignments that tell us which isoform a read might belong to. • z₁, …, zₙ are the particular isoforms that each read is derived from • π₁, π₂, …, πₖ are the fractions of transcripts from each isoform.

  38. Mixture models Step (2): calculate the posterior probability of zᵢ

  40. Mixture models Step (2): calculate the posterior probability of zᵢ: P(zᵢ = j | xᵢ) = (πⱼ yᵢⱼ / ℓⱼ) / Σⱼ′ (πⱼ′ yᵢⱼ′ / ℓⱼ′), where πⱼ is the fraction of reads derived from isoform j, ℓⱼ is the length of isoform j (longer isoforms have more genomic loci where a read can begin), and yᵢⱼ = 1 if read i is consistent with isoform j, 0 if it is not.

  41. Expectation maximization (3) Estimate the parameters using an average weighted by the probabilities calculated in step (2). Regular MLE (if each read’s isoform zᵢ were observed): πⱼ = (1/n) Σᵢ 1(zᵢ = j). Weighted MLE: πⱼ = (1/n) Σᵢ P(zᵢ = j | xᵢ).

  42. Expectation maximization • (3) Estimate the parameters using an average weighted by the probabilities calculated in step (2): replace the regular MLE with the weighted MLE. • EM for isoform expression: start with an initial guess of expression, then repeat steps (2) and (3) until convergence.
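The full isoform EM loop can be sketched in Python (the deck's examples are in R; the isoform lengths and the handful of read alignments below are toy values invented for illustration):

```python
# Toy EM for isoform expression. y[i][j] = 1 if read i is compatible with
# isoform j; compatibility rows and lengths are made up for illustration.
# Isoforms: GeneX.1 (exons 1-2-3-4), GeneX.2 (1-3), GeneX.3 (1-2-4), GeneX.4 (1-4)
lengths = [400, 200, 300, 200]  # hypothetical isoform lengths

y = [
    [1, 1, 1, 1],  # read within exon 1: consistent with every isoform
    [1, 0, 1, 0],  # read within exon 2: GeneX.1 or GeneX.3
    [1, 1, 0, 0],  # read within exon 3: GeneX.1 or GeneX.2
    [0, 1, 0, 0],  # exon 1 - exon 3 junction read: GeneX.2 only
    [1, 0, 1, 1],  # read within exon 4: GeneX.1, GeneX.3 or GeneX.4
    [0, 0, 0, 1],  # exon 1 - exon 4 junction read: GeneX.4 only
]

k = 4
pi = [1.0 / k] * k  # initial guess of expression: equal fractions

for _ in range(100):
    # Step (2), E-step: posterior P(z_i = j | alignment) ~ pi_j * y_ij / length_j
    post = []
    for row in y:
        scores = [p * c / l for p, c, l in zip(pi, row, lengths)]
        total = sum(scores)
        post.append([s / total for s in scores])
    # Step (3), M-step: weighted MLE, the average posterior weight per isoform
    pi = [sum(p[j] for p in post) / len(post) for j in range(k)]

print([round(p, 2) for p in pi])
```

The two junction reads pin GeneX.2 and GeneX.4 to nonzero expression, while the ambiguous reads are shared out in proportion to the current estimates; that feedback between assignment and estimation is exactly steps (2) and (3) alternating.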
