
Conditional Random Fields

Conditional Random Fields. Presented by Shira Kritchman & Lena Gorelick May 13, 2007. Advanced Topics in Computer and Human Vision Spring 2007. Outline. Introduction Statistical Modeling Generative vs. Discriminative Models Naïve Bayes vs. Logistic Regression Sequence Modeling: HMM


Presentation Transcript


  1. Conditional Random Fields Presented by Shira Kritchman & Lena Gorelick May 13, 2007 Advanced Topics in Computer and Human Vision Spring 2007

  2. Outline • Introduction • Statistical Modeling • Generative vs. Discriminative Models • Naïve Bayes vs. Logistic Regression • Sequence Modeling: HMM • CRF • Sequence Modeling: Linear Chain CRF • Learning (Parameter Estimation) • Improved Iterative Scaling (IIS) • S algorithm • General CRF • Applications

  3. Problem Formulation – Label Assignment • Classification • Segmentation Horse Horse Non-horse

  4. Problem Formulation – Label Assignment Labels Observation Labels Horse Horse Non-horse

  5. Problem Formulation – Label Assignment • Classification Biology Math

  6. Problem Formulation – Label Assignment • Parsing: TAKE/verb THE/article GREEN/adjective APPLE/noun FROM/preposition THE/article BOX/noun THEN/conjunction HIT/verb IT/pronoun WITH/preposition MY/adjective SWORD/noun

  7. Problem Formulation • Find the Mechanism = LEARNING • Prior Knowledge: Horses are rarely blue. Horses have 4 legs. … Neighboring pixels have similar labels. • Data

  8. Outline • Introduction • Statistical Modeling • Generative vs. Discriminative Models • Naïve Bayes vs. Logistic Regression • Sequence Modeling: HMM • CRF • Sequence Modeling: Linear Chain CRF • Learning (Parameter Estimation) • Improved Iterative Scaling (IIS) • S algorithm • General CRF • Applications

  9. Generative Modeling • Define a joint probability distribution $p(x, y)$ over observation and label pairs • Label Assignment: $\hat{y} = \arg\max_y p(y \mid x) = \arg\max_y \frac{p(x, y)}{p(x)}$ (Bayes)
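The label-assignment step of a generative model can be sketched in a few lines. This is a minimal illustration, assuming a tiny discrete joint table; the feature names and probabilities are made up and do not come from the slides.

```python
# Toy joint distribution p(x, y); all numbers are invented for illustration.
joint = {
    ("four_legs", "horse"): 0.40,
    ("four_legs", "non-horse"): 0.20,
    ("two_legs", "horse"): 0.05,
    ("two_legs", "non-horse"): 0.35,
}

def posterior(joint, x):
    """Return p(y | x) obtained from the joint via Bayes' rule."""
    # p(x) is the joint marginalised over all labels.
    p_x = sum(p for (xi, _), p in joint.items() if xi == x)
    return {y: p / p_x for (xi, y), p in joint.items() if xi == x}

def assign_label(joint, x):
    """Pick the label with the highest posterior probability."""
    post = posterior(joint, x)
    return max(post, key=post.get)
```

Calling `assign_label(joint, "four_legs")` selects the label whose joint probability with the observation is largest, since dividing by $p(x)$ does not change the argmax.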

  10. What does it generate? • Assumption: the “outputs” $y$ probabilistically generate the “inputs” $x$ • So we use $p(x, y) = p(y)\, p(x \mid y)$

  11. Generative Modeling • What are the candidate distributions for $p(x, y)$? • Too simple: underfitting • Too sparse: overfitting

  12. Generative Modeling – Model Family • Define a model family • Prior Knowledge: Horses are rarely blue. Horses have 4 legs. … Neighboring pixels have similar labels. • Data

  13. Generative Modeling – Likelihood • We look for the parameters $\theta$ that maximize the likelihood of the training data

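  14. Generative Model – i.i.d. Framework • Data: $\{(x^{(i)}, y^{(i)})\}_{i=1}^N$, drawn i.i.d. • Likelihood: $L(\theta) = \prod_{i=1}^N p_\theta(x^{(i)}, y^{(i)})$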

  15. Outline • Introduction • Statistical Modeling • Generative vs. Discriminative Models • Naïve Bayes vs. Logistic Regression • Sequence Modeling: HMM • CRF • Sequence Modeling: Linear Chain CRF • Learning (Parameter Estimation) • Improved Iterative Scaling (IIS) • S algorithm • General CRF • Applications

  16. Discriminative Modeling • Directly defines a conditional probability distribution $p(y \mid x)$ over the labels given the observation • Label Assignment: $\hat{y} = \arg\max_y p(y \mid x)$

  17. Discriminative Modeling • Does not include a model of $p(x)$ • Which is not needed for label assignment anyway!

  18. Discriminative Modeling – Model Family • Define a model family ?

  19. Discriminative Modeling – Likelihood • We look for the parameters $\theta$ that maximize the conditional likelihood of the data

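  20. Discriminative Model – i.i.d. Framework • Data: $\{(x^{(i)}, y^{(i)})\}_{i=1}^N$, drawn i.i.d. • Conditional likelihood: $L(\theta) = \prod_{i=1}^N p_\theta(y^{(i)} \mid x^{(i)})$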

  21. Generative vs. Discriminative

  22. Generative vs. Discriminative

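  23. What is Conditional Likelihood? • $\log p_\theta(x, y) = \log p_\theta(y \mid x) + \log p_\theta(x)$ • The first term is the conditional likelihood! • The second term is not required for the labeling task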

  24. Generative Model – Example • Naïve Bayes: $p(x, y) = p(y) \prod_i p(x_i \mid y)$
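A naïve Bayes classifier factorizes the joint as a class prior times independent per-feature likelihoods. The sketch below assumes binary features with Bernoulli likelihoods; the feature names and parameter values are hypothetical, chosen only to echo the horse example.

```python
import math

# Hypothetical parameters: class priors and per-feature Bernoulli likelihoods
# p(feature = 1 | label). None of these numbers come from the slides.
prior = {"horse": 0.5, "non-horse": 0.5}
likelihood = {
    "horse": {"brown": 0.7, "four_legs": 0.9},
    "non-horse": {"brown": 0.3, "four_legs": 0.4},
}

def naive_bayes_posterior(x):
    """x maps feature name -> 0/1. Returns p(y | x) under naive Bayes."""
    log_joint = {}
    for y in prior:
        lp = math.log(prior[y])
        for name, value in x.items():
            p = likelihood[y][name]
            # Features are assumed independent given the label.
            lp += math.log(p if value else 1.0 - p)
        log_joint[y] = lp
    # Normalise over labels to turn the joint into a posterior.
    z = sum(math.exp(v) for v in log_joint.values())
    return {y: math.exp(v) / z for y, v in log_joint.items()}
```

Because each feature contributes its own factor, two strongly correlated features are counted twice, which is the independence assumption the later slides contrast with discriminative models.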

  25. Discriminative Model – Example • Logistic Regression: $p(y \mid x) = \frac{1}{Z(x)} \exp\left(\sum_k \theta_k f_k(x, y)\right)$
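Logistic regression models $p(y \mid x)$ directly and is fit by maximizing the conditional likelihood. The following is a minimal sketch for the binary case, assuming plain batch gradient ascent; the learning rate, epoch count, and function names are illustrative choices, not taken from the slides.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(w, b, x):
    """p(y = 1 | x) under binary logistic regression."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def fit(data, lr=0.5, epochs=500):
    """Batch gradient ascent on the conditional log-likelihood.

    data is a list of (feature_vector, label) pairs with label in {0, 1}.
    """
    n = len(data[0][0])
    w, b = [0.0] * n, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n, 0.0
        for x, y in data:
            # Derivative of log p(y | x) with respect to the logit.
            err = y - predict_proba(w, b, x)
            for j in range(n):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wj + lr * g / len(data) for wj, g in zip(w, grad_w)]
        b += lr * grad_b / len(data)
    return w, b
```

No distribution over the features themselves is ever estimated, so the features may overlap or depend on each other freely.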

  26. Generative vs. Discriminative • Naïve Bayes: the features $x_i$ are independent given $y$ • Logistic regression: can have complex dependencies among the $x_i$ • Discriminative Model is better suited to contain rich overlapping features

  27. Generative vs. Discriminative • Model the relation between $x$ = (age, weight, blood pressure) and $y$ = will suffer from a heart attack soon (binary) • Natural to model $p(y \mid x)$ • Unnatural to model $p(x \mid y)$

  28. Generative vs. Discriminative • Generative: models $p(x, y)$; strict independence assumptions on the observations • Discriminative: models $p(y \mid x)$; allows arbitrary, inter-dependent features on the observation; does not spend effort on modeling $p(x)$

  29. Outline • Introduction • Statistical Modeling • Generative vs. Discriminative Models • Naïve Bayes vs. Logistic Regression • Sequence Modeling: HMM • CRF • Sequence Modeling: Linear Chain CRF • Learning (Parameter Estimation) • Improved Iterative Scaling (IIS) • S algorithm • General CRF • Applications

  30. Classifiers and Graphical Models • Naïve Bayes and logistic regression predict a single variable • What about predicting many variables that are interdependent? • Use a graphical model

  31. Sequence Models – HMM • Simple graphical models Hidden states Observable variables

  32. Sequence Models – HMM • Parsing: TAKE/verb THE/article GREEN/adjective APPLE/noun FROM/preposition THE/article BOX/noun THEN/conjunction HIT/verb IT/pronoun WITH/preposition MY/adjective SWORD/noun

  33. HMM – Exponential Form • Rewrite the HMM with indicator features, e.g. a transition feature for (noun, verb) and an emission feature for (noun, apple)
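One way to see the exponential form: every HMM probability becomes the weight of an indicator feature, with the weight equal to the log of that probability. The sketch below checks that both forms give the same joint probability. The two-state HMM and all of its numbers are hypothetical.

```python
import math

# Hypothetical two-state HMM; the probabilities are invented for illustration.
states = ["noun", "verb"]
start = {"noun": 0.6, "verb": 0.4}
trans = {"noun": {"noun": 0.3, "verb": 0.7}, "verb": {"noun": 0.8, "verb": 0.2}}
emit = {"noun": {"apple": 0.9, "take": 0.1}, "verb": {"apple": 0.2, "take": 0.8}}

def hmm_joint(y, x):
    """Standard HMM joint p(x, y) as a product of probabilities."""
    p = start[y[0]] * emit[y[0]][x[0]]
    for t in range(1, len(x)):
        p *= trans[y[t - 1]][y[t]] * emit[y[t]][x[t]]
    return p

# One weight per indicator feature: the log of the matching HMM parameter.
weights = {}
for i in states:
    weights[("start", i)] = math.log(start[i])
    for j in states:
        weights[("trans", i, j)] = math.log(trans[i][j])
    for word in emit[i]:
        weights[("emit", i, word)] = math.log(emit[i][word])

def active_features(y, x):
    """List the indicator features that fire on the pair (x, y)."""
    feats = [("start", y[0])]
    for t in range(len(x)):
        if t > 0:
            feats.append(("trans", y[t - 1], y[t]))
        feats.append(("emit", y[t], x[t]))
    return feats

def exp_form_joint(y, x):
    """Same joint written as exp(sum of weights of active features)."""
    return math.exp(sum(weights[f] for f in active_features(y, x)))
```

Once the joint is written this way, nothing forces the weights to be log-probabilities, which is the step that leads from the HMM to the CRF.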

  34. Outline • Introduction • Statistical Modeling • Generative vs. Discriminative Models • Naïve Bayes vs. Logistic Regression • Sequence Modeling: HMM • CRF • Sequence Modeling: Linear Chain CRF • Learning (Parameter Estimation) • Improved Iterative Scaling (IIS) • S algorithm • General CRF • Applications

  35. From HMM to CRF • The underlying conditional distribution: $p(y \mid x) = \frac{p(x, y)}{\sum_{y'} p(x, y')}$, where the denominator $Z(x)$ is the partition function per observation

  36. From HMM to CRF • We can now use richer features of the observation for the same price!

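  37. Linear-Chain CRF – Definition • $X, Y$ random vectors • $\theta = \{\theta_k\} \in \mathbb{R}^K$ a parameter vector • $\{f_k(y, y', x_t)\}_{k=1}^K$ real-valued feature functions • $p(y \mid x) = \frac{1}{Z(x)} \exp\left(\sum_t \sum_k \theta_k f_k(y_t, y_{t-1}, x_t)\right)$ • The conditional distribution of an HMM is a special case of a linear-chain CRF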
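A linear-chain CRF can be sketched directly from its ingredients: a parameter vector, feature functions over adjacent labels and the observation, and a partition function computed per observation. The version below is a brute-force illustration, assuming a two-label tag set and three hypothetical feature templates; it enumerates all label sequences, so it is only usable on very short inputs.

```python
import itertools
import math

LABELS = ["noun", "verb"]  # assumed label set for illustration

def features(prev_y, y, x, t):
    """Hypothetical feature functions f_k(y_t, y_{t-1}, x_t) as name -> value."""
    word = x[t]
    return {
        ("trans", prev_y, y): 1.0,
        ("word", word, y): 1.0,
        # Overlaps with the word feature; a CRF allows this without penalty.
        ("suffix_s", y): 1.0 if word.endswith("s") else 0.0,
    }

def score(theta, y, x):
    """Unnormalised log-score: sum over positions of theta . f."""
    total = 0.0
    for t in range(len(x)):
        prev_y = y[t - 1] if t > 0 else "<start>"
        for name, value in features(prev_y, y[t], x, t).items():
            total += theta.get(name, 0.0) * value
    return total

def partition(theta, x):
    """Z(x): sum of exp(score) over every possible label sequence."""
    return sum(
        math.exp(score(theta, y, x))
        for y in itertools.product(LABELS, repeat=len(x))
    )

def conditional(theta, y, x):
    """p(y | x) for one label sequence."""
    return math.exp(score(theta, tuple(y), x)) / partition(theta, x)
```

Note that `partition` depends on `x`, so the normalization is redone for each observation sequence, and features may inspect the observation in any way without changing the cost of inference over labels.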

  38. Outline • Introduction • Statistical Modeling • Generative vs. Discriminative Models • Naïve Bayes vs. Logistic Regression • Sequence Modeling: HMM • CRF • Sequence Modeling: Linear Chain CRF • Learning (Parameter Estimation) • Improved Iterative Scaling (IIS) • S algorithm • General CRF • Applications

  39. Parameter Estimation – Maximum Likelihood • Maximize the conditional log likelihood: $\ell(\theta) = \sum_i \log p_\theta(y^{(i)} \mid x^{(i)})$ • Concave, so any local maximum is a global maximum

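  40. Parameter Estimation – Maximum Likelihood • Take partial derivatives w.r.t. $\theta_k$: $\frac{\partial \ell}{\partial \theta_k} = \tilde{E}[f_k] - E_\theta[f_k]$ (empirical mean minus model expectation) • There is no closed form solution, since the $\theta_k$ are coupled through $Z(x)$. Any alternatives? (Detour first.)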
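The gradient of the conditional log likelihood is the empirical feature count minus the count expected under the model. The sketch below computes both terms by brute-force enumeration on a tiny model with binary labels; the feature templates (`"obs"`, `"trans"`) are assumptions made for the example.

```python
import itertools
import math

LABELS = (0, 1)  # assumed binary label set

def feature_counts(y, x):
    """Total count of each hypothetical feature over the whole sequence."""
    counts = {}
    for t in range(len(x)):
        keys = [("obs", x[t], y[t])]
        if t > 0:
            keys.append(("trans", y[t - 1], y[t]))
        for k in keys:
            counts[k] = counts.get(k, 0.0) + 1.0
    return counts

def log_score(theta, y, x):
    return sum(theta.get(k, 0.0) * v for k, v in feature_counts(y, x).items())

def distribution(theta, x):
    """p(y | x) for every label sequence, by brute-force enumeration."""
    ys = list(itertools.product(LABELS, repeat=len(x)))
    scores = [math.exp(log_score(theta, y, x)) for y in ys]
    z = sum(scores)
    return {y: s / z for y, s in zip(ys, scores)}

def log_likelihood(theta, data):
    return sum(math.log(distribution(theta, x)[tuple(y)]) for x, y in data)

def gradient(theta, data):
    """Empirical feature counts minus model-expected feature counts."""
    grad = {}
    for x, y in data:
        for k, v in feature_counts(tuple(y), x).items():  # empirical term
            grad[k] = grad.get(k, 0.0) + v
        for y_alt, p in distribution(theta, x).items():   # model expectation
            for k, v in feature_counts(y_alt, x).items():
                grad[k] = grad.get(k, 0.0) - p * v
    return grad
```

Every component of the gradient depends on all parameters through the normalizer, which is why setting it to zero yields no closed-form solution and an iterative method is needed.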

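  41. Maximum Likelihood – Maximum Entropy • Maximum likelihood: we assumed an exponential-form model, we maximized conditional likelihood, and we got matching expectations (model expectation = empirical mean) • Maximum entropy: we assume matching expectations, we maximize conditional entropy, and we get the exponential form • We get the same distribution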

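  42. Parameter Estimation - Finding $\theta$ • Given the current parameter estimate $\theta$ • Find a new set of parameters $\theta + \delta$ s.t. the gain in likelihood is non-negative: $\ell(\theta + \delta) - \ell(\theta) \ge 0$ • Repeat until convergence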

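  43. Parameter Estimation - Finding $\theta$ • Bound the gain with an auxiliary function: $A(\theta, \delta) \le \ell(\theta + \delta) - \ell(\theta)$ • Maximize $A(\theta, \delta)$ w.r.t. $\delta$ • Update $\theta \leftarrow \theta + \delta$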

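  44. Parameter Estimation - Finding $\theta$ • Improved Iterative Scaling Algorithm (IIS): • Start with some (arbitrary) value for each $\theta_k$ • Repeat until convergence: • Solve $\frac{\partial A}{\partial \delta_k} = 0$ for $\delta_k$ • Set $\theta_k \leftarrow \theta_k + \delta_k$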

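  45. Parameter Estimation: IIS – S algorithm • Differentiating w.r.t. $\delta_k$ gives an equation that involves the total feature count $T(x, y) = \sum_t \sum_k f_k(y_t, y_{t-1}, x_t)$ • Note that if $T(x, y)$ is constant, the equation can be solved in closed form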

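  46. Parameter Estimation: IIS – S algorithm • Define a new slack feature $s(x, y) = S - \sum_t \sum_k f_k(y_t, y_{t-1}, x_t)$, so that the total feature count is the constant $S$ • And we have an additional constraint for the slack feature: $s(x, y) \ge 0$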

  47. Parameter Estimation: IIS – S algorithm • For each training sequence $x^{(i)}$ we need to compute marginals at every iteration! • The features are local in $y$, so pairwise marginals suffice
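The S algorithm's update can be sketched as follows, assuming the closed-form step $\delta_k = \frac{1}{S} \log \frac{\tilde{E}[f_k]}{E_\theta[f_k]}$ that results when a slack feature makes the total feature count equal to a constant $S$. Several choices here are assumptions for the illustration: the features are non-negative indicators, only features seen in training get weights, $S$ is set to twice the longest sequence length so that the slack stays positive, and model expectations are computed by enumeration rather than by marginals.

```python
import itertools
import math

LABELS = (0, 1)  # assumed binary label set

def raw_counts(y, x):
    """Counts of the hypothetical indicator features on (x, y)."""
    counts = {}
    for t in range(len(x)):
        keys = [("obs", x[t], y[t])]
        if t > 0:
            keys.append(("trans", y[t - 1], y[t]))
        for k in keys:
            counts[k] = counts.get(k, 0.0) + 1.0
    return counts

def counts_with_slack(y, x, feature_set, S):
    """Keep modelled features and add the slack so the total is always S."""
    counts = {k: v for k, v in raw_counts(y, x).items() if k in feature_set}
    counts["slack"] = S - sum(counts.values())
    return counts

def distribution(theta, x, feature_set, S):
    ys = list(itertools.product(LABELS, repeat=len(x)))
    scores = []
    for y in ys:
        c = counts_with_slack(y, x, feature_set, S)
        scores.append(math.exp(sum(theta[k] * v for k, v in c.items())))
    z = sum(scores)
    return {y: s / z for y, s in zip(ys, scores)}

def log_likelihood(theta, data, feature_set, S):
    return sum(
        math.log(distribution(theta, x, feature_set, S)[tuple(y)])
        for x, y in data
    )

def algorithm_s(data, iterations=50):
    """Iterative scaling with a slack feature; returns weights and LL history."""
    feature_set = set()
    for x, y in data:
        feature_set.update(raw_counts(tuple(y), x))
    S = 2 * max(len(x) for x, _ in data)  # assumed bound; keeps slack >= 1
    theta = {k: 0.0 for k in feature_set}
    theta["slack"] = 0.0
    history = []
    for _ in range(iterations):
        empirical = dict.fromkeys(theta, 0.0)
        expected = dict.fromkeys(theta, 0.0)
        for x, y in data:
            for k, v in counts_with_slack(tuple(y), x, feature_set, S).items():
                empirical[k] += v
            for y_alt, p in distribution(theta, x, feature_set, S).items():
                for k, v in counts_with_slack(y_alt, x, feature_set, S).items():
                    expected[k] += p * v
        for k in theta:
            # Closed-form step; a large S makes every step small.
            theta[k] += math.log(empirical[k] / expected[k]) / S
        history.append(log_likelihood(theta, data, feature_set, S))
    return theta, history
```

The division by `S` in the update shows why long sequences slow the method down: the longer the sequences, the larger the constant and the smaller each step.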

  48. Computing Marginals with BP • Computing marginals in a linear-chain CRF: efficient and exact with BP • IIS algorithm: BP runs once per training sequence in each optimization step
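On a chain, belief propagation reduces to the forward-backward recursion. The sketch below assumes the potentials have already been computed from the observation and the weights (`node_pot` and `trans_pot` are hypothetical names), and it includes a brute-force version only as a correctness check. It works with raw products, so a practical version would rescale or use log-space to avoid underflow on long sequences.

```python
import itertools

def forward_backward(node_pot, trans_pot):
    """Exact marginals on a chain.

    node_pot[t][j]: unnormalised potential of label j at position t.
    trans_pot[i][j]: potential of moving from label i to label j.
    Returns (Z, marginals) with marginals[t][j] = p(y_t = j | x).
    """
    T, L = len(node_pot), len(node_pot[0])
    alpha = [[0.0] * L for _ in range(T)]
    beta = [[1.0] * L for _ in range(T)]
    alpha[0] = list(node_pot[0])
    for t in range(1, T):  # forward messages
        for j in range(L):
            alpha[t][j] = node_pot[t][j] * sum(
                alpha[t - 1][i] * trans_pot[i][j] for i in range(L)
            )
    for t in range(T - 2, -1, -1):  # backward messages
        for i in range(L):
            beta[t][i] = sum(
                trans_pot[i][j] * node_pot[t + 1][j] * beta[t + 1][j]
                for j in range(L)
            )
    Z = sum(alpha[T - 1])
    marginals = [[alpha[t][j] * beta[t][j] / Z for j in range(L)] for t in range(T)]
    return Z, marginals

def brute_force_marginals(node_pot, trans_pot):
    """Same quantities by enumerating every label sequence (check only)."""
    T, L = len(node_pot), len(node_pot[0])
    marg = [[0.0] * L for _ in range(T)]
    Z = 0.0
    for y in itertools.product(range(L), repeat=T):
        w = node_pot[0][y[0]]
        for t in range(1, T):
            w *= trans_pot[y[t - 1]][y[t]] * node_pot[t][y[t]]
        Z += w
        for t in range(T):
            marg[t][y[t]] += w
    return Z, [[m / Z for m in row] for row in marg]
```

The recursion costs on the order of T·L² operations for a sequence of length T with L labels, against Lᵀ for enumeration, which is what makes exact marginals affordable at every training iteration.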

  49. Parameter Estimation: IIS(S) – Summary • Closed form solution • Converges to global maximum • $S$ is proportional to the length of the longest training sequence • Small optimization steps for large $S$ • T algorithm

  50. Outline • Introduction • Statistical Modeling • Generative vs. Discriminative Models • Naïve Bayes vs. Logistic Regression • Sequence Modeling: HMM • CRF • Sequence Modeling: Linear Chain CRF • Learning (Parameter Estimation) • Improved Iterative Scaling (IIS) • S algorithm • General CRF • Applications
