
Course Review (Part 1)






Presentation Transcript


  1. Course Review (Part 1) LING 572 Fei Xia 1/19/06

  2. Outline • Recap • Homework 1 • Project Part 1

  3. Recap

  4. Recap • FSA and HMM • DT, DL, TBL

  5. A learning algorithm • Modeling: • Representation • Decomposition • Parameters • Properties • Training: • Simple counting, hill-climbing, greedy algorithm, … • Pruning and filtering • Smoothing issues

  6. A learning algorithm (cont) • Decoding: • Simply verify condition: DT, DL, TBL • Viterbi: FSA and HMM • Pruning during the search • Relation with other algorithms: • Ex: DNF, CNF, DT, DL and TBL • Ex: WFA and HMM, PFA and HMM

  7. NLP task • Choose a ML method: e.g., DT, TBL • Modeling: • Ex: TBL: What kinds of features? • Ex: HMM: What are the states? What are the output symbols? • Training: e.g., DT • Select a particular algorithm: ID3, C4.5 • Choose pruning/filtering/smoothing strategies, thresholds, quality measures, etc. • Decoding: • Pruning strategies

  8. Homework 1

  9. Hw1 • Problem 3 & 4: State-emission and arc-emission HMMs. • Problem 5: Viterbi algorithm • Problem 2: HMM • Problem 1: FSA

  10. Problem 3: State-emission HMM → Arc-emission HMM (a) (b) • Given a path X1, X2, ..., Xn+1 in HMM1 • The corresponding path in HMM2 is X1, X2, ..., Xn+1.
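
The conversion in Problem 3 keeps the state set unchanged. A minimal sketch (hypothetical dict-based HMM representation, assuming one common convention in which each arc inherits the emission distribution of its source state; the slides do not fix a convention):

```python
# Sketch: convert a state-emission HMM to an equivalent arc-emission HMM.
# Every arc (i, j) inherits the emission distribution of its source state i,
# so a path X1 ... Xn+1 produces the same output with the same probability.

def to_arc_emission(trans, emit):
    """trans[i][j] = P(j | i); emit[i][o] = P(o | i) at state i.
    Returns arc_emit[(i, j)][o] = P(o | i -> j)."""
    arc_emit = {}
    for i in trans:
        for j in trans[i]:
            arc_emit[(i, j)] = dict(emit[i])  # copy source state's distribution
    return arc_emit

# Tiny example: two states, output alphabet {a, b}.
trans = {"s1": {"s1": 0.3, "s2": 0.7}, "s2": {"s1": 0.6, "s2": 0.4}}
emit = {"s1": {"a": 0.9, "b": 0.1}, "s2": {"a": 0.2, "b": 0.8}}
arc_emit = to_arc_emission(trans, emit)
print(arc_emit[("s1", "s2")])  # same distribution as emit["s1"]
```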

  11. Problem 3 (cont) (c)

  12. Problem 4: Arc-emission HMM → State-emission HMM (a)

  13. Problem 4 (cont) (b) Given a path X1, X2, …, Xn+1 in HMM1, the path in HMM2 is X1_X1, X1_X2, …, Xn_Xn+1 (c)

  14. Problem 5: Viterbi algorithm with ε-emission

  15. Problem 5 (cont) Cost(i, j) is the maximum probability of a path from i to j that produces nothing (only ε). To calculate Cost(i, j), let …, where N is the number of states in the HMM.
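
One way to compute Cost(i, j) — a sketch under stated assumptions, not the slides' formula, which did not survive transcription — is max-product relaxation over the ε-arcs. Because every arc probability is at most 1, a best ε-path never gains by revisiting a state, so paths of length at most N−1 suffice:

```python
# Sketch: Cost(i, j) = max probability of a path from i to j that emits
# only epsilon, computed by Floyd-Warshall-style max-product relaxation.

def epsilon_costs(eps):
    """eps[i][j] = prob of a single epsilon arc i -> j (absent means 0).
    Returns cost[i][j] = max prob of any epsilon-only path from i to j."""
    states = list(eps)
    # Base case: the empty path gives Cost(i, i) = 1.
    cost = {i: {j: (1.0 if i == j else eps[i].get(j, 0.0)) for j in states}
            for i in states}
    for k in states:                    # relax through intermediate state k
        for i in states:
            for j in states:
                via_k = cost[i][k] * cost[k][j]
                if via_k > cost[i][j]:
                    cost[i][j] = via_k
    return cost

eps = {"a": {"b": 0.5}, "b": {"c": 0.4}, "c": {}}
cost = epsilon_costs(eps)
print(cost["a"]["c"])  # 0.2, via the path a -> b -> c
```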

  16. Problems 1 & 2: Important tricks Constants can be moved outside the sum signs:

  17. Tricks (cont) The order of sums can be changed:

  18. Tricks (cont) • The order of sum and product:
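
The three tricks can be checked numerically on small examples (all values below are illustrative, not from the homework):

```python
# Numeric sanity checks for the three summation tricks from Problems 1 and 2.
import math
from itertools import product

c = 3.0
f = [0.1, 0.4, 0.5]
g = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]   # g[i][j], 3 rows x 2 columns

# 1. Constants move outside a sum: sum_i c*f_i = c * sum_i f_i
assert math.isclose(sum(c * fi for fi in f), c * sum(f))

# 2. The order of (finite) sums can be swapped:
#    sum_i sum_j g_ij = sum_j sum_i g_ij
assert math.isclose(sum(sum(row) for row in g),
                    sum(sum(g[i][j] for i in range(3)) for j in range(2)))

# 3. Swapping sum and product: a product of sums expands into a sum,
#    over all ways of choosing one term per factor, of products.
lhs = math.prod(sum(row) for row in g)                    # (1+2)(3+4)(5+6)
rhs = sum(math.prod(choice) for choice in product(*g))
assert math.isclose(lhs, rhs)
print("all three identities hold")
```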

  19. Problem 2: HMM • Prove by induction: • When the length is 0: • When the length is n-1, we assume that
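
The inductive claim — that the probabilities of all length-n paths of an HMM sum to 1 — can be sanity-checked by brute force on a toy HMM (made-up parameters, not from the homework):

```python
# Brute-force check: summing P(path) over every state sequence of a fixed
# length gives 1, the quantity the induction in Problem 2 establishes.
from itertools import product

pi = {"x": 0.6, "y": 0.4}                                  # initial distribution
trans = {"x": {"x": 0.7, "y": 0.3}, "y": {"x": 0.5, "y": 0.5}}

def total_prob(n):
    """Sum of P(path) over all state sequences with n transitions."""
    total = 0.0
    for path in product("xy", repeat=n + 1):
        p = pi[path[0]]
        for a, b in zip(path, path[1:]):
            p *= trans[a][b]
        total += p
    return total

for n in range(4):
    print(n, round(total_prob(n), 10))  # 1.0 for every n, matching the base
                                        # case (length 0) and inductive step
```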

  20. Problem 2 (cont)

  21. Problem 1: FSA

  22. Problem 1 (cont) [Figure: an FSA with states q0, q1, q2, …, qN and f-labeled transitions]

  23. Project Part 1

  24. Carmel: a WFA package [Diagram: a WFA plus input/output symbols are fed to Carmel, which returns the best path]

  25. Bigram tagging • FST1: states are tags; arc ti → tj with probability P(tj | ti); initial states: {BOS}, final states: {EOS} • FST2: a single state q with arcs t/w carrying probability P(w | t)

  26. Trigram tagging • FST1: states are tag pairs; arc t0t1 → t1t2 with probability P(t2 | t1, t0); initial state: {BOS-BOS}, final state: {EOS-EOS} • FST2: a single state q with arcs t/w carrying probability P(w | t)

  27. Minor details • BOS and EOS: • BOS needs no special treatment • EOS: • Add two “EOS” symbols at the end of each sentence, or • Replace the input symbol “EOS” with ε (a.k.a. *e*).

  28. Results
