CS460/IT632 Natural Language Processing/Language Technology for the Web
Lecture 5 (17/01/06)
Prof. Pushpak Bhattacharyya, IIT Bombay
Classical Part of Speech (PoS) Tagging

Approach to Classical PoS Tagging
• Lexicon labeling
  • Look at the dictionary
  • Obtain all tags for the words in the sentence
  • Plug them in as labels for these words
• Disambiguation
  • Use rules to eliminate tags
  • Repeat the disambiguation process until
    • all the tags are disambiguated, or
    • no further change occurs.

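The two phases above can be sketched in a few lines of Python. This is only an illustration: the toy lexicon, the tag names, and the two elimination rules are made up for the example and are not taken from the lecture. Each rule returns the set of tags to keep for a word, or None when it does not apply.

```python
# Hypothetical toy lexicon: word -> all possible tags (illustrative only).
LEXICON = {
    "look": {"V", "N"},
    "at": {"PREP"},
    "that": {"DET", "PRON", "ADV", "COMP"},
    "man": {"N", "V"},
}

def label(words):
    """Lexicon labeling: attach every dictionary tag to each word."""
    return [set(LEXICON.get(w.lower(), {"UNK"})) for w in words]

def det_before_noun(words, tags, i):
    """Made-up rule: a possible DET followed by a possible noun is a DET."""
    if "DET" in tags[i] and i + 1 < len(words) and "N" in tags[i + 1]:
        return {"DET"}
    return None

def noun_after_det(words, tags, i):
    """Made-up rule: a possible noun right after an unambiguous DET is a noun."""
    if i > 0 and tags[i - 1] == {"DET"} and "N" in tags[i]:
        return {"N"}
    return None

def disambiguate(words, tags, rules):
    """Apply rules until every word has one tag or no further change occurs."""
    changed = True
    while changed and any(len(t) > 1 for t in tags):
        changed = False
        for rule in rules:
            for i in range(len(words)):
                keep = rule(words, tags, i)
                if keep is None:
                    continue
                remaining = tags[i] & keep
                # Only eliminate tags; never leave a word without any tag.
                if remaining and remaining != tags[i]:
                    tags[i] = remaining
                    changed = True
    return tags

words = ["Look", "at", "that", "man"]
result = disambiguate(words, label(words), [det_before_noun, noun_after_det])
for word, tagset in zip(words, result):
    print(word, sorted(tagset))
```

With these two toy rules, "that" and "man" end up with one tag each, while "Look" stays ambiguous. The loop then stops on the second condition: no further change occurs.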
Example of Possible Tags
• Possible tags for ‘that’
  • DET (Determiner)
  • PRON (Pronoun)
  • ADV (Adverb)
  • COMPLEMENTIZER

Usage Examples of ‘that’
• ‘That’ as DET
  • Look at that man.
• ‘That’ as PRON
  • That will never be understood.
• ‘That’ as ADV
  • They have spent that much!
• ‘That’ as COMPLEMENTIZER
  • She tells me that she is fine.

A Disambiguation Rule
• Given input ‘that’:
  If (+1 A/ADV/QUANT) (+2 SENT_LIM) (NOT -1 SVOC/A)
  Then eliminate non-ADV tags
  Else eliminate ADV tag

Semantics of the Rule
• The conditions are combined by ANDing
• The condition is read as:
  • The next word is an Adjective, Adverb, or Quantifier, AND
  • The word after that is a Sentence Limiter, AND
  • The previous word is not a ‘consider’ type of word (SVOC/A)

Apply the Disambiguation Rule
• Sentences 1, 2, and 4 do not satisfy the conditions given in the rule
• Sentence 3 does satisfy the conditions, viz.
  • +1 QUANT = ‘much’
  • +2 SENT_LIM = ‘!’
  • -1 = ‘spent’, which is not an SVOC/A word

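As a rough check, the rule can be run over the four example sentences. The sketch below assumes a small hand-written attribute table (ATTRS) and the short tag name COMP for the complementizer; both are illustrative stand-ins, not the lexicon used in the lecture.

```python
# Hypothetical lexical attributes for the words next to 'that' in the examples.
ATTRS = {
    "man": {"N"}, "will": {"AUX"}, "much": {"QUANT"}, "she": {"PRON"},
    "at": {"PREP"}, "spent": {"V"}, "me": {"PRON"}, "consider": {"SVOC/A"},
    ".": {"SENT_LIM"}, "!": {"SENT_LIM"},
}
THAT_TAGS = frozenset({"DET", "PRON", "ADV", "COMP"})

def attrs(tokens, i):
    """Attributes of the token at position i (empty set if out of range)."""
    if 0 <= i < len(tokens):
        return ATTRS.get(tokens[i].lower(), set())
    return set()

def disambiguate_that(tokens, i, tags=THAT_TAGS):
    """Apply the rule to the occurrence of 'that' at position i."""
    condition = (
        bool(attrs(tokens, i + 1) & {"A", "ADV", "QUANT"})  # (+1 A/ADV/QUANT)
        and "SENT_LIM" in attrs(tokens, i + 2)              # (+2 SENT_LIM)
        and "SVOC/A" not in attrs(tokens, i - 1)            # (NOT -1 SVOC/A)
    )
    if condition:
        return set(tags) & {"ADV"}   # eliminate non-ADV tags
    return set(tags) - {"ADV"}       # eliminate the ADV tag

sentences = [
    "Look at that man .".split(),
    "That will never be understood .".split(),
    "They have spent that much !".split(),
    "She tells me that she is fine .".split(),
]
outcomes = []
for tokens in sentences:
    i = [t.lower() for t in tokens].index("that")
    outcomes.append(disambiguate_that(tokens, i))
    print(" ".join(tokens), "->", sorted(outcomes[-1]))
```

Only the third sentence keeps ADV. In the other three the rule removes ADV but leaves DET, PRON, and COMP, so further rules would be needed to finish disambiguating them.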
How to Obtain Attributes and Rules
• We need:
  • Lexical attributes
  • Disambiguation rules
• Both can be obtained by:
  • Manual means
  • Learning
• This is an easy process for the lexical attributes
• It is not trivial for the disambiguation rules

Specification for Rule Learning
• Rules have to be compact, i.e. each condition should be as specific as possible
• A rule should cover a lot of phenomena
• Rules have to be non-conflicting

Brill’s Tagger
• Learns rules using an algorithm called “transformation-based error-driven learning”
• Uses an AI search technique, viz.:
  • Start with a “state space”
  • Use an algorithm (BFS, DFS, A*, etc.) to search the space

Brill’s Tagging as Search
• S0: Seed tagged text
• S1, S2: Generated states
• O1, O2: Operators (rules)
• [Diagram: S0 --O1--> S1, S0 --O2--> S2]
• Operators have an LHS (the condition) and an RHS (the actions)
• Generated states are obtained by performing the actions

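The search view can be made concrete with a small sketch. It assumes a state is simply a tuple of tags, and an operator has one fixed shape: change a tag when the preceding tag matches. The seed tagging and both operators are invented for illustration.

```python
from typing import NamedTuple

class Rule(NamedTuple):
    """Operator: the LHS is the condition (from_tag preceded by prev_tag),
    the RHS is the action (rewrite the tag to to_tag)."""
    from_tag: str
    to_tag: str
    prev_tag: str

def apply_rule(state, rule):
    """Generate a new state by performing the action wherever the condition holds.
    Conditions are checked against the old state, not the partly rewritten one."""
    new_tags = list(state)
    for i in range(1, len(state)):
        if state[i] == rule.from_tag and state[i - 1] == rule.prev_tag:
            new_tags[i] = rule.to_tag
    return tuple(new_tags)

s0 = ("PRON", "V", "V", "N")    # made-up seed tagging
o1 = Rule("V", "N", "V")        # change V to N when the previous tag is V
o2 = Rule("V", "AUX", "PRON")   # change V to AUX when the previous tag is PRON

s1 = apply_rule(s0, o1)
s2 = apply_rule(s0, o2)
print(s1)
print(s2)
```

A search algorithm such as BFS or DFS would keep expanding s1 and s2 with further operators. Which states to prefer depends on the performance criterion, which this sketch leaves out.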
Learning using Templates
• Brill’s learning uses templates.
• Templates are instantiated based on the training situation.
• Steps in learning:
  • Look at the training corpus
  • Instantiate the templates
  • Arrive at a set of rules satisfying the performance criteria

An Example Template
• Change tag ‘a’ to tag ‘b’ when:
  • the preceding (following) word is tagged ‘z’
  • the word two before (after) is tagged ‘z’
  • one of the two preceding (following) words is tagged ‘z’
  • one of the three preceding (following) words is tagged ‘z’
  • the preceding word is tagged ‘z’ and the following word is tagged ‘w’
  • the preceding (following) word is tagged ‘z’ and the word two before (after) is tagged ‘w’

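To show what instantiating a template might look like, the sketch below handles only the first template ("the preceding word is tagged ‘z’"). It fills in a, b, and z at every position where the current tagging disagrees with the reference tagging, then keeps the candidate that removes the most errors. The tag sequences are invented. Scoring by net error reduction follows the usual description of transformation-based learning and is assumed here rather than taken from the slides.

```python
def instantiate_prev_tag_template(current, gold):
    """Instantiate 'change a to b when the preceding word is tagged z'
    at every position where the current tag differs from the gold tag."""
    rules = set()
    for i in range(1, len(current)):
        if current[i] != gold[i]:
            rules.add((current[i], gold[i], current[i - 1]))
    return rules

def apply_rule(tags, rule):
    """Rewrite tag a to b wherever the preceding tag is z."""
    a, b, z = rule
    return [b if i > 0 and t == a and tags[i - 1] == z else t
            for i, t in enumerate(tags)]

def errors(tags, gold):
    """Number of positions where the tagging disagrees with the gold tags."""
    return sum(t != g for t, g in zip(tags, gold))

def best_rule(current, gold):
    """Pick the instantiated rule with the largest net error reduction."""
    base = errors(current, gold)
    scored = [(base - errors(apply_rule(current, r), gold), r)
              for r in instantiate_prev_tag_template(current, gold)]
    gain, rule = max(scored, default=(0, None))
    return rule if gain > 0 else None

# Made-up seed tagging and reference tagging for one six-word sentence.
current = ["PRON", "V", "TO", "N", "DET", "N"]
gold = ["PRON", "V", "TO", "V", "DET", "N"]

learned = best_rule(current, gold)
print(learned)
print(apply_rule(current, learned))
```

A full learner would repeat this step on the retagged corpus, covering all templates, until no candidate rule meets the performance criterion.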
Brill’s Tagger Illustration
• Example:
  • They consider that odd.
• Tagged correctly as:
  • They_PPS consider_VB that_ADV odd_JJ.
• The next step is to learn rules and decide which template to instantiate.

Viterbi Algorithm Illustration
• Consider this state machine, with states S0 and S1, for the input sequence aabb:
  • S0 -> S0: a: 0.2, b: 0.1
  • S0 -> S1: a: 0.2, b: 0.5
  • S1 -> S0: a: 0.1, b: 0.2
  • S1 -> S1: a: 0.4, b: 0.3
• The next slide shows the search steps that find the state sequence with the maximum product of probabilities.

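Search steps for the input sequence aabb, starting from S0 (x marks a discarded node):
• After ‘a’: S0S0 (0.2), S0S1 (0.2)
• After ‘a’: S0S0S0 (0.04), S0S0S1 (0.04) x, S0S1S0 (0.02) x, S0S1S1 (0.08)
• After ‘b’: S0S0S0S0 (0.004) x, S0S0S0S1 (0.02) x, S0S1S1S0 (0.016), S0S1S1S1 (0.024)
• After ‘b’: S0S1S1S0S0 (0.0016) x, S0S1S1S0S1 (0.008), S0S1S1S1S0 (0.0048) x, S0S1S1S1S1 (0.0072) x
• Best state sequence: S0S1S1S0S1, with product 0.008
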
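The search can be reproduced with a short Viterbi sketch. The arc probabilities in PROB are the ones implied by the path products in the tree. For example, S0S0 = 0.2 after the first ‘a’ gives 0.2 for the S0-to-S0 arc on ‘a’, and S0S1S1 = 0.08 gives 0.4 for the S1-to-S1 arc on ‘a’. The function and variable names are illustrative.

```python
# PROB[(state, symbol, next_state)]: probability of reading `symbol` while
# moving from `state` to `next_state` (values implied by the path products).
PROB = {
    ("S0", "a", "S0"): 0.2, ("S0", "a", "S1"): 0.2,
    ("S0", "b", "S0"): 0.1, ("S0", "b", "S1"): 0.5,
    ("S1", "a", "S0"): 0.1, ("S1", "a", "S1"): 0.4,
    ("S1", "b", "S0"): 0.2, ("S1", "b", "S1"): 0.3,
}
STATES = ("S0", "S1")

def viterbi(symbols, start="S0"):
    """Return (best product, best state sequence) for the input symbols."""
    # best[state] = (product, path) of the best path so far ending in `state`.
    best = {start: (1.0, [start])}
    for sym in symbols:
        survivors = {}
        for state, (p, path) in best.items():
            for nxt in STATES:
                q = p * PROB[(state, sym, nxt)]
                # Pruning: keep only the highest-product path per end state.
                if nxt not in survivors or q > survivors[nxt][0]:
                    survivors[nxt] = (q, path + [nxt])
        best = survivors
    return max(best.values())

score, path = viterbi("aabb")
print("".join(path), score)
```

For the input aabb this selects the path S0S1S1S0S1 with a product of about 0.008, up to floating-point rounding.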
Remarks
• A node is pruned if another node ends in the same state with a higher product value (Markov assumption)
• Complexity, for an input of length T:
  • Without the Markov assumption: 2^T (exponential)
  • With the Markov assumption: 2T (linear)

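For comparison, here is a brute-force sketch that scores every state sequence without pruning. It assumes the same two-state machine (arc probabilities implied by the path products in the search tree) and takes T to be the input length. It enumerates all 2^T sequences, whereas the pruned search keeps at most two nodes per input symbol.

```python
from itertools import product

# Same illustrative machine: PROB[(state, symbol, next_state)].
PROB = {
    ("S0", "a", "S0"): 0.2, ("S0", "a", "S1"): 0.2,
    ("S0", "b", "S0"): 0.1, ("S0", "b", "S1"): 0.5,
    ("S1", "a", "S0"): 0.1, ("S1", "a", "S1"): 0.4,
    ("S1", "b", "S0"): 0.2, ("S1", "b", "S1"): 0.3,
}
STATES = ("S0", "S1")

def brute_force(symbols, start="S0"):
    """Score every state sequence; return (best product, best path, paths tried)."""
    best_p, best_path, tried = 0.0, None, 0
    for tail in product(STATES, repeat=len(symbols)):
        path = (start,) + tail
        p = 1.0
        for sym, src, dst in zip(symbols, path, path[1:]):
            p *= PROB[(src, sym, dst)]
        tried += 1
        if p > best_p:
            best_p, best_path = p, path
    return best_p, list(best_path), tried

best_p, best_path, tried = brute_force("aabb")
T = len("aabb")
print("".join(best_path), best_p)
print("sequences scored without pruning:", tried)   # 2**T
print("nodes kept with pruning:", 2 * T)            # two survivors per symbol
```

The exhaustive search finds the same best sequence as the pruned one. For T = 4 the gap is small (16 sequences against 8 kept nodes), but at T = 20 it is 1,048,576 against 40.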