Large Vocabulary Unconstrained Handwriting Recognition

J. Subrahmonia, Pen Technologies, IBM T. J. Watson Research Center

Pen Technologies • Pen-based interfaces in mobile computing

Mathematical Formulation
• H : Handwriting evidence on the basis of which a recognizer will make its decision
• H = {h1, h2, h3, h4, …, hm}
• W : Word string from a large vocabulary
• W = {w1, w2, w3, w4, …, wn}
• Recognizer : $\hat{W} = \arg\max_{W} p(W \mid H)$

Mathematical Formulation [Figure: block diagram with a SOURCE feeding a CHANNEL]

Source Channel Model [Figure: WRITER -> DIGITIZER -> FEATURE EXTRACTOR -> H -> DECODER, with the blocks before H grouped as the CHANNEL]

Source Channel Model
• Handwriting modeling : HMMs
• Language modeling
• Search strategy

Hidden Markov Models
• Memoryless Model + Add Memory => Markov Model
• Memoryless Model + Hide Something => Mixture Model
• Markov Model + Hide Something => Hidden Markov Model
• Mixture Model + Add Memory => Hidden Markov Model
Alan B. Poritz : Hidden Markov Models : A Guided Tour, ICASSP 1988

Memoryless Model
• COIN : Heads (1) : probability p; Tails (0) : probability 1-p
• Flip the coin 10 times (IID random sequence)
• Sequence : 1 0 1 0 0 0 1 1 1 1
• Probability = p*(1-p)*p*(1-p)*(1-p)*(1-p)*p*p*p*p = p^6 * (1-p)^4
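The memoryless calculation can be sketched in a few lines of Python. The function name `iid_probability` is an illustrative choice, not something from the slides.

```python
def iid_probability(sequence, p):
    """Probability of a 0/1 sequence when every flip is independent.

    `p` is the probability of heads (1); tails (0) has probability 1 - p.
    """
    prob = 1.0
    for outcome in sequence:
        # Each flip contributes its own factor; no flip depends on another.
        prob *= p if outcome == 1 else 1.0 - p
    return prob


# The sequence from the slide: six heads and four tails.
SLIDE_SEQUENCE = [1, 0, 1, 0, 0, 0, 1, 1, 1, 1]
```

For any p, `iid_probability(SLIDE_SEQUENCE, p)` agrees with p^6 * (1-p)^4, because only the counts of heads and tails matter and not their order.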

Add Memory – Markov Model
• 2 coins : COIN 1 => p(1) = 0.9, p(0) = 0.1; COIN 2 => p(1) = 0.1, p(0) = 0.9
• Experiment : Flip COIN 1 and note the outcome. For every later flip : if the previous outcome was Head (1), flip COIN 1; else flip COIN 2
• Sequence 1100 : Probability = 0.9*0.9*0.1*0.9 = 0.0729
• Sequence 1010 : Probability = 0.9*0.1*0.1*0.1 = 0.0009
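A minimal Python sketch of the two-coin experiment follows. The names `COINS` and `markov_probability` are illustrative; the probabilities are the ones given on the slide.

```python
# Output probabilities of each coin, as given on the slide.
COINS = {
    1: {1: 0.9, 0: 0.1},
    2: {1: 0.1, 0: 0.9},
}


def markov_probability(sequence):
    """Probability of a 0/1 sequence under the two-coin experiment."""
    coin = 1  # the experiment always starts with COIN 1
    prob = 1.0
    for outcome in sequence:
        prob *= COINS[coin][outcome]
        # The outcome just observed decides which coin is flipped next.
        coin = 1 if outcome == 1 else 2
    return prob
```

Here the coin in use plays the role of the state: the previous outcome fully determines it, so an observed sequence pins down the state sequence.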

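State Sequence Representation [Figure: two-state diagram. State 1 : output 1 with probability 0.9 (stay in state 1), output 0 with probability 0.1 (go to state 2). State 2 : output 0 with probability 0.9 (stay in state 2), output 1 with probability 0.1 (go to state 1).]
• Observed output sequence => unique state sequence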

Hide the states => Hidden Markov Model [Figure: two-state diagram with states s1 and s2; arcs labeled with probabilities 0.9 and 0.1]

Why use Hidden Markov Models Instead of Non-hidden?
• Hidden Markov Models can be smaller : fewer parameters to estimate
• States may be truly hidden
  • Position of the hand
  • Positions of articulators

Summary of HMM Basics
• We are interested in assigning probabilities p(H) to feature sequences
• Memoryless model : this model has no memory of the past
• Markov model : Markov noticed that in some sequences the future depends on the past. He introduced the concept of a STATE, an equivalence class of the past that influences the future
• Hide the states : HMM

Hidden Markov Models
• Given an observed sequence H :
• Compute p(H) for decoding
• Find the most likely state sequence for a given Markov model (Viterbi algorithm)
• Estimate the parameters of the Markov source (training)

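Compute p(H)
Example model with states s1, s2, s3 and outputs a, b :
Arc         Prob   p(a)   p(b)
s1 -> s1    0.5    0.8    0.2
s1 -> s2    0.3    0.7    0.3
s1 -> s2    0.2    (null arc, no output)
s2 -> s2    0.4    0.5    0.5
s2 -> s3    0.5    0.3    0.7
s2 -> s3    0.1    (null arc, no output)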

Compute p(H) – contd.
• Compute p(H) where H = a a b b
• Enumerate all ways of producing h1 = a :
  s1 -> s1 (a) : 0.5 x 0.8 = 0.40
  s1 -> s2 (a) : 0.3 x 0.7 = 0.21
  s1 -> s2 (null), s2 -> s2 (a) : 0.2 x 0.4 x 0.5 = 0.04
  s1 -> s2 (null), s2 -> s3 (a) : 0.2 x 0.5 x 0.3 = 0.03

Compute p(H) – contd.
• Enumerate all ways of producing h1 = a, h2 = a
[Figure: tree of paths. Each path that produced h1 = a is extended by the arcs s1 -> s1 (0.5 x 0.8), s1 -> s2 (0.3 x 0.7), s1 -> s2 null (0.2), s2 -> s2 (0.4 x 0.5), and s2 -> s3 (0.5 x 0.3).]
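The enumeration can be sketched as a brute-force recursion over complete paths. This is an illustration only: it assumes the arc values of the three-state example, that the model starts in s1 and must end in s3, and it uses the hypothetical names `ARCS` and `enumerate_paths`. States s1, s2, s3 are numbered 0, 1, 2.

```python
# (source, destination, transition probability, output probabilities).
# An output entry of None marks a null arc that produces no symbol.
ARCS = [
    (0, 0, 0.5, {"a": 0.8, "b": 0.2}),
    (0, 1, 0.3, {"a": 0.7, "b": 0.3}),
    (0, 1, 0.2, None),
    (1, 1, 0.4, {"a": 0.5, "b": 0.5}),
    (1, 2, 0.5, {"a": 0.3, "b": 0.7}),
    (1, 2, 0.1, None),
]


def enumerate_paths(H, state=0, pos=0, final=2):
    """Sum the probabilities of all paths that leave `state`,
    produce H[pos:], and stop in `final`."""
    # A path is complete once all output is produced and we sit in `final`.
    total = 1.0 if (pos == len(H) and state == final) else 0.0
    for src, dst, p, q in ARCS:
        if src != state:
            continue
        if q is None:
            # Null arc: change state without consuming a symbol.
            total += p * enumerate_paths(H, dst, pos, final)
        elif pos < len(H):
            # Emitting arc: consume the next symbol of H.
            total += p * q[H[pos]] * enumerate_paths(H, dst, pos + 1, final)
    return total
```

Every path is walked separately, so shared prefixes are recomputed many times. That repeated work is what combining paths in a trellis avoids.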

Compute p(H)
• Can save computation by combining paths
[Figure: the path tree redrawn so that paths reaching the same state are combined]

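Compute p(H)
• Trellis Diagram
[Figure: trellis with rows s1, s2, s3 and columns 0, a, aa, aab, aabb. Arcs between columns : s1 -> s1 (.5x.8 for a, .5x.2 for b), s1 -> s2 (.3x.7 for a, .3x.3 for b), s2 -> s2 (.4x.5), s2 -> s3 (.5x.3 for a, .5x.7 for b). Null arcs within each column : s1 -> s2 (.2), s2 -> s3 (.1).]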

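Basic Recursion
• Prob (Node) = sum (Prob (predecessor) x Prob (predecessor -> node))
• Boundary condition : Prob (s1, 0) = 1
Node probabilities (rows : states; columns : output produced so far) :
       0       a       aa      aab     aabb
s1     1.0     0.4     .16     .016    .0016
s2     0.2     0.33    .182    .054    .01256
s3     0.02    0.063   .0677   .0691   .020156
Contributions summed at each node, written as "predecessor, output : value" (output 0 marks a null arc from the same column) :
s1 : a = s1, a : 0.4 | aa = s1, a : .16 | aab = s1, b : .016 | aabb = s1, b : .0016
s2 : 0 = s1, 0 : 0.2 | a = s1, 0 : .08 + s1, a : .21 + s2, a : .04 | aa = s1, 0 : .032 + s1, a : .084 + s2, a : .066 | aab = s1, 0 : .0032 + s1, b : .0144 + s2, b : .0364 | aabb = s1, 0 : .00032 + s1, b : .00144 + s2, b : .0108
s3 : 0 = s2, 0 : 0.02 | a = s2, 0 : .033 + s2, a : .03 | aa = s2, 0 : .0182 + s2, a : .0495 | aab = s2, 0 : .0054 + s2, b : .0637 | aabb = s2, 0 : .001256 + s2, b : .0189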
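The recursion can be sketched in Python as below. The arc list repeats the three-state example, with s1, s2, s3 numbered 0, 1, 2. The names `ARCS` and `forward`, and the choice of s3 as the final state, are assumptions made for this illustration.

```python
# (source, destination, transition probability, output probabilities).
# An output entry of None marks a null arc that produces no symbol.
ARCS = [
    (0, 0, 0.5, {"a": 0.8, "b": 0.2}),
    (0, 1, 0.3, {"a": 0.7, "b": 0.3}),
    (0, 1, 0.2, None),
    (1, 1, 0.4, {"a": 0.5, "b": 0.5}),
    (1, 2, 0.5, {"a": 0.3, "b": 0.7}),
    (1, 2, 0.1, None),
]


def forward(H, n_states=3):
    """Return the trellis: trellis[j][s] is the probability of having
    produced H[:j] and being in state s."""
    trellis = [[0.0] * n_states for _ in range(len(H) + 1)]
    trellis[0][0] = 1.0  # boundary condition: start in s1
    for j in range(len(H) + 1):
        column = trellis[j]
        if j > 0:
            # Emitting arcs arrive from the previous column.
            for src, dst, p, q in ARCS:
                if q is not None:
                    column[dst] += trellis[j - 1][src] * p * q[H[j - 1]]
        # Null arcs stay inside the column. ARCS lists them in
        # increasing source-state order, so each source is already final.
        for src, dst, p, q in ARCS:
            if q is None:
                column[dst] += column[src] * p
    return trellis
```

With H = "aabb", the last column of `forward(H)` holds the values .0016, .01256, and .020156 from the table, so p(H) = .020156 if s3 is taken as the final state. The work grows with the number of trellis nodes times the number of arcs, not with the number of paths.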

More Formally –Forward Algorithm
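One conventional way to write the recursion is given below. All symbols are assumed notation rather than the slide's own: $\alpha_j(s)$ is the probability of producing h1, .., hj and being in state s; $L(t)$ is the source state of arc t; $p(t)$ is its transition probability; $q(h \mid t)$ is its output probability; $E(s)$ and $N(s)$ are the emitting and null arcs that enter s; $s_F$ is the final state.

```latex
\begin{align}
\alpha_0(s_1) &= 1 \\
\alpha_j(s) &= \sum_{t \in E(s)} \alpha_{j-1}(L(t))\, p(t)\, q(h_j \mid t)
             \;+\; \sum_{t \in N(s)} \alpha_j(L(t))\, p(t) \\
p(H) &= \alpha_m(s_F)
\end{align}
```

For j = 0 the first sum is empty, so the states other than s1 receive probability only through null arcs.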

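Find Most Likely Path for aabb : Dynamic Programming or Viterbi
• Max Prob (Node) = MAX (Max Prob (predecessor) x Prob (predecessor -> node))
Best-path probabilities (rows : states; columns : output produced so far) :
       0       a       aa      aab     aabb
s1     1.0     0.4     .16     .016    .0016
s2     0.2     .21     .084    .0168   .00336
s3     0.02    .03     .0315   .0294   .00588
Candidates compared at each node, written as "predecessor, output : value" (output 0 marks a null arc from the same column) :
s1 : a = s1, a : 0.4 | aa = s1, a : .16 | aab = s1, b : .016 | aabb = s1, b : .0016
s2 : 0 = s1, 0 : 0.2 | a = s1, 0 : .08; s1, a : .21; s2, a : .04 | aa = s1, 0 : .032; s1, a : .084; s2, a : .042 | aab = s1, 0 : .0032; s1, b : .0144; s2, b : .0168 | aabb = s1, 0 : .00032; s1, b : .00144; s2, b : .00336
s3 : 0 = s2, 0 : 0.02 | a = s2, 0 : .021; s2, a : .03 | aa = s2, 0 : .0084; s2, a : .0315 | aab = s2, 0 : .00168; s2, b : .0294 | aabb = s2, 0 : .000336; s2, b : .00588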
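The same trellis sweep with MAX in place of sum, plus backpointers, can be sketched as follows. The names `ARCS` and `viterbi`, and the choice of s3 as the final state, are assumptions made for this illustration. States s1, s2, s3 are numbered 0, 1, 2.

```python
# (source, destination, transition probability, output probabilities).
# An output entry of None marks a null arc that produces no symbol.
ARCS = [
    (0, 0, 0.5, {"a": 0.8, "b": 0.2}),
    (0, 1, 0.3, {"a": 0.7, "b": 0.3}),
    (0, 1, 0.2, None),
    (1, 1, 0.4, {"a": 0.5, "b": 0.5}),
    (1, 2, 0.5, {"a": 0.3, "b": 0.7}),
    (1, 2, 0.1, None),
]


def viterbi(H, n_states=3):
    """Return (probability, path) of the most likely path ending in the
    last state; the path is a list of (column, state) trellis nodes."""
    # best[j][s] = (best probability at the node, predecessor node)
    best = [[(0.0, None)] * n_states for _ in range(len(H) + 1)]
    best[0][0] = (1.0, None)
    for j in range(len(H) + 1):
        if j > 0:
            # Emitting arcs arrive from the previous column.
            for src, dst, p, q in ARCS:
                if q is not None:
                    cand = best[j - 1][src][0] * p * q[H[j - 1]]
                    if cand > best[j][dst][0]:
                        best[j][dst] = (cand, (j - 1, src))
        # Null arcs stay inside the column (sources listed in order).
        for src, dst, p, q in ARCS:
            if q is None:
                cand = best[j][src][0] * p
                if cand > best[j][dst][0]:
                    best[j][dst] = (cand, (j, src))
    # Follow the backpointers from the final node to the start.
    path, node = [], (len(H), n_states - 1)
    while node is not None:
        path.append(node)
        node = best[node[0]][node[1]][1]
    return best[len(H)][n_states - 1][0], path[::-1]
```

For H = "aabb" the best path is s1 -a-> s1 -a-> s2 -b-> s2 -b-> s3, with probability 0.4 x 0.21 x 0.2 x 0.35 = .00588, matching the bottom-right entry of the table.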

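Training HMM parameters
• H = abaa
[Figure: initial model; arcs labeled 1/3, 1/3, 1/3, 1/2, 1/2; output probabilities p(a) = 1/2, p(b) = 1/2]
• Probabilities of the paths that produce H : .000385, .000578, .000868, .001157, .002604, .001736, .001302
• p(H) = .008632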

Training HMM parameters
• A posteriori probability of path i = (probability of path i) / p(H)
• Values for the 7 paths : .045, .067, .134, .100, .201, .150, .301
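The normalization can be checked with a short Python sketch. It uses the seven path probabilities listed for H = abaa; the variable names are illustrative.

```python
# Path probabilities as printed on the slide (already rounded there).
path_probs = [0.000385, 0.000578, 0.000868, 0.001157,
              0.002604, 0.001736, 0.001302]

# p(H) is the sum over all paths that produce H.
p_H = sum(path_probs)

# A posteriori probability of each path given that H was observed.
posteriors = [c / p_H for c in path_probs]
```

The rounded inputs sum to .008630 rather than the .008632 shown on the slide, so the resulting posteriors can differ from the slide's values in the third decimal place.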

Training HMM parameters

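Training HMM parameters
[Figure: re-estimated model; arcs and output probabilities labeled .46, .60, .64, .36, .71, .29, .68, .32, .40, .34, .60, .40, .20]
• Path probabilities shown with the re-estimated model : 0.00108, 0.00129, 0.00404, 0.00212, 0.00253, 0.00791, 0.00537
• Keep on repeating : after 600 iterations, p(H) = .037037037
• Another initial parameter set : p(H) = 0.0625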

Training HMM parameters
• Converges to a local maximum
• There are at least 7 local maxima
• Final solution depends on starting point
• Speed of convergence depends on starting point

Training HMM parameters : Forward Backward algorithm
• Improves on the enumerating algorithm by using the trellis
• Reduces the computation from exponential to linear in the length of the output sequence

Forward Backward Algorithm [Figure: trellis with output position j marked]

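Forward Backward Algorithm
• $c_j(t)$ = Probability that hj is produced by transition t and the complete output is H
• $c_j(t) = \alpha_{j-1}(L(t))\, p(t)\, q(h_j \mid t)\, \beta_j(R(t))$, where $L(t)$ and $R(t)$ are the source and destination states of t, $p(t)$ is its transition probability, and $q(h_j \mid t)$ is its output probability
• $\alpha_{j-1}(s)$ = Probability of being in state s and producing the output h1, .., hj-1
• $\beta_j(s)$ = Probability of being in state s and producing the output hj+1, .., hm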

Forward Backward Algorithm Transition count

Training HMM parameters
• Guess initial values for all parameters
• Compute forward and backward pass probabilities
• Compute counts
• Re-estimate probabilities
• Names for this procedure : Baum-Welch, Baum-Eagon, Forward-Backward, E-M
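The four steps can be sketched for a small arc-emission HMM as below. Everything specific here is an assumption made for illustration: the two-state model in `ARCS` is hypothetical (it is not the model from the slides), it has no null arcs, the model starts in state 0, and any state may end the sequence.

```python
from collections import defaultdict

# Hypothetical initial guess: (source, destination, transition prob, output probs).
ARCS = [
    (0, 0, 0.6, {"a": 0.7, "b": 0.3}),
    (0, 1, 0.4, {"a": 0.4, "b": 0.6}),
    (1, 1, 0.5, {"a": 0.2, "b": 0.8}),
    (1, 0, 0.5, {"a": 0.5, "b": 0.5}),
]
N_STATES = 2


def forward(arcs, H):
    """alpha[j][s]: probability of producing H[:j] and being in state s."""
    alpha = [[0.0] * N_STATES for _ in range(len(H) + 1)]
    alpha[0][0] = 1.0
    for j, h in enumerate(H, start=1):
        for src, dst, p, q in arcs:
            alpha[j][dst] += alpha[j - 1][src] * p * q[h]
    return alpha


def backward(arcs, H):
    """beta[j][s]: probability of producing H[j:] starting from state s."""
    m = len(H)
    beta = [[0.0] * N_STATES for _ in range(m + 1)]
    beta[m] = [1.0] * N_STATES
    for j in range(m, 0, -1):
        for src, dst, p, q in arcs:
            beta[j - 1][src] += p * q[H[j - 1]] * beta[j][dst]
    return beta


def likelihood(arcs, H):
    return sum(forward(arcs, H)[-1])


def reestimate(arcs, H):
    """One forward-backward iteration: expected counts, then new parameters."""
    alpha, beta = forward(arcs, H), backward(arcs, H)
    p_H = sum(alpha[-1])
    # counts[i][x]: expected number of times arc i is taken producing x.
    counts = [defaultdict(float) for _ in arcs]
    for j, h in enumerate(H, start=1):
        for i, (src, dst, p, q) in enumerate(arcs):
            counts[i][h] += alpha[j - 1][src] * p * q[h] * beta[j][dst] / p_H
    leaving = defaultdict(float)  # expected count of leaving each state
    for i, (src, _, _, _) in enumerate(arcs):
        leaving[src] += sum(counts[i].values())
    new_arcs = []
    for i, (src, dst, p, q) in enumerate(arcs):
        c = sum(counts[i].values())
        # Keep the old value if an arc or state received no count at all.
        new_p = c / leaving[src] if leaving[src] > 0 else p
        new_q = {x: counts[i][x] / c for x in q} if c > 0 else q
        new_arcs.append((src, dst, new_p, new_q))
    return new_arcs
```

Calling `reestimate` repeatedly, for example on H = "abaa", never lowers `likelihood(arcs, H)`. It climbs only to a local maximum, so a different initial guess in `ARCS` can end at a different value.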
