
Sketch Recognition Algorithms and Paper Review: Sutherland and Rubine Methods

Explore the analysis, implementation, and comparison of sketch recognition algorithms, including Sutherland and Rubine methods. Review the paper by Sutherland and discuss the benefits and drawbacks of the Rubine method. Learn how to compute features and average feature values for each gesture in an object-oriented manner.

ratcliffeb

Presentation Transcript


  1. Lecture 2: SKETCH RECOGNITION Analysis, implementation, and comparison of sketch recognition algorithms, including feature-based, vision-based, geometry-based, and timing-based recognition algorithms; examination of methods to combine results from various algorithms to improve recognition using AI techniques, such as graphical models. Learn how to make your drawings come alive…

  2. Paper Review: Sutherland • Thoughts?

  3. Sutherland • Amazing program • Beginning of CAD design (simulations) • Beginning of Object Oriented Programming (instances) • Beginning of Graphics (blinking) • Graphical Constraint Satisfaction • Zoom

  4. Sutherland • Not quite like sketching on paper • Perhaps better in some ways • Many buttons • Would like to remove the buttons • Really an endpoint placing system

  5. Sutherland • Not doing recognition….

  6. Sutherland • “The user signals that he is finished drawing by flicking the pen too fast for the tracking program to follow”

  7. Sutherland • Marking of Display File • 20 bits to coordinates • 16 bits for label • What about intersection point? If shape is moved, does a hole develop at the intersection point?

  8. Sutherland • Zigzag: 45 minutes to create, 15 minutes to plot • Clock: 20 minutes to create (fewer constraints?)

  9. Sutherland • Intuitiveness of constraint labels

  10. Rubine Method • 1991 • Foundational work in sketch recognition • Used and cited widely • Works well • 15 samples adequate

  11. Benefits • A lot of power from single stroke • Simple • Works with only 15 training examples • Avoids segmentation problems • Robust recognition (when used as intended)

  12. Drawbacks • Gestures must be drawn using a single stroke (ways to get past this) • Gestures must be drawn the same way every time (harder to get past this) • Requires training examples • But more importantly – the user has to be trained

  13. Overview • Stroke = a series of x,y,time values • Class of shapes = gesture type (e.g., equals sign is one gesture) • Several examples for each gesture • Compute several features for each example • Classify new gestures to the closest class.

  14. Stroke • P = total number of points • p = index of an intermediate point • First point $(x_0, y_0)$ • Last point $(x_{P-1}, y_{P-1})$; let's call it $(x_n, y_n)$ • Compute $x_{min}$, $y_{min}$, $x_{max}$, $y_{max}$
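The stroke definition above can be sketched as a small class. This is a minimal illustration, not the course's reference code: the class name `Stroke`, its parallel-array layout, and the method names are all assumptions made for this example.

```java
/** Hypothetical container for one stroke: parallel arrays of x, y, and time. */
final class Stroke {
    final double[] x, y, t;

    Stroke(double[] x, double[] y, double[] t) {
        if (x.length == 0 || x.length != y.length || x.length != t.length) {
            throw new IllegalArgumentException("need equal-length, non-empty arrays");
        }
        this.x = x;
        this.y = y;
        this.t = t;
    }

    /** P, the total number of points. */
    int size() {
        return x.length;
    }

    /** Returns {xmin, ymin, xmax, ymax} over all points of the stroke. */
    double[] boundingBox() {
        double xmin = x[0], ymin = y[0], xmax = x[0], ymax = y[0];
        for (int p = 1; p < x.length; p++) {
            xmin = Math.min(xmin, x[p]);
            xmax = Math.max(xmax, x[p]);
            ymin = Math.min(ymin, y[p]);
            ymax = Math.max(ymax, y[p]);
        }
        return new double[] {xmin, ymin, xmax, ymax};
    }
}
```

Keeping the bounding box on the stroke itself fits the object-oriented goal of the homework, since several features reuse it.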

  15. Feature f1 $f_1 = \cos(\alpha) = \frac{x_2 - x_0}{\sqrt{(y_2 - y_0)^2 + (x_2 - x_0)^2}}$ • Cosine of starting angle

  16. Feature f2 $f_2 = \sin(\alpha) = \frac{y_2 - y_0}{\sqrt{(y_2 - y_0)^2 + (x_2 - x_0)^2}}$ • Sine of starting angle
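A minimal Java sketch of f1 and f2, assuming the stroke arrives as two parallel coordinate arrays. The class name `StartAngle` and the choice to return 0 for both features when points 0 and 2 coincide are assumptions of this example, not something the slides specify.

```java
final class StartAngle {
    /** Returns {f1, f2} = {cos(alpha), sin(alpha)}, measured from point 0 to point 2. */
    static double[] startAngle(double[] x, double[] y) {
        if (x.length < 3 || y.length < 3) {
            throw new IllegalArgumentException("need at least 3 points");
        }
        double dx = x[2] - x[0];
        double dy = y[2] - y[0];
        double len = Math.sqrt(dx * dx + dy * dy);
        // Assumption: when points 0 and 2 coincide the angle is undefined,
        // so report 0 for both features instead of dividing by zero.
        if (len == 0.0) {
            return new double[] {0.0, 0.0};
        }
        return new double[] {dx / len, dy / len};
    }
}
```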

  17. Feature f3 $f_3 = \sqrt{(y_{max} - y_{min})^2 + (x_{max} - x_{min})^2}$ • Length of diagonal of bounding box (gives an idea of the size of the bounding box)

  18. Feature f4 $f_4 = \arctan\left[\frac{y_{max} - y_{min}}{x_{max} - x_{min}}\right]$ • Angle of diagonal • Gives an idea of the shape of the bounding box (long, tall, square)
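The two bounding-box features might be computed as below. One deliberate swap: the sketch uses `Math.atan2(height, width)` rather than a plain arctangent of the ratio. For non-negative width and height the result is the same, and a perfectly vertical stroke (width 0) yields π/2 instead of a division by zero. The class and method names are hypothetical.

```java
final class BoundingBoxFeatures {
    /** Returns {f3, f4}: length and angle of the bounding-box diagonal. */
    static double[] diagonal(double[] x, double[] y) {
        double xmin = x[0], xmax = x[0], ymin = y[0], ymax = y[0];
        for (int p = 1; p < x.length; p++) {
            xmin = Math.min(xmin, x[p]);
            xmax = Math.max(xmax, x[p]);
            ymin = Math.min(ymin, y[p]);
            ymax = Math.max(ymax, y[p]);
        }
        double width = xmax - xmin;
        double height = ymax - ymin;
        double f3 = Math.sqrt(width * width + height * height);
        // atan2 stands in for arctan(height / width); it is defined when width == 0.
        double f4 = Math.atan2(height, width);
        return new double[] {f3, f4};
    }
}
```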

  19. Feature f5 $f_5 = \sqrt{(x_n - x_0)^2 + (y_n - y_0)^2}$ • Distance from start to end

  20. Feature f6 $f_6 = \cos(\beta) = (x_n - x_0)/f_5$ • Cosine of ending angle

  21. Feature f7 $f_7 = \sin(\beta) = (y_n - y_0)/f_5$ • Sine of ending angle (β)
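Features f5 to f7 share the start-to-end vector, so one helper can return all three. The sketch below is an illustration under assumptions: the name `EndpointFeatures` is made up, and returning 0 for f6 and f7 on a closed stroke (where f5 is 0) is one possible convention, not one the slides prescribe.

```java
final class EndpointFeatures {
    /** Returns {f5, f6, f7}: start-to-end distance, then cosine and sine of the ending angle. */
    static double[] endpoints(double[] x, double[] y) {
        int n = x.length - 1; // index of the last point
        double dx = x[n] - x[0];
        double dy = y[n] - y[0];
        double f5 = Math.sqrt(dx * dx + dy * dy);
        // Assumption: a closed stroke ends where it started, so the angle is
        // undefined; report 0 for f6 and f7 rather than dividing by zero.
        if (f5 == 0.0) {
            return new double[] {0.0, 0.0, 0.0};
        }
        return new double[] {f5, dx / f5, dy / f5};
    }
}
```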

  22. Feature f8 • Total stroke length
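The slide's formula did not survive in the transcript. A straightforward reading of "total stroke length" is the sum of the lengths of the segments between consecutive points, which is what this sketch computes (the class name is hypothetical).

```java
final class StrokeLength {
    /** f8: sum of the Euclidean lengths of all consecutive segments. */
    static double totalLength(double[] x, double[] y) {
        double sum = 0.0;
        for (int p = 0; p + 1 < x.length; p++) {
            sum += Math.hypot(x[p + 1] - x[p], y[p + 1] - y[p]);
        }
        return sum;
    }
}
```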

  23. Change in Rotation • Arctan in its two-argument form (atan2) gives you the directional angle (i.e., over the full 360°, or rather 2π)

  24. Feature f9 • Total rotation (from start to end point) • (not the same as β-α – think of spirals)

  25. Feature f10 • Absolute rotation • How much does it move around

  26. Feature f11 • Rotation squared • How smooth are the turns? • Measure of sharpness
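The three rotation features can all be accumulated in one pass over the turning angles. The sketch assumes the turning angle at an interior point is the signed angle between the incoming and outgoing segments, obtained with `Math.atan2(cross, dot)`. The sign convention (counter-clockwise positive in standard math axes) is an assumption of this example and may be flipped relative to the paper; f10 and f11 are unaffected by it.

```java
final class RotationFeatures {
    /** Returns {f9, f10, f11}: sum of turning angles, of their absolute values, and of their squares. */
    static double[] rotation(double[] x, double[] y) {
        double total = 0.0, absolute = 0.0, squared = 0.0;
        for (int p = 1; p + 1 < x.length; p++) {
            double dx0 = x[p] - x[p - 1], dy0 = y[p] - y[p - 1]; // incoming segment
            double dx1 = x[p + 1] - x[p], dy1 = y[p + 1] - y[p]; // outgoing segment
            double cross = dx0 * dy1 - dy0 * dx1;
            double dot = dx0 * dx1 + dy0 * dy1;
            double theta = Math.atan2(cross, dot); // signed turn in (-pi, pi]
            total += theta;
            absolute += Math.abs(theta);
            squared += theta * theta;
        }
        return new double[] {total, absolute, squared};
    }
}
```

A zigzag shows why f9 and f10 differ: a left turn followed by an equal right turn cancels in f9 but adds up in f10.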

  27. Timing Data Let $\Delta t_p = t_{p+1} - t_p$

  28. Feature f12 • The maximum speed reached (squared)

  29. Feature f13 • Total time of stroke
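A sketch of the two timing features, assuming the squared speed of a segment is its squared length divided by the squared time step, and that f13 is the last timestamp minus the first. Skipping segments with a zero time step is an assumption made here to avoid dividing by zero; the names are hypothetical.

```java
final class TimingFeatures {
    /** Returns {f12, f13}: maximum squared speed over all segments, and total duration. */
    static double[] timing(double[] x, double[] y, double[] t) {
        double maxSpeedSquared = 0.0;
        for (int p = 0; p + 1 < x.length; p++) {
            double dt = t[p + 1] - t[p];
            if (dt == 0.0) {
                continue; // assumption: ignore segments with duplicate timestamps
            }
            double dx = x[p + 1] - x[p];
            double dy = y[p + 1] - y[p];
            maxSpeedSquared = Math.max(maxSpeedSquared, (dx * dx + dy * dy) / (dt * dt));
        }
        double duration = t[t.length - 1] - t[0];
        return new double[] {maxSpeedSquared, duration};
    }
}
```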

  30. Features • What do you think about the features? • Can you think of any problems?

  31. Collect E examples of each gesture • Calculate the feature vector for each example • $F_{cei}$ = the value of the ith feature for the eth example of the cth gesture

  32. Find average feature values for gesture • For each gesture, compute the average value of each feature • $\bar{F}_{ci}$ is the average value of the ith feature for the cth gesture
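Averaging per gesture is a plain mean over the examples of that gesture. In this sketch the training data is assumed to be a three-level array indexed as `features[c][e][i]`, mirroring the subscripts on the slide; the layout and names are illustrative only.

```java
final class FeatureAverages {
    /** Maps features[c][e][i] to mean[c][i], averaging over the examples e of each gesture c. */
    static double[][] classMeans(double[][][] features) {
        double[][] mean = new double[features.length][];
        for (int c = 0; c < features.length; c++) {
            int examples = features[c].length;
            int featureCount = features[c][0].length;
            mean[c] = new double[featureCount];
            for (double[] example : features[c]) {
                for (int i = 0; i < featureCount; i++) {
                    mean[c][i] += example[i];
                }
            }
            for (int i = 0; i < featureCount; i++) {
                mean[c][i] /= examples; // double / int promotes to double division
            }
        }
        return mean;
    }
}
```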

  33. Homework • Read Rubine Paper • Write summary and discussion on blog • Create functions to compute the features and the average feature value for each gesture (not the classification yet) • Object oriented way – several of the features are valuable in other classification schemes as well. • Test on the math data. • Bring in your feature values to class.

  34. Implementation Issues • Tablets are faster now than they were • Rubine paper was based on slower data (mouse?) • But, for correct feel, pen needs to be fast • Issues you may have: • Duplicate location (2 consecutive points in same place) • Duplicate time (2 consecutive points at the same time) • Divide by zero (because of above problems) • Make sure to convert to double before dividing in Java • Remove the second point, not the first, for duplicate points
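The duplicate-point and integer-division issues above can be sketched as follows. Raw samples are assumed to be integer pixel coordinates with `long` timestamps; the class and method names are made up for this example, and comparing each point against the last kept point (rather than the raw previous one) is one possible reading of the rule.

```java
import java.util.ArrayList;
import java.util.List;

final class StrokeCleaner {
    /** Returns the indices to keep: a point repeating the location or time of the last kept point is dropped. */
    static List<Integer> keepIndices(int[] x, int[] y, long[] t) {
        List<Integer> keep = new ArrayList<>();
        for (int p = 0; p < x.length; p++) {
            if (!keep.isEmpty()) {
                int q = keep.get(keep.size() - 1);
                boolean sameLocation = x[p] == x[q] && y[p] == y[q];
                boolean sameTime = t[p] == t[q];
                if (sameLocation || sameTime) {
                    continue; // drop the second point, keep the first
                }
            }
            keep.add(p);
        }
        return keep;
    }

    /** Slope with an explicit cast; (y1 - y0) / (x1 - x0) on ints alone would truncate. */
    static double slope(int x0, int y0, int x1, int y1) {
        return (double) (y1 - y0) / (x1 - x0);
    }
}
```

Without the cast, `(1 - 0) / (2 - 0)` evaluates to 0 in Java because both operands are `int`.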

  35. Rubine Classification • Evaluate each gesture c, 0 <= c < C. • Vc = value = goodness of fit for that gesture c. • Pick the largest Vc, and return gesture c

  36. Rubine Classification • Wc0 = initial weight of gesture • Wci = weight for the ith feature • Fi = ith feature value • Sum the weighted features together: $V_c = W_{c0} + \sum_i W_{ci} F_i$
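The evaluation step is a linear score per gesture followed by an argmax. In this sketch each row `weights[c]` is assumed to hold the initial weight at index 0 and the per-feature weights after it; that layout, like the names, is a choice made for the example.

```java
final class LinearClassifier {
    /** V_c = w[0] + sum over i of w[i + 1] * f[i]. */
    static double score(double[] w, double[] f) {
        double v = w[0];
        for (int i = 0; i < f.length; i++) {
            v += w[i + 1] * f[i];
        }
        return v;
    }

    /** Returns the index c of the gesture with the largest V_c. */
    static int classify(double[][] weights, double[] f) {
        int best = 0;
        double bestScore = score(weights[0], f);
        for (int c = 1; c < weights.length; c++) {
            double v = score(weights[c], f);
            if (v > bestScore) {
                best = c;
                bestScore = v;
            }
        }
        return best;
    }
}
```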

  37. Compute gesture covariance matrix • How are the features of the shape related to each other? • Look at one example - look at two features – how much does each feature differ from the mean – take the average for all examples – that is one spot in the matrix • http://mathworld.wolfram.com/Covariance.html • Is there a dependency (umbrellas/raining)

  38. Normalize • cov(X) or cov(X,Y) normalizes by N-1, if N>1, where N is the number of observations. This makes cov(X) the best unbiased estimate of the covariance matrix if the observations are from a normal distribution. For N=1, cov normalizes by N • Rubine does not normalize the per-gesture matrix, for ease of the next step (so just sum, not average)

  39. Normalization • Taking the average • But… we want to find the true variance. • Note that our sample mean is not exactly the true mean. • By definition, our data is closer to the sample mean than the true mean • Thus the numerator is too small • So we reduce the denominator to compensate

  40. Common Covariance Matrix • How are the features related between all the examples? • Numerator = sum of the unnormalized per-gesture covariance matrices • Denominator = normalization factor = total number of examples – total number of shapes
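Slides 37 to 40 can be sketched in one routine: accumulate the unnormalized covariance of each gesture around its own mean, sum those matrices across gestures, then divide once by (total examples minus number of gestures). The `features[c][e][i]` layout and the names are assumptions of this example, and it expects more examples than gestures so the divisor is positive.

```java
final class CommonCovariance {
    /** Pooled covariance estimate from features[c][e][i]. */
    static double[][] estimate(double[][][] features) {
        int featureCount = features[0][0].length;
        double[][] sum = new double[featureCount][featureCount];
        int totalExamples = 0;
        for (double[][] examples : features) {
            double[] mean = new double[featureCount];
            for (double[] f : examples) {
                for (int i = 0; i < featureCount; i++) {
                    mean[i] += f[i] / examples.length;
                }
            }
            // Unnormalized per-gesture covariance: a sum of products of deviations.
            for (double[] f : examples) {
                for (int i = 0; i < featureCount; i++) {
                    for (int j = 0; j < featureCount; j++) {
                        sum[i][j] += (f[i] - mean[i]) * (f[j] - mean[j]);
                    }
                }
            }
            totalExamples += examples.length;
        }
        double divisor = totalExamples - features.length;
        for (int i = 0; i < featureCount; i++) {
            for (int j = 0; j < featureCount; j++) {
                sum[i][j] /= divisor;
            }
        }
        return sum;
    }
}
```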

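  41. Weights • Wcj = weight for the jth feature of the cth shape • Sum over each feature i of: • the (i, j) entry of the inverted common covariance matrix • times the average value of the ith feature for the cth gesture • $W_{cj} = \sum_i (\Sigma^{-1})_{ij} \bar{F}_{ci}$, where $\bar{F}_{ci}$ is that average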

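  42. Initial Weight • Initial gesture weight = −1/2 times the • Sum for each feature in class: • Feature weight * average feature value • $W_{c0} = -\frac{1}{2} \sum_i W_{ci} \bar{F}_{ci}$, where $\bar{F}_{ci}$ is the average value of the ith feature for the cth gesture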
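Given an already inverted common covariance matrix and the per-gesture feature averages, both weight formulas fit in a few lines. The sketch takes the inverse as an input and does not show the matrix inversion. The output layout (initial weight at index 0 of each row) and the names are assumptions of this example.

```java
final class RubineWeights {
    /**
     * invCov[i][j] is the inverted common covariance matrix and mean[c][i] the
     * per-gesture feature averages. Row c of the result is {W_c0, W_c1, ..., W_cF}.
     */
    static double[][] weights(double[][] invCov, double[][] mean) {
        int featureCount = invCov.length;
        double[][] w = new double[mean.length][featureCount + 1];
        for (int c = 0; c < mean.length; c++) {
            for (int j = 0; j < featureCount; j++) {
                double wcj = 0.0;
                for (int i = 0; i < featureCount; i++) {
                    wcj += invCov[i][j] * mean[c][i];
                }
                w[c][j + 1] = wcj;
                // The initial weight accumulates -1/2 * W_cj * mean_cj over all features j.
                w[c][0] -= 0.5 * wcj * mean[c][j];
            }
        }
        return w;
    }
}
```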

  43. Rubine Classification • Evaluate each gesture c, 0 <= c < C. • Vc = value = goodness of fit for that gesture c. • Pick the largest Vc, and return gesture c

  44. Rubine Classification • Wc0 = initial weight of gesture • Wci = weight for the ith feature • Fi = ith feature value • Sum the weighted features together: $V_c = W_{c0} + \sum_i W_{ci} F_i$

  45. Eliminate Jiggle • Any input point within 3 pixels of the previous point is discarded
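A possible jiggle filter is sketched below. Two points are left open by the slide and are assumptions here: the distance is measured to the previous kept point, and a point exactly at the threshold is kept (the comparison is strict). The names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class JiggleFilter {
    /** Returns the indices to keep: a point closer than minDistance to the last kept point is discarded. */
    static List<Integer> keepIndices(double[] x, double[] y, double minDistance) {
        List<Integer> keep = new ArrayList<>();
        for (int p = 0; p < x.length; p++) {
            if (!keep.isEmpty()) {
                int q = keep.get(keep.size() - 1);
                if (Math.hypot(x[p] - x[q], y[p] - y[q]) < minDistance) {
                    continue; // too close to the last kept point: treat as jiggle
                }
            }
            keep.add(p);
        }
        return keep;
    }
}
```

For the 3-pixel rule on the slide, call it as `JiggleFilter.keepIndices(x, y, 3.0)`.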

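  46. Rejection Technique 1 • If the top two gestures are near to each other, reject. • Vi > Vj for all j != i • Estimate the probability that the winning gesture i is correct: $P(i \mid g) = 1 / \sum_j e^{(V_j - V_i)}$ • Reject if less than .95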
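The ambiguity test can be sketched from the scores alone, using the estimate from Rubine's paper: the probability that the winner is correct is 1 divided by the sum over all gestures j of exp(Vj - Vi). The class and method names are made up for this example.

```java
final class AmbiguityRejection {
    /** Estimated probability that the top-scoring gesture is the right one. */
    static double probabilityOfWinner(double[] v) {
        int best = 0;
        for (int c = 1; c < v.length; c++) {
            if (v[c] > v[best]) {
                best = c;
            }
        }
        double sum = 0.0;
        for (double vj : v) {
            sum += Math.exp(vj - v[best]); // the winner itself contributes exp(0) = 1
        }
        return 1.0 / sum;
    }

    /** True when the estimate falls below the threshold (0.95 on the slide). */
    static boolean reject(double[] v, double threshold) {
        return probabilityOfWinner(v) < threshold;
    }
}
```

Two tied scores give a probability of 0.5, so a tie between the top two gestures is always rejected at the 0.95 threshold.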

  47. Rejection Technique 2 • Mahalanobis distance • Number of standard deviations g is from the mean of its chosen class i.
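A sketch of the squared Mahalanobis distance between a feature vector and the mean of its chosen class, assuming the inverted common covariance matrix is already available. The cutoff at which to reject is left to the caller, and the names are hypothetical.

```java
final class MahalanobisDistance {
    /** Squared distance: sum over j, k of invCov[j][k] * (f[j] - mean[j]) * (f[k] - mean[k]). */
    static double squared(double[] f, double[] mean, double[][] invCov) {
        double sum = 0.0;
        for (int j = 0; j < f.length; j++) {
            for (int k = 0; k < f.length; k++) {
                sum += invCov[j][k] * (f[j] - mean[j]) * (f[k] - mean[k]);
            }
        }
        return sum;
    }
}
```

Taking `Math.sqrt` of the result gives the distance in units comparable to standard deviations.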
