Enhancing Business Identification through Spatial Detection and Scene Text Recognition
This research addresses the weaknesses in current approaches to business name matching and spatial detection, focusing on improvements in geocoding, image recognition, and text detection. Key findings include the low availability of storefront images and challenges in template and image matching. The study proposes a shift towards Scene Text Recognition (STR), employing multi-resolution character detection and analysis of geometry and color properties for improved text extraction. Future work involves refining the text detector and completing the STR implementation.
Presentation Transcript
Business Identification: Spatial Detection • Alexander Darino • Weeks 7 & 8 (Abridged)
Weaknesses to Current Approach
[Diagram labels: Business Name Matching; Business Spatial Detection; Latitude; Longitude; Geocoding; Reverse Geocoding; Nearby Businesses; Business Identification; Image; OCR; Detected Text]
Alternative: Image Matching
• Weaknesses:
  • Low availability of storefront images (< 50% on average)
    • George Aiken area businesses with photos: 18/35
    • Brueggers area businesses with photos: 22/40
    • Tambellini area businesses with photos: 8/22
  • Available images too small (100 x 100)
• Not a viable solution
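The "< 50% on average" figure can be checked directly from the three counts on the slide. The short Python sketch below does that arithmetic; the variable and function names are illustrative only, and it computes the average two ways because the slide does not say which one was meant.

```python
# Photo counts per area, copied from the slide: (with photos, total businesses).
areas = {
    "George Aiken": (18, 35),
    "Brueggers": (22, 40),
    "Tambellini": (8, 22),
}


def availability(with_photos, total):
    """Fraction of businesses in an area that have a storefront photo."""
    return with_photos / total


per_area = {name: availability(*counts) for name, counts in areas.items()}

# Average of the three per-area fractions (each area weighted equally).
mean_of_areas = sum(per_area.values()) / len(per_area)

# Pooled fraction (each business weighted equally).
pooled = sum(w for w, _ in areas.values()) / sum(t for _, t in areas.values())

for name, frac in per_area.items():
    print(f"{name}: {frac:.1%}")
print(f"mean of areas: {mean_of_areas:.1%}")
print(f"pooled: {pooled:.1%}")
```

Both averages land just under one half (about 47.6% as a mean of areas and 49.5% pooled), which agrees with the slide's claim.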
Alternative: Template Matching
[Slide figure: eight items, each labeled "Tambellini"]
Alternative: Template Matching
• SIFT is not a robust solution.
• Maybe Haar features will work?
• Moving right along…
Moving away from SIFT and revisiting Scene Text Recognition
STR Implementation
• Follows the paper "Automatic Detection and Recognition of Signs From Natural Scenes"
• Stages:
  • Multiresolution-based potential characters detection
  • Character/layout geometry and color properties analysis
  • Refined detection
  • Local affine rectification
Multiresolution-based potential characters detection
• Laplacian-of-Gaussian edge detection
• Dice image/edges into patches
• Combine patches with similar properties into regions
• Obtain bounding box of region as candidate text
• Properties include:
  • Mean
  • Variance
  • Intensity(?)
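A minimal sketch of the first two steps above (Laplacian-of-Gaussian filtering, then dicing the response into patches and measuring mean and variance) might look like the following. It is not the paper's or the presenter's code: it runs at a single resolution, uses only the Python standard library, and the kernel size, patch size, and synthetic test image are all assumptions chosen for illustration. Merging similar patches into regions and taking bounding boxes is left out.

```python
import math
from statistics import mean, pvariance


def log_kernel(sigma, radius):
    """Sample a Laplacian-of-Gaussian kernel and shift it so it sums to zero."""
    kernel = []
    for y in range(-radius, radius + 1):
        row = []
        for x in range(-radius, radius + 1):
            r2 = x * x + y * y
            gauss = math.exp(-r2 / (2 * sigma**2)) / (2 * math.pi * sigma**2)
            row.append((r2 - 2 * sigma**2) / sigma**4 * gauss)
        kernel.append(row)
    # Truncating the kernel leaves a small non-zero sum; remove it so that
    # perfectly flat image areas give a zero response.
    offset = sum(map(sum, kernel)) / (2 * radius + 1) ** 2
    return [[v - offset for v in row] for row in kernel]


def convolve(image, kernel):
    """Filter a 2-D list with a symmetric kernel, replicating border pixels."""
    h, w = len(image), len(image[0])
    r = len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in range(-r, r + 1):
                yy = min(max(y + dy, 0), h - 1)
                for dx in range(-r, r + 1):
                    xx = min(max(x + dx, 0), w - 1)
                    acc += kernel[dy + r][dx + r] * image[yy][xx]
            out[y][x] = acc
    return out


def patch_stats(response, size):
    """Dice a response map into size x size patches: (top, left) -> (mean, variance)."""
    h, w = len(response), len(response[0])
    stats = {}
    for top in range(0, h - size + 1, size):
        for left in range(0, w - size + 1, size):
            values = [response[y][x]
                      for y in range(top, top + size)
                      for x in range(left, left + size)]
            stats[(top, left)] = (mean(values), pvariance(values))
    return stats


# Synthetic 16 x 16 image (an assumption): dark background with one bright
# square standing in for a character stroke in the top-left quadrant.
image = [[255.0 if 2 <= y <= 5 and 2 <= x <= 5 else 0.0 for x in range(16)]
         for y in range(16)]

edges = convolve(image, log_kernel(sigma=1.0, radius=2))
stats = patch_stats(edges, size=8)

for corner, (m, v) in sorted(stats.items()):
    print(corner, f"mean={m:.3f}", f"variance={v:.3f}")
```

In this toy image only the patch containing the square has a non-zero variance in its edge response, which is the kind of property a patch test could key on; the actual qualification criteria are those given in the paper.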
Multiresolution-based potential characters detection Patches qualify if:
Color Properties Analysis
• Implemented a Gaussian Mixture Model (GMM) to obtain μ and σ of the foreground and background for each of R, G, B, H, and I
• Calculated the confidence that each component (R, G, B, H, I) can be used to recognize characters
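To illustrate the GMM step, here is a minimal sketch that fits a two-component, one-dimensional Gaussian mixture to the values of a single channel with EM and reports μ and σ for each component. Everything in it is an assumption made for illustration: the synthetic pixel values, the initialisation, and the function names. The `separation` score at the end is only a hypothetical stand-in for a per-channel confidence, not the formula from the paper.

```python
import math
import random


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and std. deviation sigma."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))


def fit_two_gaussians(values, iterations=50):
    """EM for a 2-component 1-D GMM; returns (weight, mu, sigma) tuples sorted by mu."""
    lo, hi = min(values), max(values)
    spread = (hi - lo) or 1.0
    # Start one component at each end of the value range.
    comps = [[0.5, lo, spread / 4], [0.5, hi, spread / 4]]
    for _ in range(iterations):
        # E-step: responsibility of component 0 for every value.
        resp0 = []
        for v in values:
            p0 = comps[0][0] * normal_pdf(v, comps[0][1], comps[0][2])
            p1 = comps[1][0] * normal_pdf(v, comps[1][1], comps[1][2])
            total = p0 + p1
            resp0.append(p0 / total if total > 0 else 0.5)
        # M-step: re-estimate weight, mean, and std. deviation of each component.
        for k, resp in enumerate((resp0, [1.0 - r for r in resp0])):
            n_k = sum(resp)
            if n_k < 1e-9:
                continue  # component collapsed; keep its previous parameters
            mu = sum(r * v for r, v in zip(resp, values)) / n_k
            var = sum(r * (v - mu) ** 2 for r, v in zip(resp, values)) / n_k
            comps[k] = [n_k / len(values), mu, max(math.sqrt(var), 1e-3)]
    return sorted((tuple(c) for c in comps), key=lambda c: c[1])


# Synthetic intensity samples (an assumption): 300 darker pixels around 60
# and 100 brighter pixels around 200.
random.seed(0)
pixels = ([random.gauss(60, 10) for _ in range(300)]
          + [random.gauss(200, 15) for _ in range(100)])

darker, brighter = fit_two_gaussians(pixels)

# Hypothetical separation score: distance between the means in units of the
# summed standard deviations. Illustrative only, not the paper's confidence.
separation = abs(brighter[1] - darker[1]) / (brighter[2] + darker[2])

print("darker:   weight=%.2f mu=%.1f sigma=%.1f" % darker)
print("brighter: weight=%.2f mu=%.1f sigma=%.1f" % brighter)
print("separation: %.2f" % separation)
```

Running the same fit once per channel (R, G, B, H, I) would give one pair of (μ, σ) estimates per channel. The mixture alone does not say which component is the text and which is the background, so that assignment has to come from elsewhere.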
Evaluation
• The highest confidence was found in Intensity, even though most letters vanish in that channel, whereas in Hue the letters are easily distinguishable
• This suggests text recognition should occur individually per character
• The paper further suggests it needs the patches around the individual characters
• (Whoops)
Next Steps
• Goal: Finish STR by next Friday
• Fix text detector
• Work with Amir over the weekend to implement remaining STR algorithms