OPTICAL CHARACTER RECOGNITION

Presentation Transcript


  1. Whenever righteousness declines, O Bharata, and unrighteousness rises, I manifest Myself. (7) To protect the good, to destroy the wicked, and to re-establish righteousness, I come into being age after age. (8)

  2. OPTICAL CHARACTER RECOGNITION

  3. THE BASICS… The definition is clear from the name itself. “Re”- “Cognition” of “Optical Characters”. This means that the “Optical Characters” have been “Cognized” earlier and we are just doing it again.

  4. As OCR is a form of re-cognition, the standard approach to implementing it is to hold a set of characters in memory and then correlate them with the character that the OCR system is given as input. Current research in this area focuses on two points. The first is the data structure used to store the characters in memory; the second is the algorithm used to correlate the “Optical Character” input with those stored characters.
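
A minimal sketch of the "store characters, then correlate" idea, assuming the stored characters are tiny flattened binary glyphs and the correlation measure is the Pearson coefficient. The 3x3 glyphs and the names `TEMPLATES`, `normalized_correlation` and `recognize` are hypothetical, chosen only for illustration; real systems use richer data structures and matching algorithms.

```python
import math

def normalized_correlation(a, b):
    """Pearson correlation between two equal-length flattened images."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant image carries no shape information
    return cov / math.sqrt(var_a * var_b)

def recognize(image, templates):
    """Return the label of the stored template that correlates best with image."""
    return max(templates, key=lambda label: normalized_correlation(image, templates[label]))

# Hypothetical 3x3 binary glyphs held "in memory", flattened row by row.
TEMPLATES = {
    "I": [0, 1, 0, 0, 1, 0, 0, 1, 0],
    "L": [1, 0, 0, 1, 0, 0, 1, 1, 1],
    "T": [1, 1, 1, 0, 1, 0, 0, 1, 0],
}

noisy_l = [1, 0, 0, 1, 0, 0, 1, 1, 0]  # an "L" with one pixel missing
print(recognize(noisy_l, TEMPLATES))
```

Here the dictionary plays the role of the storage data structure and the correlation function plays the role of the matching algorithm, the two points on which the slide says research is focused.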

  5. NATURE ALWAYS INSPIRES US… We humans ourselves are the best machines for OCR. The visual cortex through its neural networks has done the best job in character recognition. It is no surprise that Convolutional Neural Networks have also been found to give the best results in OCR.

  6. SACCADES The eye scans the target rapidly, in the form of fast movements between certain keypoints. This helps to capture data at a resolution that best captures the variance of the sample. These movements are known as SACCADES.

  7. We depict below a method for OCR aimed at Handwritten Digit Recognition. This method trains neural networks for OCR. In this supervised training we give the neural network various images that either contain or do not contain a specific character. The difference here is that we have not just one neural network but many, one for each resolution of the input images.

  8. With the help of Saccades and foveal vision, we are able to perceive the world at different levels of resolution. We can zero in on certain important points in an image and cut the crap in whatever we see. However, this extreme level of non-uniformity could not be introduced in the method we designed. Hence we have included images at different resolutions, each image having a uniform resolution throughout.

  9. For an input training image I, we give versions of I at different resolutions as input to different neural networks. Taking I to be 1x1, if we want q resolutions along the X-axis and q resolutions along the Y-axis, we get $q^2$ resolutions in total. As an example, 3 different resolutions on each axis mean the 1x1 image gets converted to 0.33x0.33; 0.33x0.67; 0.33x1; 0.67x0.33; 0.67x0.67; 0.67x1; 1x0.33; 1x0.67; 1x1. Thus we get $3^2 = 9$ images as input to 9 different neural networks, each handling a particular resolution.
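
The enumeration of resolutions can be sketched as below, assuming the scale factors along each axis are k/q for k = 1..q, which reproduces the 0.33, 0.67, 1 example. The 28x28 input size and the helper names `resolution_grid` and `scaled_size` are assumptions for illustration; the slides do not state the image size used.

```python
from itertools import product

def resolution_grid(q):
    """Return the q*q (x_scale, y_scale) pairs, with scales k/q for k = 1..q."""
    scales = [k / q for k in range(1, q + 1)]
    return list(product(scales, repeat=2))

def scaled_size(width, height, scale):
    """Pixel size of an image after applying an (x_scale, y_scale) pair."""
    sx, sy = scale
    return max(1, round(width * sx)), max(1, round(height * sy))

grid = resolution_grid(3)  # 3 resolutions per axis, so 3^2 = 9 pairs
for scale in grid:
    # 28x28 is a hypothetical input size, not taken from the slides.
    print(f"{scale[0]:.2f} x {scale[1]:.2f} ->", scaled_size(28, 28, scale))
```

Each pair in `grid` would correspond to one network in the scheme described above.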

  10. Resolution Constellation & Demons • On the right we show the resolutions (red circles) for which we train separate networks. • It can be observed that we have tried to “spread” the resolutions evenly in the square between (0.33,0.33) and (1,1). • When it comes to the Cognitive Demon Theory, we can easily see the analogy. A demon sits at each point in the plane, responding to a different resolution.

  11. Decision Demon? • So, we continue with this framework and assume we have trained the 9 networks separately on different versions of the same training data (7000 images), differing only in resolution. • Now, if we are given an image to test, how do we engineer this architecture to give a final decision? • Here comes the Decision Demon, which looks at the outputs of all 9 demons and models a final output. Here output refers to the set of probabilities generated by each network (10, as we have 10 digits). • Finally, the decision demon uses a Bayesian probability estimation from the 9 sets of probabilities, as follows
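
The transcript does not reproduce the slide's formula, so the following is only one plausible reading of "Bayesian probability estimation", not necessarily the authors' rule. It assumes the networks are conditionally independent given the digit and that the prior over digits is uniform, in which case the combined posterior is the normalized product of the per-network probabilities. The function name `fuse` and the toy numbers (3 networks, 3 classes instead of 9 networks, 10 digits) are hypothetical.

```python
import math

def fuse(prob_sets, eps=1e-12):
    """Combine per-network class probabilities by a normalized product.

    Assumption: networks are conditionally independent given the class and
    the class prior is uniform, so the posterior is proportional to the
    product of the individual probabilities.
    """
    n_classes = len(prob_sets[0])
    # Sum logs instead of multiplying to avoid underflow with many networks.
    log_scores = [
        sum(math.log(max(p[c], eps)) for p in prob_sets)
        for c in range(n_classes)
    ]
    peak = max(log_scores)
    weights = [math.exp(s - peak) for s in log_scores]
    total = sum(weights)
    return [w / total for w in weights]

# Toy outputs from three hypothetical networks over three classes.
outputs = [
    [0.6, 0.3, 0.1],
    [0.5, 0.4, 0.1],
    [0.2, 0.7, 0.1],
]
fused = fuse(outputs)
decision = max(range(len(fused)), key=fused.__getitem__)
print(decision, [round(p, 4) for p in fused])
```

In this toy case two networks prefer class 0, yet the fused decision is class 1 because the third network is far more confident. That is the kind of behaviour a product rule gives and a simple majority vote does not.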

  12. Different Re-sizing • Apart from the normal bicubic interpolation used in resizing functions, we introduce a small change. • We do square weighted resizing, which, as shown to the right, returns a better output (a) after resizing compared to the old (b). • It is clear from the picture that recognizing the distorted “1” is easier for image (a) than for image (b).

  13. Results: nearly a 1% increase • Our method was initially aimed at capturing the “improvement” in performance when we have 9 different networks instead of 1. The results are as follows: • When only 1 network was used, we obtained an accuracy of 96.03%. • When we used our method, the accuracy rose by nearly 1 percentage point, to 96.97%. • The figure to the right shows some of the examples which were misclassified.

  14. Some points • Although this method takes its motivation from Saccades, it is still largely different from what we humans do when it comes to visual understanding. • The architecture of our visual cortex is very feedback oriented, with a clear hierarchy as it moves deeper into the brain, giving the notion that the level of understanding increases manifold with each level. • It is this internal representation of things in each sensory structure that still leaves neuroscientists clueless to this day.

  15. Thank you.
