
Optical Character Recognition

Lecture 8: Optical Character Recognition. Qurat-ul-Ain (Ainie) Akram and Sarmad Hussain, Center for Language Engineering, Al-Khawarizmi Institute of Computer Science, University of Engineering and Technology, Lahore, Pakistan. Syllable string creation using a lookup table.


Presentation Transcript


  1. Lecture 8: Optical Character Recognition. Qurat-ul-Ain (Ainie) Akram, Sarmad Hussain. Center for Language Engineering, Al-Khawarizmi Institute of Computer Science, University of Engineering and Technology, Lahore, Pakistan

  2. Syllable String Creation using lookup table ISSALE 2014

  3. Project Presentation
     • Front Page
     • Optical Character Recognition (in English)
     • Optical Character Recognition (in Your Language)
     • Document Image
     • Output of OCR (Recognized Syllable Strings of OCR)
     • Syllable String Recognition Accuracy (correctly recognized syllables / total syllables × 100)
     • Group Members' Names
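
The accuracy figure asked for on slide 3 is a percentage of syllables recognized correctly. A minimal sketch, assuming the recognized and ground-truth syllables are already aligned position by position (the slides do not say how alignment is done); the function and variable names are hypothetical.

```python
def syllable_accuracy(recognized, ground_truth):
    """Return the percentage of ground-truth syllables recognized correctly.

    Assumption: both lists are aligned position by position, so the
    i-th recognized syllable is compared with the i-th expected one.
    """
    if not ground_truth:
        raise ValueError("ground_truth must not be empty")
    # Count positions where the OCR output matches the expected syllable.
    correct = sum(1 for r, g in zip(recognized, ground_truth) if r == g)
    return correct / len(ground_truth) * 100
```

For example, if three of four syllables match, the function returns 75.0. Insertions or deletions in the OCR output would break the positional alignment, so a real evaluation may need an edit-distance alignment first.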

  4. Pre-processing
     • Line Segmentation
       • Samples of line segmentation
       • Line segmentation accuracy results
       • Samples of incorrect line segmentation
     • Syllable/Ligature Segmentation
       • Samples of Syllable/Ligature segmentation
       • Syllable/Ligature segmentation accuracy results
       • Samples of incorrect Syllable/Ligature segmentation
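
Slide 4 does not say which line segmentation method is expected. One common approach is a horizontal projection profile: a text line is a maximal run of image rows that contain ink, and rows with no ink separate lines. The sketch below assumes a binarized image given as a list of rows of 0 (background) and 1 (ink); the function name is hypothetical.

```python
def segment_lines(image):
    """Split a binary image into text lines using row-wise ink presence.

    `image` is a list of rows; each row is a list of 0 (background) or 1 (ink).
    Returns a list of (start_row, end_row) pairs, with end_row exclusive.
    """
    lines = []
    start = None  # first row of the line currently being scanned, if any
    for y, row in enumerate(image):
        has_ink = any(row)
        if has_ink and start is None:
            start = y                  # a new text line begins here
        elif not has_ink and start is not None:
            lines.append((start, y))   # a blank row closes the current line
            start = None
    if start is not None:
        lines.append((start, len(image)))  # line runs to the bottom edge
    return lines
```

The same idea applied column-wise within a line gives a rough syllable/ligature segmentation, although scripts whose diacritics sit above or below the main body, or whose lines overlap vertically, usually need more than a plain projection.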

  5. Pre-processing • Main body and diacritics disambiguation

  6. Classification and Recognition
     • Data Description
       • 15 Main Body Types (DataSet-1)
         • Training Data (35 Tokens)
         • Testing Data (15 Tokens)
         • Image samples
       • Document Images (DataSet-2)
         • Testing Data
           • X Tokens of Y main body Types
           • X Tokens of Y diacritics Types
         • Image sample

  7. Classification and Recognition Results
     • Recognition Results on DataSet-1 using Decision Trees
       • Main body recognition accuracy
       • Diacritics recognition accuracy
     • Recognition Results on DataSet-1 using Tesseract
       • Main body recognition accuracy
       • Diacritics recognition accuracy

  8. Classification and Recognition Results
     • Recognition Results on DataSet-2 using Decision Trees
       • Main body recognition accuracy
       • Diacritics recognition accuracy
     OR
     • Recognition Results on DataSet-2 using Tesseract
       • Main body recognition accuracy
       • Diacritics recognition accuracy

  9. Post-processing • Syllable String Creation • Syllable String Recognition Accuracy
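
Syllable string creation with a lookup table can be sketched as follows. This is an illustration under assumptions, not the course's reference code: it assumes the recognizer emits one class label for each main body plus a list of diacritic labels, and that a table maps each combination to an output string. The labels (`MB1`, `D1`) and the Latin output strings are placeholders invented for the example.

```python
# Hypothetical lookup table: (main body label, sorted diacritic labels) -> syllable.
LOOKUP = {
    ("MB1", ()): "ka",
    ("MB1", ("D1",)): "ki",
    ("MB2", ()): "ga",
    ("MB2", ("D1", "D2")): "gu",
}

def make_syllable_string(components, table=LOOKUP, unknown="?"):
    """Turn recognized (main_body, diacritics) pairs into one output string.

    Diacritic labels are sorted so that recognition order does not matter.
    Combinations missing from the table are written as `unknown`.
    """
    out = []
    for main_body, diacritics in components:
        key = (main_body, tuple(sorted(diacritics)))
        out.append(table.get(key, unknown))
    return " ".join(out)
```

Calling `make_syllable_string([("MB1", []), ("MB2", ["D2", "D1"]), ("MB3", [])])` yields `"ka gu ?"`. Emitting a visible marker for unknown combinations keeps recognition failures countable when the syllable string accuracy is computed.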

  10. Output of OCR • Input Document Image • OCR Output

  11. Deliverables to Submit
     • Presentation slides
     • OCR Complete Code
       • Line segmentation
       • Syllable segmentation
       • Recognition of diacritics and main bodies
       • Syllable string creation using lookup table
       • Output.txt file generation
     • Data Set-1
     • Data Set-2
     • Tesseract traineddata file

  12. Good Luck 

  13. Document Image Creation
     • Syllable_of_MB1_Samples_1 Syllable_of_MB2_Samples_1 Syllable_of_MB3_Samples_1 Syllable_of_MB4_Samples_1 Syllable_of_MB5_Samples_1 ... Syllable_of_MB15_Samples_1
     • Syllable_of_MB1_Samples_2 Syllable_of_MB2_Samples_2 Syllable_of_MB3_Samples_2 Syllable_of_MB4_Samples_2 Syllable_of_MB5_Samples_2 ... Syllable_of_MB15_Samples_2
     • Syllable_of_MB1_Samples_3 Syllable_of_MB2_Samples_3 Syllable_of_MB3_Samples_3 Syllable_of_MB4_Samples_3 Syllable_of_MB5_Samples_3 ... Syllable_of_MB15_Samples_3
     • Syllable_of_MB1_Samples_4 Syllable_of_MB2_Samples_4 Syllable_of_MB3_Samples_4 Syllable_of_MB4_Samples_4 Syllable_of_MB5_Samples_4 ... Syllable_of_MB15_Samples_4
     • ...
     • Syllable_of_MB1_Samples_15 Syllable_of_MB2_Samples_15 Syllable_of_MB3_Samples_15 Syllable_of_MB4_Samples_15 Syllable_of_MB5_Samples_15 ... Syllable_of_MB15_Samples_15
     Syllable = MB + Diacritics, or Syllable = MB
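
The layout on slide 13 reads as a grid in which each row holds one sample of a syllable for each main body (MB) type, and the sample index increases from row to row. A small sketch that generates the placeholder names for such a grid, assuming 15 main body types and 15 rows as the slide's numbering suggests; the function name is hypothetical.

```python
def document_layout(num_main_bodies=15, num_samples=15):
    """Return the document layout as a list of rows of placeholder names.

    Row s (1-based) contains sample s of a syllable built on each main body,
    in main body order MB1..MBn.
    """
    return [
        [f"Syllable_of_MB{mb}_Samples_{s}" for mb in range(1, num_main_bodies + 1)]
        for s in range(1, num_samples + 1)
    ]
```

Each placeholder would be replaced by a rendered syllable image (a main body alone, or a main body with diacritics) when the actual document image is composed.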

  14. Examples of Document Image
