Design of an OCR System and its Hardware Implementation

Authors: Gulfeshan Parween, Dr. Satadal Saha

DOI Link: https://doi.org/10.22214/ijraset.2021.39217

Abstract

In this paper, we present a scheme for developing a complete OCR system for printed uppercase English letters in different fonts and sizes, so that the system can be used in banking, corporate, legal and other industries. An OCR system consists of several modules: preprocessing, segmentation, feature extraction and recognition. The preprocessing step is expected to include gray-level conversion, binary conversion, etc. After the features of the segmented characters have been extracted, an artificial neural network can be used for character recognition. Efforts have been made to improve the performance of character recognition using artificial neural network techniques. The proposed OCR system accepts printed document images from a file and is implemented in MATLAB R2014a.

I. INTRODUCTION

In today's world of information, countless forms, reports, letters, and contracts are generated every day; hence, the need to retrieve, archive, update and distribute printed documents has become increasingly important [2, 4]. A technology that automates these tasks on computer media is optical character recognition (OCR), which transforms printed documents into ASCII characters that a computer can recognize, enabling compact storage, editing, fast retrieval, and other file manipulations. Optical character recognition systems are useful for automatically reading the contents of a document for storage in computer memory. OCR is the base for many different types of applications in various fields, many of which we use in our daily lives. Because it is cost effective and saves time, businesses, post offices, banks, security systems, and even the field of robotics employ it as the base of their operations. The document image itself can be either a scanned machine-printed image or an image captured by a camera or a mobile phone. A computer system equipped with such an OCR system can improve the speed of input operations and reduce possible human errors. Recognition of printed characters is itself a challenging problem, since the same character varies with changes of font and with the introduction of different types of noise. Differences in the shape and size of images make the recognition task difficult if preprocessing, feature extraction and recognition are not robust. Noise pixels may also be introduced when the image is scanned. Therefore, a good character recognition approach must eliminate the noise after reading the binary image data, smooth the image for better recognition, extract features efficiently, train the system and classify patterns.

II. LITERATURE REVIEW

Paper [1] surveys the research and development of OCR systems from a historical point of view. It is mainly divided into parts covering the research and development of OCR; the R&D part mainly includes template matching and structure analysis. The paper also comments on more recent techniques applied to OCR, such as expert systems and neural networks, and deals with the template matching algorithm and its compatibility. The main theme of paper [2] is a new feature extraction method for online Arabic character recognition. A handwritten Arabic character recognition system cannot be successful without suitable feature extraction methods. That work proposes a hybrid of Edge Direction Matrices and geometrical feature extraction for an on-line handwritten Arabic character recognition system. In addition, horizontal and vertical projection profiles and a Laplacian filter are used in the preprocessing phase. The paper reports that the proposed method gives the best recognition rate for the character category. Paper [3] describes a character recognition methodology that achieves high speed and accuracy by using a multiresolution and hierarchical feature space. Features at different resolutions are implemented by means of a recursive classification scheme. Typically, recognizers have to balance the use of features at many resolutions (which yields high accuracy) against the burden on computational resources in terms of storage space and processing time. The paper presents a method that adaptively determines the degree of resolution necessary to classify an input pattern, which leads to optimal use of computational resources. The hierarchical OCR dynamically adapts to factors such as the quality of the input pattern, its intrinsic similarities to and differences from patterns of the other classes it is being compared against, and the processing time available. A recognition rate of about 96% is achieved by the hierarchical OCR.

In [4], a method for image segmentation of printed documents is presented. Segmentation is typically used to trace objects and boundaries, such as lines and curves, in an image. Reliable segmentation of the text is necessary to perform classification and recognition. The main aim of segmentation is to partition the document image into various homogeneous regions such as text blocks, image blocks, lines and words. That paper introduces a clustering-based neighbor method and a direction-based line segmentation method for image segmentation, and reports the segmentation of document images using various algorithms. In [5], the authors present a simple method using a self-organizing map neural network (SOM NN) that can be used for character recognition tasks. It describes the results of training a SOM NN to perform optical character recognition on images of printed characters. 49 features are used to distinguish between 62 characters (uppercase and lowercase letters of the English language and numerals). The implemented program recognizes text by analyzing an image file. The text to be recognized is currently limited to characters typed in the Verdana font, in bold, with a font size of 18. The program is capable of handling non-ideal images (noisy, colored text, rotated image). Recognition accuracy is consistently 100% for ideal consisted of three layers with 680 input and 26 output images, but ranges between 80% - 100% for non-ideal images. In [6], neural networks are used for character recognition. That paper describes the creation of a character recognition system, in which creating a character matrix and a corresponding suitable network structure is key. The feed-forward algorithm gives insight into the inner workings of a neural network, followed by the back-propagation algorithm, which comprises training, calculating error, and modifying weights. The paper attempts to recognize handwritten English characters using a multilayer perceptron with one hidden layer.

III. OPTICAL CHARACTER RECOGNITION

A. Overview of an OCR System

Optical Character Recognition deals with the problem of recognizing optically processed characters. Optical recognition is performed off-line after the writing or printing has been completed, as opposed to on-line recognition, where the computer recognizes the characters as they are drawn. Both hand-printed and printed characters may be recognized, but the performance is directly dependent upon the quality of the input documents. The more constrained the input is, the better the performance of the OCR system will be. When it comes to totally unconstrained handwriting, OCR machines are still a long way from reading as well as humans. However, the computer reads fast, and technical advances are continually bringing the technology closer to its ideal. OCR systems may be subdivided into two classes. The first class includes the special-purpose machines dedicated to specific recognition problems. The second class covers the systems that are based on PC software and a low-cost scanner.

IV. DATABASE DEVELOPMENT

In this paper, the 26 uppercase English letters, 'A' to 'Z', are recognized by the OCR system. The letters are of different sizes and different fonts. The respondents collected for this study came from various fields. For the database we have taken the 26 uppercase English letters in different font sizes, such as 20, 36 and 14, and in different fonts, such as Verdana, Century, Times Roman and Constantia.

A. Experimental Database

1. Creation of templates.
2. Characters 'A' to 'Z'.
3. These databases are created for 10 different fonts and 5 different sizes.
4. Total number of samples: 26*5*10 = 1300 character samples.
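The sample count quoted above can be checked by enumerating every (letter, size, font) combination. The sketch below is only illustrative: the paper does not list all 10 fonts and 5 sizes, so the labels `font_1` ... `font_10` and `size_1` ... `size_5` are hypothetical placeholders.

```python
from itertools import product
from string import ascii_uppercase

# Hypothetical placeholders: the paper uses 10 fonts and 5 sizes but does not
# list all of them, so generic labels stand in for the real names and sizes.
fonts = [f"font_{i}" for i in range(1, 11)]
sizes = [f"size_{j}" for j in range(1, 6)]

# One template per (letter, size, font) combination.
templates = list(product(ascii_uppercase, sizes, fonts))

print(len(templates))  # 26 * 5 * 10 = 1300
```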

B. Database Templates

1. Database 1

2. Database 2

3. Database 3

4. Database 4

5. Database 5

6. Database 6

7. Database 7

8. Database 8

9. Database 9

10. Database 10

V. DESIGN AND IMPLEMENTATION OF PROPOSED OCR SYSTEM

A. Proposed Design of an OCR System

An OCR system typically consists of the following processing steps. Survey papers [2], [3], [13], the book [15], and evaluation studies cover most of these subtasks.

Pre-processing
Segmentation
Feature Extraction
Character Recognition
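As an illustration of the first step in the list above, the following is a minimal sketch of pre-processing (gray-level conversion, binary conversion and cropping) in plain Python. It is not the authors' MATLAB code: the function names, the fixed threshold of 128 and the nested-list image representation are assumptions made for the example, and the luminance weights are the common ITU-R BT.601 values.

```python
def to_gray(rgb):
    """Convert rows of (R, G, B) pixels to gray levels in 0..255."""
    # Standard BT.601 luminance weights; an assumption, not taken from the paper.
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
            for row in rgb]


def to_binary(gray, threshold=128):
    """Mark dark (ink) pixels as 1 and background pixels as 0.

    The fixed threshold is an assumed value chosen for illustration.
    """
    return [[1 if p < threshold else 0 for p in row] for row in gray]


def crop(binary):
    """Crop the binary image to the bounding box of its ink pixels."""
    rows = [i for i, row in enumerate(binary) if any(row)]
    if not rows:
        return []  # blank image: nothing to crop
    cols = [j for j in range(len(binary[0])) if any(row[j] for row in binary)]
    return [row[cols[0]:cols[-1] + 1] for row in binary[rows[0]:rows[-1] + 1]]


W, B = (255, 255, 255), (0, 0, 0)  # white background, black ink
image = [
    [W, W, W, W],
    [W, B, B, W],
    [W, B, W, W],
    [W, W, W, W],
]
cropped = crop(to_binary(to_gray(image)))
print(cropped)  # [[1, 1], [1, 0]]
```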

VI. EXPERIMENTAL RESULTS AND DISCUSSIONS

A. Overview of the Experimental Processing Steps of OCR and Results

An implementation and experiment were carried out to measure the performance of the algorithms used for segmentation (connected component analysis, CCA) and feature extraction (Quad Tree or Quin Tree). The experimental results show that the approach performs well in terms of accuracy, time consumed and simplicity of the algorithm. The results below show the different processing steps applied to an image of the printed English letter 'A'.

1. Original Image to Gray Level Conversion

2. Gray Level to Binary Conversion

3. Binary Image to Image Cropping

4. Connected Component Labeling of an Image
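The connected component labeling step listed above can be sketched as follows. This is a minimal breadth-first flood fill with 8-connectivity written for illustration, not the authors' implementation; the function name `label_components` and the nested-list image format are assumptions.

```python
from collections import deque


def label_components(binary):
    """Label 8-connected regions of 1-pixels.

    Returns (labels, count), where labels has the same shape as the input
    and holds 0 for background or a component number starting at 1.
    """
    h = len(binary)
    w = len(binary[0]) if h else 0
    labels = [[0] * w for _ in range(h)]
    count = 0
    for r in range(h):
        for c in range(w):
            if binary[r][c] != 1 or labels[r][c] != 0:
                continue  # background, or already assigned to a component
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                # Visit all 8 neighbours (horizontal, vertical and diagonal).
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and binary[ny][nx] == 1
                                and labels[ny][nx] == 0):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count


image = [
    [1, 0, 0, 1],
    [0, 1, 0, 1],
    [0, 0, 0, 0],
    [1, 0, 0, 0],
]
labels, count = label_components(image)
print(count)  # 3
```

With 8-connectivity the two diagonal pixels in the top-left corner form one component; with 4-connectivity they would be counted separately.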

B. Character Recognition using NPRTOOL in MATLAB

1. Data Set: The data set consists of 260 images of uppercase English letters in various fonts, such as Calibri, Verdana, Constantia, Times Roman, Century, Georgia and Cambria, and in different font sizes. These were divided into training, validation and testing sets. The training set consists of 70% of the data, i.e. 182 of the 260 samples. These are presented to the network during training, and the network is adjusted according to its error. The validation set consists of 15% of the data, i.e. 39 of the 260 samples. These are used to measure network generalization and to halt training when generalization stops improving. The testing set consists of the remaining 15% of the data, i.e. 39 samples. These have no effect on training and so provide an independent measure of network performance during and after training.
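The 70/15/15 division described above can be reproduced with a short sketch. The helper `split_indices` and the fixed random seed are assumptions for illustration; the paper relies on the division performed by nprtool and does not describe how samples are shuffled.

```python
import random


def split_indices(n_samples, train_pct=70, val_pct=15, seed=0):
    """Shuffle sample indices and split them into train/validation/test."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seed is an arbitrary choice
    # Integer arithmetic avoids floating-point rounding in the set sizes.
    n_train = n_samples * train_pct // 100
    n_val = n_samples * val_pct // 100
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # whatever remains goes to testing
    return train, val, test


train, val, test = split_indices(260)
print(len(train), len(val), len(test))  # 182 39 39
```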

2. Network Training Phase: The training phase consists of computing the 5-element feature vector for each of the 260 images of the training set. The network is trained using scaled conjugate gradient backpropagation learning. Training stops automatically when generalization stops improving.
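The paper does not spell out how the 5-element feature vector is built. One plausible reading of a quad-tree style feature, shown here purely as an assumption, is the ink density of the whole character image followed by the densities of its four quadrants, which yields exactly five values. The function names below are hypothetical.

```python
def density(block):
    """Fraction of pixels in a binary block that are ink (value 1)."""
    total = sum(len(row) for row in block)
    return sum(map(sum, block)) / total if total else 0.0


def quadrant_features(binary):
    """Return [whole, top-left, top-right, bottom-left, bottom-right] densities.

    Assumed feature layout; the paper only states that the vector has 5 elements.
    """
    mid_row, mid_col = len(binary) // 2, len(binary[0]) // 2
    quadrants = [
        [row[:mid_col] for row in binary[:mid_row]],  # top-left
        [row[mid_col:] for row in binary[:mid_row]],  # top-right
        [row[:mid_col] for row in binary[mid_row:]],  # bottom-left
        [row[mid_col:] for row in binary[mid_row:]],  # bottom-right
    ]
    return [density(binary)] + [density(q) for q in quadrants]


glyph = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 1],
]
print(quadrant_features(glyph))  # [0.4375, 1.0, 0.0, 0.0, 0.75]
```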

3. Classification: This is done using an MLP (multi-layer perceptron) [5]. The MLP has 5 inputs, which receive the 5-element feature vector of each character, and 26 outputs for discriminating between the characters. The activation transfer functions are of log-sigmoid type. The performance plot, error histogram and ROC (Receiver Operating Characteristic) plot obtained are shown in the figures.
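The forward pass of such a network can be sketched in a few lines. This is an untrained illustration, not the network produced by nprtool: the hidden layer size of 10 and the random weights are assumptions (the paper gives only the 5 inputs, 26 outputs and log-sigmoid activation), so the predicted letter is meaningless. The input vector is the sample input quoted for the Simulink simulation later in the paper.

```python
import math
import random
from string import ascii_uppercase


def logsig(x):
    """Log-sigmoid activation: 1 / (1 + e^-x)."""
    return 1.0 / (1.0 + math.exp(-x))


def layer(inputs, weights, biases):
    """Fully connected layer followed by the log-sigmoid activation."""
    return [logsig(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def init_layer(n_out, n_in, rng):
    """Random weights and biases in [-1, 1]; stands in for trained values."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.uniform(-1, 1) for _ in range(n_out)]
    return weights, biases


N_INPUTS, N_HIDDEN, N_OUTPUTS = 5, 10, 26  # N_HIDDEN is an assumed value
rng = random.Random(0)
w1, b1 = init_layer(N_HIDDEN, N_INPUTS, rng)
w2, b2 = init_layer(N_OUTPUTS, N_HIDDEN, rng)

features = [0.9, 0.57, 0.85, 0.74, 0.59]
hidden = layer(features, w1, b1)
scores = layer(hidden, w2, b2)

# The output unit with the highest score selects the letter.
predicted = ascii_uppercase[scores.index(max(scores))]
print(predicted)
```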

VII. EXPERIMENTAL RESULTS OBTAINED FOR ANN

1. Performance Plot: The performance plot indicates the iteration at which the validation performance reached a minimum. The validation and test curves in the plot are very similar to each other, which suggests that there is no major problem with the training, such as overfitting. The best validation performance, 0.066043, is obtained at epoch 35, as shown in the figure.
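Locating the best validation epoch, and stopping once validation error no longer improves, can be sketched as below. The error values are made up for the example and are not the paper's results; the patience of 6 consecutive non-improving checks is an assumed setting.

```python
def best_epoch(val_errors, patience=6):
    """Return (epoch, error) of the validation minimum with early stopping.

    Scanning stops once the error has failed to improve for `patience`
    consecutive epochs, so later values are never inspected.
    """
    best_idx, best_err, fails = 0, float("inf"), 0
    for epoch, err in enumerate(val_errors):
        if err < best_err:
            best_idx, best_err, fails = epoch, err, 0
        else:
            fails += 1
            if fails >= patience:
                break
    return best_idx, best_err


# Hypothetical validation curve, for illustration only.
errors = [0.5, 0.3, 0.2, 0.1, 0.12, 0.11, 0.13, 0.15, 0.14, 0.16, 0.05]
print(best_epoch(errors))  # (3, 0.1)
```

In this made-up curve the final value 0.05 is never reached, because training has already stopped after six checks without improvement.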

2. Training State Plot and Error Histogram Plot: The training state plot obtained with nprtool in MATLAB shows the progress of other training variables, such as the gradient magnitude and the number of validation checks. The error histogram plot shown in the figure shows the distribution of the network error.

3. ROC Plot: An ROC (Receiver Operating Characteristic) plot shows the operating points of a classifier, illustrating the trade-off between its TP (True Positive) rate and FP (False Positive) rate. It is used to check the quality of the classifier. The true positive rate is the percentage of target samples that are correctly classified, while the false positive rate is the percentage of non-target samples that are incorrectly classified. The ROC plot obtained indicates that the ANN classifier is accurate, as the curves mostly hug the left and top edges. There were a few errors because the training data set is small; increasing the size of the data set is expected to make the output more accurate. The figure below shows the ROC plot for the training, validation and testing data sets.
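The true positive and false positive rates defined above can be computed for one class (one-vs-rest) as in the following sketch. The scores and targets are invented for illustration, and the function names are assumptions; both classes are assumed to be present so that neither rate divides by zero.

```python
def roc_point(scores, targets, threshold):
    """Return (FPR, TPR) when samples scoring >= threshold are called positive."""
    tp = sum(1 for s, t in zip(scores, targets) if t == 1 and s >= threshold)
    fp = sum(1 for s, t in zip(scores, targets) if t == 0 and s >= threshold)
    positives = sum(targets)
    negatives = len(targets) - positives
    return fp / negatives, tp / positives


def roc_curve(scores, targets):
    """ROC operating points, from the strictest threshold to the loosest."""
    thresholds = sorted(set(scores), reverse=True)
    return [(0.0, 0.0)] + [roc_point(scores, targets, t) for t in thresholds]


# Hypothetical network outputs for one letter and the true membership labels.
scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
targets = [1, 1, 1, 0, 0, 0]

for fpr, tpr in roc_curve(scores, targets):
    print(f"FPR={fpr:.2f}  TPR={tpr:.2f}")
```

A curve that reaches a high TPR while the FPR is still near zero is the one that hugs the left and top edges of the plot.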

a. Neural Network Simulink Diagram

b. Snapshot of Simulation Diagram Obtained Using MATLAB: If we apply the input Input1: [0.9;0.57;0.85;0.74;0.59] with a sampling time of '1', the output y1 is as shown in the simulation diagram. Likewise, changing the input yields the corresponding output.

4. Experimental Result obtained for Hardware Implementation of ANN Sigmoid Function

a. Technology Schematic of Sigmoid Function of ANN

b. HDL Synthesis Report Macro Statistics

# Multipliers	: 1
9x9-bit multiplier	: 1
# Adders/Subtractors	: 1
9-bit adder	: 1
# Latches	: 1
9-bit latch	: 1
# Comparators	: 2
9-bit comparator greater	: 1
9-bit comparator less	: 1
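The macro statistics above (one multiplier, one adder and two comparators) are consistent with a piecewise-linear approximation of the sigmoid: saturate above an upper bound, saturate below a lower bound, and use a straight line in between. The paper does not state the approximation it uses, so the sketch below is only an assumption: the breakpoints at -4 and +4, the slope of 1/8, and the 9-bit fixed-point format with 5 fractional bits are all hypothetical choices.

```python
import math

FRAC_BITS = 5            # assumed number of fractional bits
SCALE = 1 << FRAC_BITS   # 32 steps per unit


def to_fixed(x):
    """Quantize a real value to a signed 9-bit fixed-point integer."""
    return max(-256, min(255, round(x * SCALE)))


UPPER = to_fixed(4.0)    # above this the output saturates at 1
LOWER = to_fixed(-4.0)   # below this the output saturates at 0
SLOPE = to_fixed(0.125)  # slope of the linear middle segment
HALF = to_fixed(0.5)     # offset of the linear middle segment
ONE = to_fixed(1.0)


def sigmoid_pwl(x_fixed):
    """Piecewise-linear sigmoid on fixed-point input, fixed-point output."""
    if x_fixed > UPPER:      # "comparator greater"
        return ONE
    if x_fixed < LOWER:      # "comparator less"
        return 0
    # One multiply and one add; the shift rescales the product.
    return ((SLOPE * x_fixed) >> FRAC_BITS) + HALF


def sigmoid_exact(x):
    return 1.0 / (1.0 + math.exp(-x))


# Worst-case error of the approximation over the range -8 to 7.9.
max_error = max(
    abs(sigmoid_pwl(to_fixed(i / 10)) / SCALE - sigmoid_exact(i / 10))
    for i in range(-80, 80)
)
print(sigmoid_pwl(to_fixed(0.0)) / SCALE)  # 0.5
print(sigmoid_pwl(to_fixed(2.0)) / SCALE)  # 0.75
```

A single straight segment is coarse (the worst-case error here stays below 0.2); finer approximations use more segments at the cost of more comparators.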

c. Devices Used For Hardware Implementation

VIII. RESULT OF FPGA IMPLEMENTATION OF SIGMOID FUNCTION

Specification of the FPGA Spartan 3E kit: The Spartan®-3E FPGA Starter Kit is a complete development board solution giving designers instant access to the complete platform capabilities of the Spartan-3E family. The complete kit includes the board, power supply, evaluation software, resource CD (application notes, white papers, data sheets, etc.), and USB cable. The Spartan 3E Starter Board provides a powerful and highly advanced self-contained development platform for designs targeting the Spartan 3E FPGA from Xilinx. It features a 500K gate Spartan 3E FPGA with a 32 bit RISC processor and DDR interfaces.

A. Constituents of the FPGA Kit

Development board
Universal power supply 100-240V, 50/60 Hz
ISE® WebPACK™ software, ISE Foundation™ software evaluation, and the Embedded Development Kit (EDK)
Handbook: Introduction to Programmable Logic Design Quick Start
Starter Kit resource CD
USB cable

B. Key Features of Spartan 3E

Spartan-3E FPGA (XC3S500E-4FG320C)
CoolRunner™-II CPLD (XC2C64A-5VQ44C)
Platform Flash (XCF04S-VO20C)
Clocks: 50 MHz crystal clock oscillator
Memory:
128 Mbit Parallel Flash
16 Mbit SPI Flash
64 MByte DDR SDRAM
Connectors and Interfaces:
Ethernet 10/100 PHY
JTAG USB download
Two 9-pin RS-232 serial ports
PS/2- style mouse/keyboard port, rotary encoder with push button
Four slide switches
Eight individual LED outputs
Four momentary-contact push buttons
100-Pin expansion connection ports
Three 6-pin expansion connectors
Display: 16 character - 2 Line LCD

Conclusion

The work has three main phases. In the first phase, preprocessing of the image was done using common preprocessing methods. In the second phase, segmentation, we used the connected component algorithm (CCA); this approach promotes speed, accuracy and simplicity. We have shown code and examples for 8-connectivity CCA only. In the feature extraction phase we used the Quin Tree method, which is very simple and has high accuracy. The most important part, recognition of the characters, is done using an ANN in MATLAB. The second part of the project is the hardware implementation of the ANN activation function, i.e. the sigmoid function. The results obtained indicate that the proposed method produces a high accuracy rate. The paper also presents a comparison of the Simulink behavior of the software and of the hardware implementation using an FPGA. Thus this paper presents an overview of the design phases of an OCR system, the different methodologies used for best performance, and its applications in different fields.

References

[1] "Historical Review of OCR Research and Development" by S. Mori, Member IEEE, Ching Y. Suen, Fellow IEEE, Proceedings of the IEEE, Vol. 80, No. 7, July 1992.
[2] "Geometrical-matrix feature extraction for on-line handwritten characters recognition" by Saad M. Ismail, Siti Norul Huda Sheikh Abdullah, Journal of Theoretical and Applied Information Technology, Vol. 49, No. 1, 10th March 2013.
[3] "Optical Character Recognition" by Ravina Mithe, Supriya Indalkar, Nilam Divekar, International Journal of Recent Technology and Engineering (IJRTE), ISSN: 2277-3878, Volume-2, Issue-1, March 2013.
[4] "Text Detection From Documented Image Using Image Segmentation" by Santosh, Dr. Jenila Livingston L.M., Research Scholar at VIT, Chennai, International Journal of Technology Enhancements and Emerging Engineering Research, Vol. 1, Issue 4, ISSN 2347-4289, 2013.
[5] "Optical Character Recognition Program for Images of Printed Text using a Neural Network" by Velappa Ganapathy, Charles C. H. Lean, School of Engineering, Monash University Malaysia.
[6] "Handwritten English Character Recognition using Neural Network" by Vijay Patil and Sanjay Shimpi, Department of Computer Engineering, Vidyalankar Institute of Technology, Wadala, Mumbai, International Journal of Computer Science & Communication, Vol. 1, No. 2, July-December 2010.
[7] "Document Analysis and Recognition (ICDAR)" by Peng Ye, Language & Media Process Lab, Univ. of Maryland, USA, 12th International Conference on, 2013.
[8] "A Detailed Review of Feature Extraction in Image Processing Systems" by Kumar, G., Bhatia, P.K., Fourth International Conference, Publication Year: 2014.
[9] "Transactions on Pattern Analysis and Machine Intelligence" by Venu Govindaraju, Senior Member, IEEE, and Sargur N. Srihari, Fellow, IEEE, Vol. 22, No. 4, April 2000.
[10] "Segmentation of Touching Character in Printed Devnagari and Bangla Script Using Fuzzy Multi factorial Analysis" by Utpal Garain and Bidyut B. Chaudhary, IEEE Transaction on System, Man and Cybernetics - Part C: Applications and Reviews, 32, November 2002.
[11] "Object Recognition System using Template Matching Based on Signature and Principal Component Analysis" by Inad A. Aljarrah, Ahmed S Goraib & Ismail M. Akhter, IJDIWC, 2012.
[12] "OCR Error Detection and Correction of an Inflectional Indian Language Script" by B. B. Chaudhary and U. Pal, IEEE Proceeding of 13th International Conference on 25-29 Aug., 3, 1996.
[13] "A brief review and survey of feature extraction methods for Devnagari OCR" by Holambe A.N., Thool, R.C., Jagade, S.M., ICT and Knowledge Engineering (ICT & Knowledge Engineering), 2011 9th International Conference, Digital Object Identifier: 10.1109/ICTKE.2012.6152421, Publication Year: 2012.
[14] "Script Identification from Indian Documents" by G.D. Joshi, S. Garg and J. Sivaswamy, Proc. IAPR Workshop Document Analysis Systems, Feb. 2006.
[15] "A Devnagari OCR and A Brief Overview of OCR for Indian Script" by Veena Bansal and R.M.K. Sinha, Proc. Symposium on Transaction Support System (STRANS 2001), Feb. 15-17, 2001, Kanpur, India.
[16] "Digital Image Processing using MATLAB" by Rafael C. Gonzalez, Richard E. Woods, Steven L. Eddins.
[17] "Optical Character Recognition" by Line Eikvil, December 1993.
[18] "Sigmoid Function Approximation for ANN Implementation in FPGA Devices" by Djalal Eddine Khodja, Aissa Kheldoun, and Larbi Refoufi.
[19] "Character Recognition System" by Mohamed Cheriet, Nawwaf Kharma, Cheng-Lin Liu and Ching, John Wiley & Sons, Inc., Hoboken, New Jersey, ISBN 978-0-471-41570-1, 2007.
[20] "Neural Network Implementation Using FPGAs" by Dhirajkumar S. Jinde et al., (IJCSIT) International Journal of Computer Science and Information Tech., 2015.

Copyright

Copyright © 2022 Gulfeshan Parween, Dr. Satadal Saha. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Paper Id : IJRASET39217

Publish Date : 2021-12-02

ISSN : 2321-9653

Publisher Name : IJRASET
