Handwritten Character Recognition

Authors: Ghanshyam Wadaskar, Vipin Bopanwar, Prayojita Urade, Shravani Upganlawar

DOI Link: https://doi.org/10.22214/ijraset.2023.57366

Abstract

Handwritten character recognition is an active topic in the field of artificial intelligence. It involves developing algorithms and models that analyze the shapes, lines, and curves of handwritten characters, such as letters, numbers, or symbols, and convert them accurately into digital form so that the text is easier to process and understand. The task is complex, but advances in machine learning and deep learning techniques have brought significant progress. The technology has various applications, such as digitizing handwritten documents, assisting in automatic form filling, and enabling handwriting-based input on devices like tablets and smartphones. The field combines computer vision, pattern recognition, and artificial intelligence.


I. INTRODUCTION

Handwritten character recognition (HCR) represents a fundamental area of study within the broader field of pattern recognition and artificial intelligence. It involves the automated identification and interpretation of handwritten characters or symbols from various sources, ranging from historical manuscripts to contemporary handwritten forms. The significance of HCR stems from its pivotal role in digitizing handwritten documents, automating data entry, and enabling machine understanding of human-written information. The ubiquity of handwritten data across different domains necessitates reliable and efficient methods for recognizing and processing this diverse range of script styles, languages, and variations in writing.

Traditionally, the process of recognizing handwritten characters involved a combination of feature extraction techniques and classification algorithms. These methods relied on extracting handcrafted features from the input images and then utilizing classifiers to distinguish and interpret the characters. However, with the advent of deep learning, particularly convolutional neural networks (CNNs) and recurrent neural networks (RNNs), there has been a paradigm shift in the approach to handwritten character recognition. Deep learning models have demonstrated remarkable capabilities in learning complex patterns and representations directly from raw pixel data, significantly reducing the need for explicit feature engineering.

Despite the advancements in technology, challenges persist in HCR, including variations in writing styles, noise, distortions, irregularities in handwriting, and the need for robustness across different languages and scripts. Additionally, the scarcity of labeled datasets for specific handwriting styles poses a hurdle in training accurate recognition models.

The applications of handwritten character recognition are diverse and impactful. They span various sectors such as document digitization, archival preservation, postal services, financial institutions (for processing checks and forms), as well as accessibility aids for individuals with disabilities.

This introduction outlines the importance, challenges, and evolution of handwritten character recognition, emphasizing the need for innovative methodologies to overcome existing limitations. The fusion of traditional techniques with cutting-edge deep learning approaches holds promise for further advancements in this field, enabling more accurate, efficient, and adaptable handwritten character recognition systems.

II. LITERATURE REVIEW

Anuj Dutt demonstrated in his paper that deep learning systems can reach very high accuracy. Using a convolutional neural network with Keras and Theano as the backend, he obtained an accuracy of 98.72%. An implementation of the CNN using TensorFlow gave an even better result of 99.70%. Although the procedure and code are more complex than those of typical machine learning algorithms, the gain in accuracy is clear.

In a paper published by Saeed AL-Mansoori, a Multilayer Perceptron (MLP) neural network was implemented to recognize and predict handwritten digits from 0 to 9. The proposed network was trained and tested on a dataset obtained from MNIST.

These days, a growing number of people use images to transmit information, and it is also common to extract important information from images. Image recognition is an important research area because of its widely used applications. In the field of pattern recognition, one of the difficult tasks is the accurate automated recognition of human handwriting. This is a very difficult problem because handwriting varies widely from one individual to another. In spite of the fact that, this

III. METHODOLOGY

We used the MNIST dataset for training and testing our machine learning model. This dataset consists of 60,000 training images and 10,000 test images of handwritten digits from 0 to 9. We used a convolutional neural network (CNN) architecture to train our model on this dataset. Once the model was trained and tested, we saved it as a serialized object using the joblib library. We then developed a Flask-based web application that allows users to draw a digit using their mouse or touchscreen and submit the image to the model for recognition.
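The paper does not list the layers of its CNN, so the following is only a framework-free sketch of what a single convolution, ReLU, and max-pooling stage computes. The function names, the 2x2 kernel, and the tiny 6x6 input are illustrative assumptions, not the authors' model; a real MNIST network would stack several such stages on 28x28 inputs using a deep learning library.

```python
# Illustrative sketch: one convolution -> ReLU -> max-pool stage in pure Python.
# The kernel and the 6x6 input are hypothetical examples, not taken from the paper.

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the response map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

def relu(feature_map):
    """Clamp negative responses to zero."""
    return [[max(0.0, v) for v in row] for row in feature_map]

def max_pool_2x2(feature_map):
    """Keep the largest value in each non-overlapping 2x2 window."""
    h = len(feature_map) // 2 * 2
    w = len(feature_map[0]) // 2 * 2
    return [
        [
            max(feature_map[r][c], feature_map[r][c + 1],
                feature_map[r + 1][c], feature_map[r + 1][c + 1])
            for c in range(0, w, 2)
        ]
        for r in range(0, h, 2)
    ]

# A 6x6 image with a vertical stroke in column 2 (1.0 = ink, 0.0 = background).
image = [[1.0 if c == 2 else 0.0 for c in range(6)] for _ in range(6)]

# Responds where ink lies immediately to the right of background.
vertical_edge = [[-1.0, 1.0],
                 [-1.0, 1.0]]

features = max_pool_2x2(relu(conv2d_valid(image, vertical_edge)))
```

In a trained CNN the kernel values are learned from the training images rather than chosen by hand, which is what removes the need for manual feature design.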

IV. SYSTEM DESIGN

Designing a handwritten character recognition (HCR) system involves several steps and considerations, from data acquisition and preprocessing to feature extraction and recognition algorithms. The design process breaks down as follows:

A. Data Acquisition and Preprocessing

Gather a diverse dataset of handwritten character samples, ensuring adequate representation of different writing styles, fonts, and languages.
Preprocess the images to enhance their quality and ensure consistency. This may include noise reduction, normalization, and segmentation to isolate individual characters.
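As a minimal sketch of the normalization step, the code below binarizes a grayscale image, crops it to the inked region, and rescales it to a fixed size with nearest-neighbor sampling. It assumes dark ink on light paper with gray values from 0 to 255; the threshold of 128, the function names, and the sample image are hypothetical choices for illustration.

```python
# Illustrative preprocessing sketch; threshold, names, and data are assumptions.

def binarize(gray, threshold=128):
    """Map gray values (0 = dark ink, 255 = white paper) to 1 for ink, 0 for background."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]

def crop_to_ink(binary):
    """Trim empty rows and columns around the character."""
    rows = [r for r, row in enumerate(binary) if any(row)]
    cols = [c for c in range(len(binary[0])) if any(row[c] for row in binary)]
    if not rows:
        return binary  # blank image: nothing to crop
    return [row[cols[0]:cols[-1] + 1] for row in binary[rows[0]:rows[-1] + 1]]

def resize_nearest(binary, size):
    """Rescale to size x size by nearest-neighbor sampling."""
    h, w = len(binary), len(binary[0])
    return [[binary[r * h // size][c * w // size] for c in range(size)]
            for r in range(size)]

gray = [
    [255, 255, 255, 255, 255],
    [255, 255,   0, 255, 255],
    [255, 255,  30,  40, 255],
    [255, 255, 255, 255, 255],
]

normalized = resize_nearest(crop_to_ink(binarize(gray)), size=4)
```

For MNIST-style input the target size would be 28 and the polarity is reversed (light strokes on a dark background), so the threshold comparison would need to be flipped.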

B. Feature Extraction

Extract features from the preprocessed images that capture the essential characteristics of the handwritten characters. Common feature extraction techniques include:

Structural Features: Analyze the shape, strokes, and connections of the characters.
Statistical Features: Utilize statistical properties like pixel density, distribution of gray levels, and moments.
Frequency-Domain Features: Employ techniques like Fourier transform or Zernike moments to capture patterns in the frequency domain.
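The statistical features mentioned above can be sketched as follows: overall pixel density, per-zone densities on a coarse grid, and the centroid derived from first-order image moments. The function names, the 2x2 zoning grid, and the sample glyph are illustrative assumptions.

```python
# Illustrative statistical features for a binary glyph (1 = ink, 0 = background).

def pixel_density(binary):
    """Fraction of pixels that contain ink."""
    total = sum(len(row) for row in binary)
    return sum(map(sum, binary)) / total

def zoning_features(binary, zones=2):
    """Ink density in each cell of a zones x zones grid, in row-major order."""
    h, w = len(binary), len(binary[0])
    zh, zw = h // zones, w // zones
    feats = []
    for zr in range(zones):
        for zc in range(zones):
            ink = sum(binary[r][c]
                      for r in range(zr * zh, (zr + 1) * zh)
                      for c in range(zc * zw, (zc + 1) * zw))
            feats.append(ink / (zh * zw))
    return feats

def centroid(binary):
    """Center of mass (row, column) from the moments m10/m00 and m01/m00."""
    m00 = sum(map(sum, binary))
    if m00 == 0:
        return (0.0, 0.0)
    m10 = sum(r * v for r, row in enumerate(binary) for v in row)
    m01 = sum(c * v for row in binary for c, v in enumerate(row))
    return (m10 / m00, m01 / m00)

glyph = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 1, 1],
    [1, 1, 1, 1],
]

feature_vector = [pixel_density(glyph)] + zoning_features(glyph) + list(centroid(glyph))
```

Concatenating the values into one vector, as in the last line, gives the fixed-length input that a classifier expects.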

C. Feature Selection

Select a subset of features that are most relevant and discriminative for the task of character recognition. This can involve dimensionality reduction techniques or feature evaluation methods.
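One simple feature evaluation method, shown here only as an example and not as the authors' choice, is a variance filter: features that barely change across samples carry little information for separating classes and can be dropped. The threshold and the sample data are hypothetical.

```python
# Illustrative variance-based feature filter; threshold and data are assumptions.
from statistics import pvariance

def select_by_variance(samples, threshold):
    """Return indices of feature columns whose population variance exceeds `threshold`."""
    columns = list(zip(*samples))
    return [i for i, col in enumerate(columns) if pvariance(col) > threshold]

def keep_columns(samples, indices):
    """Project every sample onto the selected feature indices."""
    return [[row[i] for i in indices] for row in samples]

samples = [
    [1.0, 0.2, 0.5],
    [1.0, 0.8, 0.5],
    [1.0, 0.1, 0.6],
    [1.0, 0.9, 0.5],
]

selected = select_by_variance(samples, threshold=0.01)
reduced = keep_columns(samples, selected)
```

A variance filter ignores the class labels, so in practice it is usually a first pass before label-aware methods or a dimensionality reduction technique.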

D. Recognition Algorithms

Employ machine learning algorithms to classify the extracted features into the corresponding characters. Popular choices include:

Nearest Neighbors (NN): Classify based on the similarity of new features to known characters in the training set.
Support Vector Machines (SVM): Find a hyperplane that best separates the feature vectors of different characters.
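The nearest-neighbor approach can be sketched in a few lines: compute the distance from the query to every training vector and take a majority vote among the k closest. The two-dimensional feature vectors, the labels, and k = 3 below are hypothetical values for illustration.

```python
# Illustrative k-nearest-neighbors classifier; data and k are assumptions.
import math
from collections import Counter

def knn_predict(train_features, train_labels, query, k=3):
    """Label `query` by majority vote among its k nearest training vectors."""
    distances = sorted(
        (math.dist(x, query), label)
        for x, label in zip(train_features, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

train_features = [
    [0.90, 0.10], [0.80, 0.20], [0.85, 0.15],  # samples labeled "1"
    [0.20, 0.90], [0.10, 0.80], [0.15, 0.85],  # samples labeled "7"
]
train_labels = ["1", "1", "1", "7", "7", "7"]

prediction = knn_predict(train_features, train_labels, [0.75, 0.25])
```

Nearest neighbors needs no training phase, but every prediction scans the whole training set, which becomes slow for datasets the size of MNIST.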

V. WORKING

A working handwritten character recognition (HCR) system combines data preparation, feature extraction, and machine learning algorithms. The following steps implement a basic HCR system:

A. Data Preparation

Gather a dataset of handwritten character images. You can find publicly available datasets online or create your own by collecting handwritten samples from different individuals.
Preprocess the images to normalize their size, remove noise, and segment individual characters.
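Segmentation of individual characters is often approached with a vertical projection profile: count the ink in every column and cut wherever a column is empty. The sketch below assumes a binary image of one text line whose characters do not touch; the function names and the sample line are hypothetical.

```python
# Illustrative projection-profile segmentation; assumes non-touching characters.

def segment_columns(binary):
    """Return (start, end) column ranges, end exclusive, separated by ink-free columns."""
    width = len(binary[0])
    profile = [sum(row[c] for row in binary) for c in range(width)]
    segments, start = [], None
    for c, ink in enumerate(profile):
        if ink and start is None:
            start = c                      # first inked column of a character
        elif not ink and start is not None:
            segments.append((start, c))    # empty column closes the character
            start = None
    if start is not None:
        segments.append((start, width))    # character runs to the right edge
    return segments

def extract_characters(binary):
    """Cut the line image into one sub-image per detected character."""
    return [[row[s:e] for row in binary] for s, e in segment_columns(binary)]

line = [
    [0, 1, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 1, 0],
    [0, 1, 1, 0, 0, 1, 1],
]

characters = extract_characters(line)
```

Cursive or overlapping handwriting breaks the empty-column assumption and needs more elaborate segmentation than this.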

B. Feature Extraction

Extract features from the preprocessed images that represent the characteristics of the characters. Common feature extraction techniques include:

Structural Features: Analyze the shape, strokes, and connections of the characters.
Statistical Features: Utilize statistical properties like pixel density, distribution of gray levels, and moments.
Frequency-Domain Features: Employ techniques like Fourier transform or Zernike moments to capture patterns in the frequency domain.
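As one possible reading of the Fourier-based features, the sketch below takes the row projection of a glyph (ink count per row) and keeps the magnitudes of its first few discrete Fourier transform coefficients. Applying the transform to a projection profile, along with the function names and the sample glyph, is an assumption made for illustration; other schemes transform the contour or the full image instead.

```python
# Illustrative frequency-domain features from a row projection profile.
import cmath

def row_profile(binary):
    """Ink count in each row of the glyph."""
    return [sum(row) for row in binary]

def dft_magnitudes(signal, n_coeffs):
    """Magnitudes of the first `n_coeffs` DFT coefficients, normalized by signal length."""
    n = len(signal)
    magnitudes = []
    for k in range(n_coeffs):
        coeff = sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t, x in enumerate(signal))
        magnitudes.append(abs(coeff) / n)
    return magnitudes

# Two horizontal strokes separated by blank rows: a profile that alternates 2, 0, 2, 0.
glyph = [
    [1, 1, 0],
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
]

spectrum = dft_magnitudes(row_profile(glyph), n_coeffs=3)
```

The alternating profile puts its energy in coefficient 0 (the mean) and coefficient 2 (the fastest alternation), while coefficient 1 is close to zero.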

C. Feature Selection

Select a subset of features that are most relevant and discriminative for the task of character recognition. This can involve dimensionality reduction techniques or feature evaluation methods.

D. Machine Learning Algorithm

Choose a machine learning algorithm to classify the extracted features into the corresponding characters. Popular choices include:

Nearest Neighbors (NN): Classify based on the similarity of new features to known characters in the training set.
Support Vector Machines (SVM): Find a hyperplane that best separates the feature vectors of different characters.
Neural Networks: Utilize artificial neural networks to learn complex patterns in the features and classify characters effectively.
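The simplest neural unit, a single perceptron, illustrates how a network adjusts its weights from labeled examples. This is a sketch under stated assumptions (two hypothetical features, two classes, a learning rate of 0.1) and is far smaller than the CNN described in the methodology.

```python
# Illustrative perceptron for two classes; data and hyperparameters are assumptions.

def predict(weights, bias, x):
    """Return 1 if the weighted sum is positive, otherwise 0."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0

def train_perceptron(samples, labels, epochs=20, lr=0.1):
    """Perceptron learning rule: move the weights toward misclassified samples."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            error = y - predict(weights, bias, x)   # 0 when the sample is correct
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias

samples = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.9], [0.1, 0.8]]
labels = [1, 1, 0, 0]

weights, bias = train_perceptron(samples, labels)
```

A single perceptron can only separate classes with a straight line; multilayer and convolutional networks stack many such units with nonlinear activations to learn more complex boundaries.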

E. Training and Evaluation

Split the dataset into training and testing sets.
Train the selected machine learning algorithm using the labeled training data.
Evaluate the performance of the trained model on the test dataset to assess its generalization ability.
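The split and evaluation steps can be sketched as below. The 25% test fraction, the fixed seed, the function names, and the toy labels are hypothetical; MNIST already ships with a fixed split of 60,000 training and 10,000 test images, so a manual split is only needed for custom datasets.

```python
# Illustrative train/test split and accuracy metric; fraction and seed are assumptions.
import random

def train_test_split(samples, labels, test_fraction=0.25, seed=0):
    """Shuffle indices reproducibly and hold out a fraction of the data for testing."""
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)
    n_test = max(1, int(len(indices) * test_fraction))

    def pick(idx):
        return [samples[i] for i in idx], [labels[i] for i in idx]

    return pick(indices[n_test:]), pick(indices[:n_test])

def accuracy(predicted, actual):
    """Fraction of predictions that match the true labels."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)

samples = [[i / 10, 1 - i / 10] for i in range(8)]
labels = ["7"] * 4 + ["1"] * 4

(train_x, train_y), (test_x, test_y) = train_test_split(samples, labels)

# Accuracy of a hypothetical model that got three of four test characters right.
score = accuracy(["1", "7", "1", "7"], ["1", "7", "7", "7"])
```

Keeping the test set untouched during training is what makes the reported accuracy an estimate of generalization rather than of memorization.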

VI. RESULT

Our machine learning model achieved an accuracy of 99.1% on the MNIST test set. When integrated with the Flask application, the model is able to accurately recognize handwritten digits drawn by users in real time.

VII. CONCLUSION

In this work we have seen that the model is capable of correct detection on machine-generated images, and that training it with handwritten images, which we have not done for the purpose of our main work, would yield an accuracy well above 70%. The data gathered from MNIST for training our model contains all the varieties of images that we may encounter in real-life scenarios. Our model attained an accuracy of 92%, which can increase with greater system capabilities. Another viewpoint is that this enhanced machine-generated dataset owes its effectiveness to the capabilities of convolution layers. Neural networks are moving more and more toward end-to-end approaches, and one of the first steps in this direction concerns the feature extraction stage, which normally restricts the scope of the model and was a cumbersome task, and which is now bypassed using convolutional neural networks.


Copyright

Copyright © 2023 Ghanshyam Wadaskar, Vipin Bopanwar, Prayojita Urade, Shravani Upganlawar, Prof. Rakhi Shende. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


Paper Id : IJRASET57366

Publish Date : 2023-12-05

ISSN : 2321-9653

Publisher Name : IJRASET
