Ijraset Journal For Research in Applied Science and Engineering Technology
Authors: Asst. Prof. Moumita Dey, Trisha Das, Moushikta Shit, Debjit Rana, Papiya Biswas
DOI Link: https://doi.org/10.22214/ijraset.2024.61902
This paper proposes a novel algorithm for recognizing handwritten digits. The digits are classified into two groups: digits containing blobs, with or without stems, and digits containing stems only. Blobs are identified by a morphological region-filling method, which removes the need to determine the size of the blobs and their structuring elements. Digits with blobs and stems are identified using the concept of connected components. This completely eliminates the complex process of recognizing horizontal or vertical lines and the property called ‘concavities’. Digits with only stems are recognized by extending the stems into blobs using the connected-component approach of morphology. The method has been applied to and tested on various handwritten digits from the modified NIST (National Institute of Standards and Technology) handwritten digit database (MNIST), and the success rate is reported. The method is also compared with various existing methods. Once the digits are recognized, they are assembled back into their respective positions within the equation. Mathematical equation solving techniques are then applied to evaluate the expression and obtain the result. Experimental results demonstrate the effectiveness of the proposed approach in accurately recognizing handwritten digits within mathematical equations. The method achieves competitive performance compared to state-of-the-art techniques, even in cases with complex equations and varied writing styles. This approach has potential applications in various domains such as education, document processing, and automated grading systems. By accurately interpreting handwritten digits within mathematical expressions, it can facilitate automated analysis of mathematical documents, assist students in learning mathematics, and streamline administrative tasks in educational institutions.
I. INTRODUCTION
Artificial intelligence (AI), in simple terms, means making a computer do work that traditionally requires the human brain. Unlike a human, an AI system can take in large amounts of data and use that data to recognize patterns, make decisions, and give judgments. Machine learning (ML) is a subset of AI that is used to make computers learn to behave as humans do. Learning happens in two main ways. In supervised learning, the computer is given a set of input data together with the required output for it, and it learns how each particular input leads to its particular output. In unsupervised learning, input data is provided with no output, so the model has to learn to analyze the dataset and cluster it into categories.
This report presents a novel approach to address the issue of recognizing handwritten digits within mathematical equations. The proposed method integrates deep learning techniques with mathematical equation solving strategies to achieve accurate digit recognition and equation solving simultaneously. The motivation behind this work stems from the practical significance of automating the analysis of mathematical documents. In various domains such as education, research, and administrative tasks, the ability to interpret handwritten mathematical expressions efficiently can streamline processes and improve productivity.
The following are the main contributions of the proposed technique:
This report is structured as follows: Section 2 provides an overview of related work in the field of handwritten digit recognition and mathematical equation solving. Section 3 describes the methodology and the proposed approach in detail. Section 4 presents experimental results and performance evaluation. Finally, Section 5 concludes the report with a summary of findings and potential avenues for future research. [1]
II. PROJECT OBJECTIVES
The central aim of this project is to develop an assistance system that analyses handwritten mathematical equations based on handwriting recognition algorithms. The system will be able to recognize images of handwritten equations and output the corresponding characters in LaTeX. The specific objectives include:
These objectives collectively aim to bridge the gap between traditional handwritten mathematical expressions and digital formats, making mathematical content more accessible, editable, and usable across a wide array of applications and platforms. [2]
III. LITERATURE REVIEW
A brief description of the contributions of this thesis is given below:
In the realm of optical character recognition (OCR) systems, Shah and Gokani (2014) introduced an effective approach tailored for digit recognition. Their work, titled "A Simple and Effective Optical Character Recognition System for Digits Recognition Using the Pixelcontour Features and Mathematical Parameters," uses pixel-contour features and mathematical parameters to build an efficient OCR system specifically for recognizing digits.
Additionally, Simard, Steinkraus, and Platt (2003) provided valuable insights in their paper titled "Best Practices for Convolutional Neural Networks Applied to Visual Document Analysis". This work outlines best practices and methodologies for applying convolutional neural networks (CNNs) to visual document analysis, offering guidelines for improving CNN performance in this context. [6]
These cited works collectively underscore the significance of neural networks, particularly CNNs, in document image preprocessing, optical character recognition, and visual document analysis, contributing to the advancement of efficient methodologies and state-of-the-art practices in these domains.
A. Existing System
Various algorithms used for implementing handwritten digit recognition systems consist of Proximal Support Vector Machine (PSVM), Multilayer Perceptron, Support Vector Machine (SVM), Random Forest, Bayes Net, Naive Bayes, J48, Random Tree. [7]
Although these algorithms are useful in some applications of this technology, many others, such as banking applications, require better results than the algorithms mentioned above can deliver.
B. Proposed System
To reduce error and improve overall efficiency, a Convolutional Neural Network (CNN) can be used to implement handwritten digit recognition systems. Our proposed system uses a CNN with multiple convolutional and pooling layers and a kernel of size 3×3. The model is trained on 60,000 grayscale images of 28×28 pixels. Training runs for 5 epochs and reaches an accuracy of about 99.16%, which is much higher than that of traditional algorithms such as SVM, Multilayer Perceptron, Bayes Net, and Random Forest used to implement handwritten digit recognition systems. [8]
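To make the two layer types named above concrete, the following is a minimal pure-Python sketch of one 3×3 convolution followed by ReLU and 2×2 max pooling on a 28×28 input. It is an illustration only, not the trained model: the input image, the hand-picked edge kernel, and the helper names `conv2d_valid`, `relu`, and `max_pool2x2` are assumptions introduced here, and a real CNN learns its kernels from the training data.

```python
def conv2d_valid(image, kernel):
    """Slide a square kernel over a 2-D image with stride 1 and no padding."""
    k = len(kernel)
    out_h = len(image) - k + 1
    out_w = len(image[0]) - k + 1
    return [
        [
            sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(k)
                for j in range(k)
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(feature_map):
    """Replace negative responses with zero."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def max_pool2x2(feature_map):
    """Keep the largest value in each non-overlapping 2x2 window."""
    out_h = len(feature_map) // 2
    out_w = len(feature_map[0]) // 2
    return [
        [
            max(
                feature_map[2 * r][2 * c],
                feature_map[2 * r][2 * c + 1],
                feature_map[2 * r + 1][2 * c],
                feature_map[2 * r + 1][2 * c + 1],
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# Hypothetical 28x28 input: a vertical bar standing in for a digit "1".
image = [[1.0 if 12 <= c <= 15 else 0.0 for c in range(28)] for _ in range(28)]

# Illustrative hand-picked 3x3 kernel that responds where brightness
# increases from left to right (the left edge of the bar).
kernel = [[-1.0, 0.0, 1.0],
          [-1.0, 0.0, 1.0],
          [-1.0, 0.0, 1.0]]

features = max_pool2x2(relu(conv2d_valid(image, kernel)))
```

A 3×3 kernel without padding shrinks the 28×28 image to 26×26, and the 2×2 pooling halves that to 13×13, which is why stacking several such layers quickly reduces the spatial size while keeping the strongest responses.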
IV. METHODOLOGY
Importing the libraries: Libraries are useful tools that can make a developer’s job more efficient. A library is a set of prewritten code that we can call from our own program. In other words, it is work already done by someone else that you can reuse without having to do it yourself. Different libraries have different restrictions on fair use, but their code is designed to be used by others rather than to stand alone. The libraries used in this code are -
The project consists of seven chapters, and the organization of the project is as follows: There are ten digits in the English language, and each digit is differentiated from the others by some characteristic feature(s). Recognition of the ten numerals appears simple at first. However, the problems that arise from similarities between different numerals and discrepancies between instances of the same numeral must be tackled by analysing the similar and dissimilar features and then making decisions accordingly.
This paper divides the ten digits into two groups. Group 1 consists of digits with blobs, with or without stems: 0, 4, 6, 8, and 9. Group 2 consists of digits with only stems: 1, 2, 3, 4, 5, and 7. Group 1 is further divided into two subgroups: the digit with two blobs, 8, and the digits with a single blob, with or without stems, 0, 4, 6, and 9.
The blobs are identified by a region-filling method, which differs from previous methods. It eliminates additional problems caused by non-class-specific differences such as:
a. Size
b. Shear
c. Line thickness
d. Background and Digit colors
e. Resolution, etc.
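The region-filling idea can be sketched with a plain flood fill: fill the outer background from the image border, and every background region that remains unfilled is a blob enclosed by ink. This is a minimal sketch under stated assumptions, not the paper's exact morphological procedure; the function name `count_blobs`, the `'#'`/`'.'` text encoding, and the tiny sample glyphs are hypothetical.

```python
from collections import deque


def count_blobs(grid):
    """Count enclosed background regions (blobs) in a binary character image.

    `grid` is a list of equal-length strings; '#' is ink, '.' is background.
    """
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]

    def fill(start_r, start_c):
        """Mark every background pixel 4-connected to the start pixel."""
        queue = deque([(start_r, start_c)])
        seen[start_r][start_c] = True
        while queue:
            r, c = queue.popleft()
            for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                if (0 <= nr < rows and 0 <= nc < cols
                        and grid[nr][nc] == "." and not seen[nr][nc]):
                    seen[nr][nc] = True
                    queue.append((nr, nc))

    # Step 1: fill the outer background, starting from every border pixel.
    for r in range(rows):
        for c in range(cols):
            on_border = r in (0, rows - 1) or c in (0, cols - 1)
            if on_border and grid[r][c] == "." and not seen[r][c]:
                fill(r, c)

    # Step 2: each background region still unfilled is enclosed by ink.
    blobs = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == "." and not seen[r][c]:
                blobs += 1
                fill(r, c)
    return blobs


# Hypothetical miniature glyphs used only for illustration.
ZERO = [".....", ".###.", ".#.#.", ".#.#.", ".###.", "....."]
EIGHT = [".....", ".###.", ".#.#.", ".###.", ".#.#.", ".###.", "....."]
ONE = [".....", "..#..", "..#..", "..#..", "....."]

blob_counts = [count_blobs(ZERO), count_blobs(EIGHT), count_blobs(ONE)]
```

Because the fill follows whatever shape the ink encloses, the count does not depend on the blob's size, the line thickness, or a chosen structuring element, which matches the motivation given in the list above.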
B. Digit Recognition with Convolutional Neural Networks (CNNs)
C. Equation Reconstruction
D. Integration and Post-processing
E. Model Evaluation
F. Implementation Considerations
G. Optional Equation Simplification
H. Simplification
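The equation reconstruction and evaluation steps named in the headings above can be sketched as follows: recognized symbols are ordered by horizontal position, joined into a string, and evaluated by walking the parsed syntax tree. This is a hedged illustration rather than the authors' implementation; `assemble`, `evaluate`, and the `(x_position, label)` input format are assumptions, and only `+`, `-`, `*`, `/`, and parentheses are handled.

```python
import ast
import operator

# Arithmetic operators the sketch is willing to evaluate.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def assemble(symbols):
    """Order recognized symbols by x-position and join them into a string.

    `symbols` is a list of (x_position, label) pairs.
    """
    return "".join(label for _, label in sorted(symbols))


def evaluate(expression):
    """Evaluate +, -, *, / and parentheses without calling eval()."""

    def visit(node):
        if isinstance(node, ast.Expression):
            return visit(node.body)
        if (isinstance(node, ast.Constant)
                and isinstance(node.value, (int, float))
                and not isinstance(node.value, bool)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](visit(node.left), visit(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -visit(node.operand)
        raise ValueError("unsupported expression")

    return visit(ast.parse(expression, mode="eval"))


# Hypothetical recognizer output, deliberately out of order.
recognized = [(40, "+"), (10, "1"), (25, "2"), (60, "7")]
equation = assemble(recognized)
result = evaluate(equation)
```

Walking the syntax tree instead of calling `eval` means a misrecognized symbol can only raise an error; it cannot run arbitrary code.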
V. FLOWCHART
VII. IMPLEMENTATION
In this paper, a neural network is implemented in which the model recognizes and predicts a handwritten digit. TensorFlow and Keras form the backbone of the implementation. We load the datasets from these open-source libraries and have our model analyze thousands of images. The model learns the patterns and pixel placements of the grayscale images along with all the neural connections. Keras is an open-source API (Application Programming Interface) designed for machine learning and deep learning; it ships with several built-in datasets and serves as the high-level interface to the TensorFlow library. [3]
A. Implementation Details
A mainloop is created for the master window which is run infinitely until the user shuts the window down. A title related to the proposed project is then provided to the main GUI window. The main window consists of two buttons namely ‘Recognize Number’ and ‘Clear Canvas’.
After that the functions to implement functionalities such as clearing the canvas, drawing the digits, activation event for doing so, and digit recognition are defined.
To recognize the handwritten digit strings that the user draws on the canvas within the GUI, a list of contours is created. A contour is a line that connects all the points along the boundary of an image region that share a similar intensity, which makes it very useful for detecting each digit and analyzing its shape.
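The goal of this step, finding where each drawn digit sits so that it can be cropped and classified in reading order, can be illustrated without an image library. The sketch below swaps contour tracing for plain 8-connected component labelling and returns one bounding box per ink region, sorted left to right. The function name `digit_boxes` and the text-encoded canvas are assumptions made for illustration.

```python
from collections import deque


def digit_boxes(grid):
    """Return one bounding box per 8-connected ink region, left to right.

    `grid` is a list of equal-length strings; '#' is ink.
    Each box is (left, top, right, bottom) with inclusive coordinates.
    """
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r0 in range(rows):
        for c0 in range(cols):
            if grid[r0][c0] != "#" or seen[r0][c0]:
                continue
            # Breadth-first search over one ink region, growing its box.
            left, top, right, bottom = c0, r0, c0, r0
            queue = deque([(r0, c0)])
            seen[r0][c0] = True
            while queue:
                r, c = queue.popleft()
                left, right = min(left, c), max(right, c)
                top, bottom = min(top, r), max(bottom, r)
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        nr, nc = r + dr, c + dc
                        if (0 <= nr < rows and 0 <= nc < cols
                                and grid[nr][nc] == "#" and not seen[nr][nc]):
                            seen[nr][nc] = True
                            queue.append((nr, nc))
            boxes.append((left, top, right, bottom))
    # Sorting by the left edge gives the digits in reading order.
    return sorted(boxes)


# Hypothetical canvas: a "1" on the left and a "0" drawn slightly higher.
canvas = [
    ".....###..",
    ".#...#.#..",
    ".#...###..",
    ".#........",
]
boxes = digit_boxes(canvas)
```

The row-by-row scan meets the right-hand digit first because it starts higher on the canvas, so the final sort is what restores left-to-right order before the digits are passed to the classifier.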
The ‘Recognize Number’ button is then pressed which initiates the model to predict each and every digit one by one. It displays the result in a new window where each digit is recognized separately and the accuracy with which they are recognized is also displayed. This new window is given the required title and consists of three other options alongside the recognized number. These three options ultimately provide the functionality of converting the recognized decimal number to binary/hexadecimal/octal number system according to the user’s choice. [11]
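The number-system conversion offered in the result window needs only Python's built-in formatting. The following is a small sketch, with `convert_number` as a hypothetical helper name and the recognized digits assumed to arrive as a string.

```python
def convert_number(decimal_digits, base):
    """Convert a string of recognized decimal digits to base 2, 8 or 16."""
    format_codes = {2: "b", 8: "o", 16: "X"}
    if base not in format_codes:
        raise ValueError("base must be 2, 8 or 16")
    return format(int(decimal_digits), format_codes[base])


# Hypothetical recognizer output for the digits drawn on the canvas.
recognized_number = "255"
conversions = {base: convert_number(recognized_number, base) for base in (2, 8, 16)}
```

Using `format` rather than `bin`, `oct`, or `hex` avoids the `0b`, `0o`, and `0x` prefixes, so the result can be shown to the user directly.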
VIII. RESULTS AND DISCUSSION
In this section, we have provided a detailed description of the employed dataset along with the metrics which are used to assess the performance of the proposed model. Moreover, we have performed a series of experiments to check the numeral detection and classification performance of the presented approach.
We have implemented the proposed method in Python and executed it on an Nvidia GTX1070 GPU-based system. Table 2 displays the details of the training parameters of the proposed work. We have reported training and loss curves to show the optimized learning behavior of the proposed approach. [12]
IX. FUTURE WORK
Although a new method can be proposed to cut, or segment, digit strings, it still has limitations where improvements have to be made. Thus, there is room for future work such as the following. Different classification models can be used together to improve segmentation performance. To reduce the complexity of the algorithm and make it run faster, the number of hypotheses should be reduced. To reduce computation time, better filters should be used to eliminate unnecessary segmentation hypotheses.
Firstly, to make training more compelling and robust, we could apply additional preprocessing techniques such as jittering. We could also divide each pixel by its corresponding standard deviation to normalize the data. Next, given time and budget constraints, we were limited to 20 training examples for each given word in order to evaluate and revise our model efficiently. Another way to improve our character segmentation model would be to move beyond a greedy search for the most likely solution. We would approach this by considering a more exhaustive but still efficient decoding algorithm such as beam search. We can use a character-based or word-based language model to add a penalty or benefit score to each of the final beam search candidate paths, alongside their combined individual softmax probabilities, representing the probability of the sequence of characters or words. If the language model indicates that the candidate word ranked most likely by the softmax layer and beam search is very unlikely given the context so far, compared with some other likely candidate words, then our model can correct itself accordingly. Now, let’s look ahead. There’s a lot we can do to make this project even better. We want the system to understand different kinds of handwriting and trickier math problems. Making it easier for everyone to use, for example on phones and in different languages, is also important. We’re working on ways for the system to learn from its mistakes and get even better at recognizing and solving math problems. These improvements will help make the system more helpful and user-friendly for everyone.
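The beam search with a language-model score described above can be sketched as follows. This is a toy illustration under stated assumptions, not a tuned decoder: the per-position softmax outputs, the one-word lexicon, the `toy_lm` scoring function, and the `lm_weight` parameter are all hypothetical.

```python
import math


def beam_search(step_probs, lm_log_prob, beam_width=3, lm_weight=1.0):
    """Decode a character sequence from per-position softmax outputs.

    step_probs:  list of dicts mapping character -> softmax probability.
    lm_log_prob: callable(prefix, char) -> log-probability of `char`
                 following `prefix` under a language model.
    """
    beams = [("", 0.0)]  # (text so far, accumulated log score)
    for probs in step_probs:
        candidates = []
        for text, score in beams:
            for char, p in probs.items():
                if p <= 0.0:
                    continue  # log(0) is undefined; skip impossible symbols
                new_score = (score + math.log(p)
                             + lm_weight * lm_log_prob(text, char))
                candidates.append((text + char, new_score))
        # Keep only the `beam_width` best partial sequences.
        candidates.sort(key=lambda item: item[1], reverse=True)
        beams = candidates[:beam_width]
    return beams[0][0]


# Hypothetical one-word lexicon standing in for a real language model.
LEXICON = {"cat"}


def toy_lm(prefix, char):
    """Reward continuations that are still a prefix of a known word."""
    candidate = prefix + char
    known = any(word.startswith(candidate) for word in LEXICON)
    return math.log(0.9 if known else 0.1)


# Hypothetical softmax outputs: the classifier slightly prefers "o" to "a".
step_probs = [
    {"c": 0.9, "e": 0.1},
    {"o": 0.55, "a": 0.45},
    {"t": 0.9, "l": 0.1},
]

with_lm = beam_search(step_probs, toy_lm)
without_lm = beam_search(step_probs, toy_lm, lm_weight=0.0)
```

With the language-model term switched off the decoder follows the classifier and returns "cot", while the weighted score lets the lexicon overrule the marginally more probable "o" and recover "cat".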
Accurate recognition of numerals from images plays a significant role in the domain of information processing. However, large differences in writing patterns and the presence of sample distortions such as noise, blurring, and intensity changes complicate effective handwritten digit recognition (HDR). In this work, a reliable DL-based HDR system, namely EfficientDet-D4, is presented to resolve the existing issues of this domain. More precisely, input images are first annotated to locate the positions of the digits, and the annotations are then used to train the EfficientDet model to detect and categorize the digits. We have evaluated the presented approach on the MNIST dataset and attained an average accuracy of 99.83%. We have confirmed through extensive experimentation that the presented work can efficiently recognize the numerals in the test samples and categorize them into 10 categories representing the numbers 0 to 9. Moreover, the approach is capable of accurately identifying and classifying the digits even in the presence of various postprocessing attacks, such as light and color variations, blurring, noise, and angle and size changes. Furthermore, a cross-dataset evaluation on the USPS dataset was carried out to show the efficacy of the proposed method on unseen cases. The evaluation results indicate that the introduced approach is robust in comparison with current techniques and can play a vital role in the area of information processing. Based on the computed results, this approach can play an important role in automated number plate recognition of vehicles for surveillance applications. Furthermore, this work has applications in optical character recognition to facilitate various daily-life tasks, such as reading product prices and receipts. In future work, we plan to extend the proposed approach to other languages.
With the results of this work, we are more confident in finding other ways to improve it and to extend it to more complex data, such as converting handwritten paragraphs into text. Through this research work we came to understand the mechanisms used to identify handwritten data. We understand the importance of handwriting recognition, since it is easy for the user to write data on paper and use handwritten data recognition to convert it into text instead of typing it on a keyboard. It is further recommended to implement the system on edge computing platforms such as the Raspberry Pi 4 for actual usage. [11]
[1] M. F. Bin Othman and T. M. S. Yau. Comparison of different classification techniques using WEKA for breast cancer. In 3rd Kuala Lumpur International Conference on Biomedical Engineering 2006, pages 520–523. Springer Berlin Heidelberg, 2006.
[2] R. R. Bouckaert. Properties of Bayesian belief network learning algorithms. In Proceedings of the Tenth International Conference on Uncertainty in Artificial Intelligence, pages 102–109. Morgan Kaufmann Publishers Inc., 1994.
[3] W. Buntine. Theory refinement on Bayesian networks. In Proceedings of the Seventh Conference on Uncertainty in Artificial Intelligence, pages 52–60. Morgan Kaufmann Publishers Inc., 1991.
[4] Fotini Simistira, Vassilis Katsouros, and George Carayannis. A template matching distance for recognition of on-line mathematical symbols. Proceedings of the 11th International Conference on Frontiers in Handwriting Recognition, 01 2018.
[5] Fotini Simistira, Vassilis Papavassiliou, Vassilis Katsouros, and George Carayannis. A system for recognition of on-line handwritten mathematical expressions. 09 2022.
[6] Ahmad Montaser Awal, Harold Mouchère, and Christian Viard-Gaudin. Towards handwritten mathematical expression recognition. 2009 10th International Conference on Document Analysis and Recognition, pages 1046–1050, 2009.
[7] Utpal Garain and Bhabatosh B. Chaudhuri. Recognition of online handwritten mathematical expressions. IEEE Transactions on Systems, Man, and Cybernetics. Part B, Cybernetics: A Publication of the IEEE Systems, Man, and Cybernetics Society, 34(6):2366–2376, 2004.
[8] A. Rehman and T. Saba. Neural networks for document image preprocessing: state of the art. Artificial Intelligence Review, 42(2):253–273, 2014.
[9] J. Shah and V. Gokani. A simple and effective optical character recognition system for digits recognition using the pixelcontour features and mathematical parameters. (IJCSIT) International Journal of Computer Science and Information Technologies, 5(5), 2014.
[10] P. Y. Simard, D. Steinkraus, and J. C. Platt. Best practices for convolutional neural networks applied to visual document analysis. Institute of Electrical and Electronics Engineers, Inc., August 2003.
[11] R. Kruse, C. Borgelt, F. Klawonn, C. Moewes, M. Steinbrecher, and P. Held. Multilayer perceptrons. In Computational Intelligence, pages 47–81. Springer London, 2013.
[12] H. H. Zhao and H. Liu. Multiple classifiers fusion and CNN feature extraction for handwritten digits recognition. Granular Computing, 5(3):411–418, 2020.
Copyright © 2024 Asst. Prof. Moumita Dey, Trisha Das, Moushikta Shit, Debjit Rana, Papiya Biswas. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Paper Id : IJRASET61902
Publish Date : 2024-05-10
ISSN : 2321-9653
Publisher Name : IJRASET