Air Writing Word Recognition and Translation

Authors: Ms. Neha Ganesh Karande, Ms. Ankita Chandrakant Chorage, Ms. Tejaswini Dilip Chavan, Ms. Vidya Pravin Bansode, Ms. Shifa Shikalgar

DOI Link: https://doi.org/10.22214/ijraset.2024.59099

Abstract

This project develops a system that recognizes words written in the air and then classifies the recognized word. Air-writing is a way of writing linguistic words in free space using hand or finger movements. The project combines computer-vision hand tracking with machine-learning handwriting recognition. The air-writing recognition system uses a PC's web camera to trace a word written in the air by the user and then uses a convolutional neural network to classify the word into one of the possible classes. Several existing systems rely on advanced, expensive tracking setups to achieve gesture recognition; we aim to build a system that does comparable work with a far cheaper setup.

I. INTRODUCTION

Air writing word recognition and translation using a Convolutional Neural Network (CNN) is a technology that recognizes handwritten words formed in the air. The approach uses a CNN model trained on air-writing samples to identify individual words, and word segmentation techniques separate the recognized input into distinct words. Translation mechanisms, such as lookup tables or machine translation models, then translate the identified words from English to Hindi. This technology has the potential to enable intuitive, real-time translation of air-written text, improving human-computer interaction and reducing language barriers.
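The lookup-table route to translation can be illustrated with a minimal sketch. The table name `EN_TO_HI`, its three entries, and both helper functions are hypothetical and chosen only for illustration; the source does not specify the vocabulary or the translation component actually used.

```python
# Hypothetical English-to-Hindi lookup table; the entries are illustrative only.
EN_TO_HI = {
    "hello": "नमस्ते",
    "water": "पानी",
    "book": "किताब",
}


def translate_word(word, table=EN_TO_HI):
    """Return the Hindi entry for a recognized English word, or None if absent."""
    return table.get(word.strip().lower())


def translate_words(words, table=EN_TO_HI):
    """Translate a sequence of recognized words, leaving unknown words unchanged."""
    return [translate_word(w, table) or w for w in words]
```

A lookup table is cheap and predictable but only covers a fixed vocabulary, which is one reason a machine translation model may be preferred for open-ended input.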

The air writing word recognition system uses a Convolutional Neural Network (CNN) to analyze and classify gestures made by the user in the air. By extracting spatial information from the input data, such as the trajectory and shape of the written characters, the CNN enables precise recognition and interpretation of the air-written word.
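One way to expose the trajectory and shape of the writing to a CNN is to draw the tracked fingertip positions onto a small fixed-size grid. The sketch below assumes the trajectory is already available as a list of (x, y) pixel coordinates; the function name `rasterize_trajectory` and the 28 x 28 default size are assumptions, not details taken from the source. It marks only the sampled points and does not connect them with line segments.

```python
def rasterize_trajectory(points, size=28):
    """Map (x, y) fingertip positions onto a size x size binary grid.

    The trajectory is shifted to the origin and scaled uniformly so that it
    fits the grid while keeping its aspect ratio.
    """
    if not points:
        raise ValueError("trajectory is empty")
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    min_x, min_y = min(xs), min(ys)
    # Use the larger side of the bounding box so the shape is not distorted.
    span = max(max(xs) - min_x, max(ys) - min_y) or 1.0
    scale = (size - 1) / span
    grid = [[0] * size for _ in range(size)]
    for x, y in points:
        col = int(round((x - min_x) * scale))
        row = int(round((y - min_y) * scale))
        grid[row][col] = 1  # mark the cell visited by the fingertip
    return grid
```

Scaling by the larger bounding-box side makes the result independent of where in the frame, and how large, the user wrote.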

II. LITERATURE REVIEW

T. Watanabe, Md. Al M. Hasan, et al. (2023) [1] describe air-writing as a form of human-computer interaction that lets a user write words in the air with a finger in a simple manner. In air-writing gesture recognition, gestures written in the air are matched to characters and digits. The gestures vary with each user's writing style, which makes recognition difficult. To address this, the authors propose an air-writing system that uses a web camera. It performs two types of recognition, alphabetic and digit, and uses two datasets, one alphabetic and one numeric. Samples were collected from 16 users, who wrote the characters A to Z and the digits 0 to 9 about 5 to 10 times each, while finger positions were recorded using MediaPipe. In total, 3166 samples were collected for the alphabetic dataset and 1212 for the digit dataset. Word identification is 75% accurate.

K. Navya, S. A. Amreen, et al. (2023) [2] propose an air-writing recognition system that aims to make technology smarter and to provide an alternative when a mobile phone cannot be used. It helps people in urgent situations: writing the character "h" or "p" in the air in front of a CCTV camera signals that someone needs help. The project also sends voice and text messages to bring urgent help, which improves people's safety and helps reduce the crime rate. The system combines computer-vision object tracking with handwriting recognition. It uses a webcam to track the characters and digits that the user writes in the air and optical character recognition to distinguish them. It then uses a Twilio account to make calls and send messages depending on the character or digit that the system recognized. Recognition is 80% accurate.

S. Thanga, R. Sakthi, et al. (2023) [3] treat writing in the air with the fingers as an analogue of pen-based writing. Detecting intended writing among extraneous finger movements that are irrelevant to letters or words is a challenge that common pattern-recognition methods must address. The system reproduces the motion drawn in front of the sensor using a Hidden Markov algorithm and OpenCV, and it is also used for virtual key generation. The output is shown as a 2D trajectory, from which the word error of the drawing is determined. The system also analyzes the 6-DOF (degrees of freedom) motion. The experiments indicate an average recognition rate of 98.3% for digits and numbers.

C. H. Hsieh, Y. Shen Lo, et al. (2021) [5] propose an air-writing recognition system based on deep convolutional neural networks. It focuses on the detection, recognition, and interpretation of characters written in the air using computer vision techniques. The paper introduces two datasets. The first consists of the digits 0 to 9 with variations in writing direction, including clockwise and anticlockwise motions, for a total of 20 symbols. The second is a purely directional symbol dataset of 16 symbols. To enable accurate recognition, the paper proposes a robust air-writing trajectory acquisition algorithm that captures the movement of characters in the air. It also employs a novel hand tracking algorithm, the CamShift algorithm, a skin-pixel detection algorithm, a face detection algorithm, gradient descent optimization algorithms, the mini-batch gradient descent (MBGD) algorithm, and the Adam (Adaptive Moment Estimation) algorithm. These algorithms play a role in training the deep Convolutional Neural Network (CNN) model. With these algorithms and the CNN model, the paper aims to achieve accurate recognition and interpretation of air-written characters.

P. Wang, J. Lin, et al. (2021) [7] propose a gesture air-writing tracking method in which characters are written in the air within a planar area. The paper introduces smoothing algorithms that generate smooth character trajectories and so improve the quality and accuracy of the air-written character paths. In addition, the paper presents a signal model for Frequency-Modulated Continuous Wave (FMCW) radar, along with radar signal parameters, target detection, and tracking. These components enable the radar system to detect and track the air-written characters effectively. By combining the smoothing algorithms with the radar system, the proposed method addresses the challenge of accurately tracking air-written characters in a planar area. The accuracy is 80%.
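To make the idea of trajectory smoothing concrete, the sketch below applies a centered moving average to a list of (x, y) points. This is a generic stand-in chosen for illustration: the specific smoothing algorithms used in the cited paper are not described here, and the function name `smooth_trajectory` and the window size are assumptions.

```python
def smooth_trajectory(points, window=3):
    """Smooth (x, y) points with a centered moving average.

    Near the ends the window is clipped to the available points, so the
    output has the same length as the input.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(points)):
        lo = max(0, i - half)
        hi = min(len(points), i + half + 1)
        n = hi - lo
        mean_x = sum(p[0] for p in points[lo:hi]) / n
        mean_y = sum(p[1] for p in points[lo:hi]) / n
        smoothed.append((mean_x, mean_y))
    return smoothed
```

A larger window removes more jitter but also rounds off sharp corners in the written stroke, so the window size is a trade-off.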

Y. Luo, J. Liu, et al. (2021) [8] propose an air-writing recognition system that uses Dynamic Time Warping (DTW) for improved accuracy. The system employs a 9-axis Micro-Electro-Mechanical System (MEMS) sensor to ensure high recognition accuracy and enhance robustness. In initial tests, the system achieved a total accuracy of 73% for letters, demonstrating its effectiveness for beginner users. Accuracy for numbers reached 98.2%, while accuracy for the alphabet letters ("A" to "Z") averaged around 64%. The proposed system offers several advantages, including small hardware size, low latency, and low computational cost. Its small size and enhanced sensitivity improve performance without compromising individual privacy or being affected by environmental conditions.
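Dynamic Time Warping compares two sequences that may be written at different speeds by allowing samples to be stretched or repeated during alignment. The following is a minimal sketch of the textbook DTW recurrence on one-dimensional sequences, followed by a nearest-template classifier. The function names and the template dictionary are hypothetical, and the cited system's actual features and matching rules are not reproduced here.

```python
def dtw_distance(a, b):
    """Dynamic Time Warping distance between two non-empty 1-D sequences."""
    if not a or not b:
        raise ValueError("sequences must be non-empty")
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] is the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
                cost[i - 1][j - 1],  # advance in both sequences
            )
    return cost[n][m]


def nearest_template(sample, templates):
    """Return the label of the template with the smallest DTW distance."""
    return min(templates, key=lambda label: dtw_distance(sample, templates[label]))
```

For example, `dtw_distance([1, 2, 3], [1, 2, 2, 3])` is zero because the repeated `2` can be absorbed by the warping, which is the property that makes DTW tolerant of writing speed.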

III. PROPOSED WORK

To build a model that interprets motions written in the air as words.

To create a model that translates the recognized text into the target language with the help of a translator.

To create a personalized login system to gain access to this air writing recognition system.

To build a model that will store written data in a database for future reference and analysis.

To create a model that will convert recognized words into speech, enabling real-time audio output of the translated word.
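The storage objective can be sketched with Python's built-in `sqlite3` module. The table name `recognized_words`, its columns, and the three helper functions are assumptions made for illustration; the source does not name the database or schema used.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) the database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS recognized_words (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               username TEXT NOT NULL,
               word TEXT NOT NULL,
               translation TEXT,
               created_at TEXT DEFAULT CURRENT_TIMESTAMP
           )"""
    )
    return conn


def save_word(conn, username, word, translation=None):
    """Insert one recognized word; values are bound as parameters."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO recognized_words (username, word, translation) "
            "VALUES (?, ?, ?)",
            (username, word, translation),
        )


def words_for_user(conn, username):
    """Return (word, translation) pairs for one user in insertion order."""
    rows = conn.execute(
        "SELECT word, translation FROM recognized_words "
        "WHERE username = ? ORDER BY id",
        (username,),
    )
    return rows.fetchall()
```

Keying each row by username ties the stored words to the personalized login, so each user's history can be retrieved separately for later reference and analysis.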

Creating a word recognition and translation system using deep learning involves several essential modules. The key modules used in such a system are:

Hand Detection: In this model, hand tracking is realized by detecting the fingertip of the target hand through image analysis of each video frame. OpenCV, an open-source computer vision and machine learning library, provides the functions needed to process images with various computer vision algorithms, so it is used to detect, track, and save the trajectory of the target fingertip as its position shifts throughout the video. The digital representation of each frame is pre-processed.
Data Acquisition: Whether to create a mask or subtract the background depends on several circumstances, such as whether a separate image of the background is available, how much noise the image contains, and how variable the lighting is. Not all objects have strong features that can be recognized immediately. The system preprocesses each frame by resizing it and reducing Gaussian noise. It then constructs a binary mask around the object and performs morphological transformations to clean the mask up. Contours are found using an OpenCV algorithm that calculates the hierarchy of contours in the image and compresses it.
Segmentation: Segmentation divides a visual input into segments to simplify image analysis. To extract something from the rest of the image, for example to detect an object against a background, the image is broken up into segments on which further processing can be done. Segments represent objects or parts of objects and comprise sets of pixels, or "super-pixels".
Normalization: Changing the range of pixel intensities is a typical image-processing task called normalization. By reducing variance, it primarily aims to transform an input image into a range of values that is more familiar or normal to the senses.
Data Processing: An image is passed through a series of convolutional layers, which extract the features of the image; these features are the input to a trained classifier. The classifier compares the input with the learned patterns and finds the best match.
CNN Algorithm: Convolutional Neural Networks can be used for a variety of tasks. Because of its convolution and pooling layers, a standard CNN involves a large number of calculations; it identifies and classifies objects visually by first processing the images. The CNN contains the following layers:

a. Input Layer: Images are read in here. Feature extraction is then performed in the hidden layers by a series of convolutional layers.

b. Convolution Layer: This layer is used to extract features from the image.

c. Pooling Layer: This layer reduces the spatial dimensionality of the feature maps. Its output is a pooled feature map, which helps reduce visual noise.

d. Activation Layer: This layer introduces non-linearity into the system, which the classification stage of the CNN relies on.

e. Output Layer: This is the final layer, and it is used to classify objects in the CNN algorithm.
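The normalization step and the CNN layers described above can be traced end to end on a tiny image. The sketch below is written in plain Python for readability rather than with a deep learning framework. The kernel, the weights, and all function names are hypothetical and untrained; they only show how data flows from the input through convolution, activation, pooling, and a softmax output.

```python
import math


def normalize(image):
    """Min-max scale pixel intensities to the range [0, 1]."""
    flat = [p for row in image for p in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1
    return [[(p - lo) / span for p in row] for row in image]


def conv2d(image, kernel):
    """Valid (no padding) 2-D cross-correlation, as used in CNN layers."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(fmap):
    """Activation: replace negative responses with zero."""
    return [[max(0.0, v) for v in row] for row in fmap]


def max_pool(fmap, size=2):
    """Non-overlapping max pooling; incomplete edge windows are dropped."""
    return [
        [
            max(fmap[r + i][c + j] for i in range(size) for j in range(size))
            for c in range(0, len(fmap[0]) - size + 1, size)
        ]
        for r in range(0, len(fmap) - size + 1, size)
    ]


def dense(features, weights, biases):
    """Fully connected layer: one weighted sum per output class."""
    return [sum(w * x for w, x in zip(row, features)) + b
            for row, b in zip(weights, biases)]


def softmax(scores):
    """Convert class scores to probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Usage: a 5x5 image containing one vertical stroke in the middle column.
image = [[0, 0, 255, 0, 0] for _ in range(5)]
kernel = [[-1, 1], [-1, 1]]  # hypothetical kernel responding to a left-to-right rise
pooled = max_pool(relu(conv2d(normalize(image), kernel)))
features = [v for row in pooled for v in row]  # flatten the pooled feature map
scores = dense(features, [[1, 0, 1, 0], [0, 1, 0, 1]], [0, 0])  # hypothetical weights
probs = softmax(scores)
```

In a trained network the kernels and dense weights are learned from labelled air-writing samples, and several convolution and pooling stages are usually stacked before the output layer.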

Conclusion

In conclusion, after surveying several applications and drawing on the information presented in the introductory sections, we arrived at a solution. In this paper, we propose deep CNNs for recognizing air-written digits and unique direction signals for control, similar to controlling a smart TV. With a reliable air-writing trajectory acquisition technique based on a web camera, sophisticated finger-tracking methods are bypassed in favour of simple hand tracking. The technology enables users to write words in the air using hand gestures and translates them into written text using convolutional neural networks. It captures hand movements in real time and processes them through the CNN algorithm to recognize the word being written with high accuracy. It is able to directly recognize and interpret user input generated by finger movements in the air. Further work in this area could improve the accuracy, speed, and usability of air writing word recognition and translation systems, making them more widely accessible and useful for a variety of applications. Overall, the air writing recognition system holds great potential to improve human-computer interaction and deliver innovative solutions across various domains.

References

[1] Taiki Watanabe, Md. Maniruzzaman, Md. Al Mehedi Hasan, Hyoun-Sup Lee, "2D Camera-Based Air-Writing Recognition Using Hand Pose Estimation and Hybrid Deep Learning Model", Electronics, vol.10, pp.1-14, 2023.
[2] Koye Navya, Mallela Sowmyah, Shaik Ayesha Amreen, Vemireddy Sravani, Yarrakula Gayathri Devi, Ms. Perli Nava Bhanu, "A Machine Learning Approach for Air Writing Recognition", vol.9, pp.152631-152640, 2023.
[3] S. Thanga Ramya, R. Sakthi, B. Rohitha, D. Praveena, "Air-Writing Recognition System", IEEE Access, vol.10, pp.142534-142545, 2023.
[4] Md. Shahinur Alam, Bong-Gyun Kang, Ki-Chul Kwon, Shariar Md Imtiaz, Md. Biddut Hossain, Nam Kim, "An Efficient and Lightweight Trajectory-Based Air-Writing Recognition Model Using a CNN and LSTM Network", Human Behavior and Emerging Technologies, vol.3, pp.1-13, 2022.
[5] Chaur-Heh Hsieh, You-Shen Lo, Jen-Yang Chen, Sheng-Kai Tang, "Air-Writing Recognition Based on Deep Convolutional Neural Networks", IEEE Access, vol.9, pp.142827-142836, 2021.
[6] Md. Al Siam, Abu Sayeed, Fuad Al Abir, Md. Al Mehedi Hasan, Jungpil Shin, "Deep Learning Based Air-Writing Recognition with the Choice of Proper Interpolation Technique", pp.25-32, 2021.
[7] Pengcheng Wang, Junyang Lin, Fuyue Wang, Jianping Xiu, Yue Lin, Na Yan, Hongtao Xu, "A Gesture Air-Writing Tracking Method that Uses 24 GHz SIMO Radar SoC", IEEE Access, vol.8, pp.152728-152741, 2021.
[8] Yuqi Luo, Jiang Liu, Shigeru Shimamoto, "Wearable Air-Writing Recognition System employing Dynamic Time Warping", 2021 IEEE 18th Annual Consumer Communications & Networking Conference (CCNC), pp.1-6, 2021.
[9] Md. Shahinur Alam, Ki-Chul Kwon, Shariar Md Imtiaz, Md Biddut Hossain, Bong-Gyun Kang, Nam Kim, "TARNet: An Efficient and Lightweight Trajectory-Based Air-Writing Recognition Model Using a CNN and LSTM Network", Human Behavior and Emerging Technologies, vol.2022, pp.1, 2020.
[10] Muhammad Arsalan, Avik Santra, Vadim Issakov, "Spiking Neural Network Based Radar Gesture Recognition System Using Raw ADC Data", IEEE Sensors Letters, vol.6, no.6, pp.1-4, 2019.

Copyright

Copyright © 2024 Ms. Neha Ganesh Karande, Ms. Ankita Chandrakant Chorage, Ms. Tejaswini Dilip Chavan, Ms. Vidya Pravin Bansode, Ms. Shifa Shikalgar. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Paper Id : IJRASET59099

Publish Date : 2024-03-18

ISSN : 2321-9653

Publisher Name : IJRASET
