Ijraset Journal For Research in Applied Science and Engineering Technology
Authors: Dr. G C Bhanuprakash, Meghana Dinesh, Ishaka Fatnani, Nevethitha Prasad, Mansi M
DOI Link: https://doi.org/10.22214/ijraset.2022.44951
Communication is the basic medium through which we interact and understand each other, but this process can be difficult for people with disabilities. This project is a small contribution toward addressing that issue and helping them communicate better. It uses neural networks to let people with disabilities communicate using only their hand gestures. The hand gesture recognizer and actuator system acts as an interface between human and computer through hand gestures. The project combines gesture identification and classification: it displays the message pertaining to the recognized gesture while simultaneously reading the message out loud. The user performs a particular gesture; the webcam captures it, the system recognizes it against a set of known gestures, and the corresponding message is displayed and vocalized. This hands-free approach is particularly useful in emergencies, when the system gives the user an effective way to alert the people around them. The system is trained on a set of sample gestures, and the trained model is then evaluated against a set of test data.
I. INTRODUCTION
People with disabilities are an integral part of our society. With the advent of science and technology, efforts are being made to develop systems that improve accessibility. Hand gestures and sign languages have long been used by people with hearing and speech difficulties to communicate, but it is not always easy to find someone who understands sign language and can translate it. A human-computer interaction system, by contrast, can be installed and used almost anywhere. A gesture is a physical action performed by a person to convey a message; hand gestures are a technique of non-verbal communication for human beings.
Object detection is a computer technology, related to computer vision and image processing, that deals with detecting instances of semantic objects of a certain class (such as humans, buildings, or cars) in digital images and videos. Well-researched domains of object detection include face detection and pedestrian detection, and the technology has applications in many areas of computer vision, including image retrieval and video surveillance. In recent years many papers have used CNNs to build hand gesture recognition systems, but very few connect the recognized gesture to a useful action. Our motive in this project was therefore not only to build a hand gesture recognizer but also an actuator that performs the required tasks: displaying an alert message dedicated to the recognized gesture and vocalizing it in order to attract the attention of the people nearby.
This project is a small contribution toward addressing this issue, employing neural networks to let people with disabilities communicate short messages using only hand gestures. The hand gesture recognizer and actuator system provides the interface between the computer and the user: it combines gesture recognition with message classification, displaying and vocalizing the message appropriate to the recognized gesture. The user makes a specific gesture, which the webcam records; the system recognizes it against the set of known gestures and then displays the message that goes with it. This hands-free technique is very practical. A set of training samples is fed to the system, and the trained model is then tested on a set of test data. Once suitable accuracy is achieved, TensorFlow is used to load the model, and MediaPipe is used to detect the hand and feed it to the model. The webcam then captures real-time video, and the message corresponding to the recognized gesture is displayed as well as vocalized, as sketched below.
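To make this loop concrete, here is a minimal sketch, assuming a Keras model saved as gesture_model.h5 that classifies a flattened vector of the 21 MediaPipe hand landmarks, a hypothetical label list, and pyttsx3 for the voice output; these names and parameters are illustrative, not taken from the paper itself.

```python
import cv2
import numpy as np
import mediapipe as mp
import tensorflow as tf
import pyttsx3

# One alert message per gesture class; this label set is illustrative,
# not the paper's actual set of 10 gestures.
MESSAGES = ["help", "water", "food", "yes", "no"]

model = tf.keras.models.load_model("gesture_model.h5")  # assumed file name
engine = pyttsx3.init()
hands = mp.solutions.hands.Hands(max_num_hands=1)

cap = cv2.VideoCapture(0)
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    # MediaPipe expects RGB input; OpenCV delivers BGR.
    result = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    if result.multi_hand_landmarks:
        lm = result.multi_hand_landmarks[0].landmark
        # Flatten the 21 (x, y, z) landmarks into one feature vector.
        features = np.array([[p.x, p.y, p.z] for p in lm]).reshape(1, -1)
        probs = model.predict(features, verbose=0)[0]
        message = MESSAGES[int(np.argmax(probs))]
        # Display the alert on the frame and vocalize it.
        cv2.putText(frame, message, (10, 40),
                    cv2.FONT_HERSHEY_SIMPLEX, 1.0, (0, 255, 0), 2)
        engine.say(message)
        engine.runAndWait()  # blocking: a source of the lag noted in Section IV
    cv2.imshow("gesture recognizer", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()
```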
II. LITERATURE SURVEY
Debasish Bal et al. describe a hand-glove system for deaf and mute patients that uses hand gesture patterns for communication. The glove hardware includes flex sensors, which measure the amount of bending or deflection, and an MPU-6050, which records acceleration, orientation, and velocity using a 3-axis gyroscope and a 3-axis accelerometer; the flex sensors and MPU-6050 are connected to a Raspberry Pi 3. To build the gesture dataset, several dynamic hand movements were recorded 20 times from 88 individuals, giving a total of 1,760 samples. The model was then evaluated with 23 physically challenged individuals, each performing the hand gestures 20 times, reaching an overall accuracy of 92.7 percent. Although this glove system uses three flex sensors, five flex sensors could be employed for better performance. [1]
Zehao Wang et al. propose a block-sparse time-frequency (TF) feature extraction method for recognizing dynamic hand gestures with millimetre-wave radar sensors, building on the observation that the micro-Doppler features of hand gestures are more clustered than independently scattered points. This radar-based recognition pipeline consists of data representation, feature extraction, and classification: a block sparse model is applied to model the TF distribution of gestures, TF features are extracted with the block orthogonal matching pursuit (BOMP) algorithm, and a kNN classifier classifies the extracted block features. The dataset consists of 400 hand gestures, with each subject repeating a specific gesture 25 times. The highest accuracy in the classification stage is 89.5%. [2]
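The final classification stage of such a pipeline can be illustrated with a small sketch; the BOMP feature extraction itself is omitted, the feature vectors below are random placeholders, and scikit-learn's kNN stands in for the classifier used in [2].

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split

# Placeholder data: 400 gestures with assumed 64-dimensional block-sparse
# TF feature vectors (the real features would come from the BOMP step).
X = np.random.rand(400, 64)
y = np.repeat(np.arange(16), 25)  # illustrative: 16 classes, 25 repetitions each

X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y,
                                                    random_state=0)
knn = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
print(f"test accuracy: {knn.score(X_test, y_test):.3f}")
```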
Bhargav Jethwa et al. address communication between hearing- and speech-impaired people and non-disabled people. Communication is carried out through the specially abled person's mobile phone in the form of audio and text notifications. The authors use IMU sensors to record temporal signals, which are transmitted to the disabled person's phone over Bluetooth 2.0 and pre-processed with a Kalman filter to remove noise. The pre-processed data is used to train a 1D CNN model, whose output is the audio and text of the identified gesture on the phone, as sketched after this paragraph. The training accuracy achieved is 98.06% and the testing accuracy is 97.96%; once the data reaches the phone, the model takes 0.56 s to predict the pattern and display the result. [3]
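As an illustration of the kind of model described in [3], here is a minimal 1D-CNN sketch in Keras; the window length, channel count, and layer sizes are assumptions rather than the authors' actual architecture.

```python
import tensorflow as tf

NUM_CLASSES = 10         # number of gestures (illustrative)
WINDOW, CHANNELS = 128, 6  # e.g. 128 samples of 3-axis accel + 3-axis gyro

# A small 1D CNN over fixed-length windows of IMU signals.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(WINDOW, CHANNELS)),
    tf.keras.layers.Conv1D(32, kernel_size=5, activation="relu"),
    tf.keras.layers.MaxPooling1D(2),
    tf.keras.layers.Conv1D(64, kernel_size=5, activation="relu"),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```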
Mohd. Aquib Ansari et al. use a robust approach to dynamic hand gesture recognition for human-machine interaction (HMI) systems. A webcam captures images in real time; skin-colour modelling is done in the RGB and HSV colour spaces to segment a geometrical approximation of the hand using contours, and the hand is located in the image using a region of interest (ROI). Fingers are tracked using contour defects, while the hand is tracked using centroid tracking. The method covers eight different types of dynamic hand movements, all of which the system recognizes effectively. Of the two interface technique types, glove-based and vision-based, this is a vision-based approach: the proposed pipeline applies a median filter, skin-colour segmentation, and morphological operations, then sends the pre-processed images on for hand detection, as sketched below. The accuracy is up to 95% for dynamic hand gesture recognition. [4]
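The segmentation chain in [4] can be approximated with standard OpenCV calls; the HSV skin range below is a common heuristic rather than the authors' exact thresholds.

```python
import cv2
import numpy as np

def segment_hand(frame_bgr):
    """Return the largest skin-coloured contour and its centroid (or None)."""
    blurred = cv2.medianBlur(frame_bgr, 5)                # median filter
    hsv = cv2.cvtColor(blurred, cv2.COLOR_BGR2HSV)
    # Rough skin-colour range in HSV; an assumption, tune per lighting.
    mask = cv2.inRange(hsv, (0, 30, 60), (20, 150, 255))
    kernel = np.ones((5, 5), np.uint8)
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)  # morphological op
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None, None
    hand = max(contours, key=cv2.contourArea)  # assume biggest blob is the hand
    m = cv2.moments(hand)
    if m["m00"] == 0:
        return hand, None
    centroid = (int(m["m10"] / m["m00"]), int(m["m01"] / m["m00"]))
    return hand, centroid
```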
Dr. P. Muralidhar et al. propose in [5] a real-time system that can be customized to user-specific dynamic hand movements and can carry out certain tasks, especially for motor-impaired people. The paper uses a cascade detection algorithm to track the movements of the hand and trace its path, and a Siamese neural network to customize and recognize the gestures. When operating the system for the first time, the user records the desired number of gestures for the various operations. Once the gestures are recorded, the user can perform any of them; the traced trajectory of the current gesture is matched against the already stored images, and the gesture with the highest matching probability is then performed. [5]
M. I. N. P. Munasinghe designed and developed in [6] a system that uses motion history images (MHI) and feedforward neural networks to recognize gestures in front of a web camera in real time. First a Gaussian-mixture-based background/foreground segmentation algorithm and median filtering are applied, then binary thresholding with Otsu's binarization; the processed frames are then merged, using an algorithm developed by the author based on structural similarity, to generate a cumulative motion history image. To classify the gestures the author uses a stochastic-gradient-based optimizer, and a neural network determines the probabilities for each category: if the highest category probability exceeds 0.8, that category is identified as the recognized gesture. The real-time experiments using motion history images with neural networks give satisfactory results; a sketch of the pre-processing chain follows. [6]
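The pre-processing chain of [6] can be approximated with OpenCV primitives; the decaying accumulation below is a simple stand-in for the author's structural-similarity-based motion history merging, which is not reproduced here.

```python
import cv2
import numpy as np

subtractor = cv2.createBackgroundSubtractorMOG2()
history = None
DECAY = 0.9  # how quickly old motion fades (illustrative value)

cap = cv2.VideoCapture(0)
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    fg = subtractor.apply(frame)   # Gaussian-mixture background/foreground split
    fg = cv2.medianBlur(fg, 5)     # median filtering
    # Binary thresholding with Otsu's binarization.
    _, binary = cv2.threshold(fg, 0, 255,
                              cv2.THRESH_BINARY | cv2.THRESH_OTSU)
    if history is None:
        history = np.zeros_like(binary, dtype=np.float32)
    # Simple decaying accumulation as a stand-in for the cumulative MHI.
    history = np.maximum(history * DECAY, binary.astype(np.float32))
    cv2.imshow("motion history", history.astype(np.uint8))
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()
```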
III. METHODOLOGY
Our first step is to capture real-time video through the web camera. This real-time video is fed to the model after a few pre-processing steps, so that the model receives a clear image of the gesture for classification and detection.
A. Pre-Processing Steps
The principal pre-processing step is hand detection and landmark extraction, performed with MediaPipe Hands.
MediaPipe Hands is a high-fidelity hand and finger tracking solution. It uses machine learning (ML) to infer 21 3D hand landmarks from a single frame. Unlike state-of-the-art approaches that rely mostly on powerful desktop environments for inference, it delivers real-time performance on a mobile phone and even scales to multiple hands. MediaPipe Hands uses an ML pipeline of two interconnected models: a palm detection model that takes the entire image as input and produces an oriented hand bounding box, and a hand landmark model that operates on the image region cropped by the palm detector and outputs highly accurate 3D hand keypoints. A minimal usage sketch follows.
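Below is a minimal sketch of extracting these landmarks from a single image with the MediaPipe Hands Python API; the file name and printout format are illustrative.

```python
import cv2
import mediapipe as mp

mp_hands = mp.solutions.hands

# static_image_mode=True runs palm detection on every image, which suits
# single-image use; the real-time pipeline would leave it False.
with mp_hands.Hands(static_image_mode=True, max_num_hands=2) as hands:
    image = cv2.imread("gesture.jpg")  # illustrative path
    results = hands.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
    if results.multi_hand_landmarks:
        for hand in results.multi_hand_landmarks:
            # Each hand yields 21 landmarks with normalized x, y
            # and a depth-relative z coordinate.
            for i, lm in enumerate(hand.landmark):
                print(f"landmark {i:2d}: x={lm.x:.3f} y={lm.y:.3f} z={lm.z:.3f}")
```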
IV. RESULTS
Gestures are detected with 99% accuracy. The only drawback is the lag caused by the load on the system. The model detects 10 gestures and can be trained to add further custom gestures; this self-training ability makes it more user-friendly.
V. CONCLUSION
Gesture recognition provides one of the most important means of non-verbal interaction, especially for people with hearing and speech impairments, and hand gesture recognition has been applied in many different domains. This project outlines a hand gesture recognition system designed to recognize hand gestures in real time. The aim is to compare different algorithms for training and testing in order to discover the best algorithm for extracting and classifying hand gestures. The ability of neural networks to generalize makes them a natural fit for gesture recognition. Video captured in real time is therefore used to recognize the customized gesture for the alert message, and the audio feature of the system alerts the people around as to what message the gesture conveys. Future work on such software is to reduce the lag caused by the addition of the voiceover, which would make the system more pleasant to use and more efficient. A bigger development would be for the software to send alert messages to emergency services when necessary; this feature, if added, could be lifesaving in many circumstances.
REFERENCES
[1] Debasish Bal et al., "Dynamic Hand Gesture Pattern Recognition Using Probabilistic Neural Network", 2021 IEEE International IOT, Electronics and Mechatronics Conference (IEMTRONICS), 2021. DOI: 10.1109/IEMTRONICS52119.2021.9422496.
[2] Zehao Wang et al., "Feature Extraction for Dynamic Hand Gesture Recognition Using Block Sparsity Model", 2021 IEEE/MTT-S International Microwave Symposium (IMS), 2021. DOI: 10.1109/IMS19712.2021.9574796.
[3] Abhi Zanzarukiya et al., "Assistive Hand Gesture Glove for Hearing and Speech Impaired", Proceedings of the Fourth International Conference on Trends in Electronics and Informatics (ICOEI), 2020. ISBN: 978-1-7281-5518-0.
[4] Mohd. Aquib Ansari et al., "An Approach for Human Machine Interaction using Dynamic Hand Gesture Recognition", 2019 IEEE Conference on Information and Communication Technology (CICT), 2019. ISBN: 978-1-7281-5398-8.
[5] P. Muralidhar et al., "Customizable Dynamic Hand Gesture Recognition System for Motor Impaired People using Siamese Neural Network", 2019 IEEE. ISBN: 978-1-5386-8448-1.
[6] M. I. N. P. Munasinghe, "Dynamic Hand Gesture Recognition Using Computer Vision and Neural Networks", 2018 3rd International Conference for Convergence in Technology (I2CT), 2018. ISBN: 978-1-5386-4273-3.
Copyright © 2022 Dr. G C Bhanuprakash, Meghana Dinesh, Ishaka Fatnani, Nevethitha Prasad, Mansi M. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Paper Id : IJRASET44951
Publish Date : 2022-06-27
ISSN : 2321-9653
Publisher Name : IJRASET