In the present scenario, daily life is difficult for people who are unable to speak, for a variety of reasons. The condition is not always permanent: muteness and deafness can be caused by, or manifest due to, several different phenomena, such as physiological injury, illness, medical side effects, psychological trauma, developmental disorders, or neurological disorders. People affected by these conditions typically use sign language to communicate with other people. However, not everyone is acquainted with communicating through sign language. This project focuses on developing a solution that regular people can use to understand people with mutism.
Hand sign recognition systems have evolved significantly with advancements in machine learning and deep learning techniques. These systems are designed to recognize gestures or signs, making them useful in applications such as sign language interpretation, human-computer interaction, and virtual reality. This paper gives an overview of the current state of hand sign recognition systems, highlighting the techniques used, the challenges, and possible future directions. The focus of the paper is on deep learning-based methods and their impact on the accuracy and usability of hand sign recognition in real-time applications [1].
I. INTRODUCTION
When a person with mutism (the inability to speak) or deafness (the inability to hear) wants to communicate, whether for daily needs or ordinary conversation, it can be very difficult for them to make others understand their thoughts.
Our application focuses on filling this communication gap between these two groups by providing a software solution that regular people can use. The application offers an easy-to-use UI that anyone can operate, making communication seamless. It includes a feature that converts a sign into audio output: the application captures live video and produces audio output for the recognized sign [1].
American Sign Language is the predominant sign language used by deaf and mute (D&M) people. Since the only disability D&M people have is communication-related and they cannot use spoken languages, the only way for them to communicate is through sign language. Communication is the process of exchanging thoughts and messages in various ways, such as speech, signals, behavior, and visuals. D&M people use their hands to make different gestures to express their ideas to other people. Gestures are nonverbally exchanged messages that are understood through vision, and this nonverbal communication is called sign language. In our project we focus on producing a model that can recognize fingerspelling-based hand gestures and form a complete word by combining the individual gestures [2].
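As a rough illustration of this letter-combining step (a sketch only, not the project's actual implementation), the following Python snippet shows how a stream of per-frame letter predictions from a hypothetical frame-level classifier could be debounced into a word by requiring each letter to stay stable for several consecutive frames.

from collections import deque

def letters_to_word(predictions, stable_frames=10):
    """Accept a letter only after it has been predicted for `stable_frames`
    consecutive frames, filtering out transitions between signs."""
    word = []
    recent = deque(maxlen=stable_frames)
    for letter in predictions:                      # e.g. per-frame classifier output
        recent.append(letter)
        if len(recent) == stable_frames and len(set(recent)) == 1:
            if not word or word[-1] != letter:      # avoid repeating the same stable letter
                word.append(letter)
    return "".join(word)

# Example: a stream of per-frame predictions spelling "HI".
stream = ["H"] * 12 + ["I"] * 12
print(letters_to_word(stream))                      # -> "HI"

A real system would also need a pause or "space" gesture to separate words and to allow genuinely repeated letters.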
II. LITERATURE SURVEY
1) Muthu Mariappan H, Dr. Gomathi V. Real-Time Recognition of Indian Sign Language. Second International Conference on Computational Intelligence in Data Science (ICCIDS-2019), 978-1-5386-9471-8.
In this study, Muthu Mariappan H and Dr. Gomathi V present a real-time algorithm for recognizing Indian Sign Language (ISL). Using a vision-based approach with OpenCV skin segmentation followed by the fuzzy c-means clustering algorithm for gesture recognition, the system is designed to facilitate communication with hearing-impaired individuals and achieved 75% accuracy in recognizing 40 ISL words. However, the study highlights challenges of processing time and resolution when working with large datasets. Future work suggests combining convolutional neural networks (CNNs) with recurrent neural networks (RNNs) to improve detection accuracy and efficiency.
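For readers unfamiliar with the clustering step, the following NumPy sketch implements textbook fuzzy c-means; it is not the authors' code, and the randomly generated feature vectors stand in for features extracted from skin-segmented hand regions.

import numpy as np

def fuzzy_c_means(X, c, m=2.0, max_iter=100, tol=1e-5, seed=0):
    """Minimal fuzzy c-means. X: (n_samples, n_features). Returns centers and memberships."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)                     # memberships sum to 1 per sample
    for _ in range(max_iter):
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]    # weighted cluster centers
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-10
        ratio = dist[:, :, None] / dist[:, None, :]       # d_ij / d_ik for all clusters k
        U_new = 1.0 / (ratio ** (2.0 / (m - 1.0))).sum(axis=2)
        if np.linalg.norm(U_new - U) < tol:
            U = U_new
            break
        U = U_new
    return centers, U

# Illustrative use: cluster hypothetical hand-shape feature vectors into 3 gesture groups.
features = np.random.rand(200, 10)
centers, memberships = fuzzy_c_means(features, c=3)
labels = memberships.argmax(axis=1)                       # hard assignment per sample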
2) Zhen Zhang, Ziyi Su, Ge Yang. Real-time Chinese Sign Language Recognition based on Artificial Neural Networks. Proceedings of the IEEE International Conference on Robotics and Biomimetics, 978-1-7281-6321-5.
Zhen Zhang, Ziyi Su, and Ge Yang propose a Chinese Sign Language (CSL) recognition model using surface electromyographic (sEMG) signals and artificial neural networks (ANNs). The system uses a MYO armband to acquire data, and a sliding-window method is used for feature extraction. It achieves 88.7% accuracy in recognizing 15 CSL gestures, with a response time of 300 ms, highlighting its real-time capability but noting challenges such as distinguishing gestures that produce similar muscle activity. The study suggests further refinements in signal processing to increase classification accuracy.
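The sliding-window step can be illustrated with a short sketch. The data here are synthetic 8-channel samples, and the two time-domain features, mean absolute value (MAV) and root mean square (RMS), are common choices rather than the exact features used in the cited paper.

import numpy as np

def sliding_window_features(emg, win_len=200, step=50):
    """emg: (n_samples, n_channels) raw sEMG stream.
    Returns (n_windows, 2 * n_channels) with MAV and RMS per channel."""
    feats = []
    for start in range(0, emg.shape[0] - win_len + 1, step):
        w = emg[start:start + win_len]
        mav = np.mean(np.abs(w), axis=0)            # mean absolute value per channel
        rms = np.sqrt(np.mean(w ** 2, axis=0))      # root mean square per channel
        feats.append(np.concatenate([mav, rms]))
    return np.array(feats)

# Example: synthetic 8-channel armband data.
emg_stream = np.random.randn(2000, 8)
X = sliding_window_features(emg_stream)
print(X.shape)                                      # (37, 16) feature vectors for an ANN classifier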
3) Weizhe Wang, Hongwu Yang. Towards Realizing Sign Language to Emotional Speech Conversion by Deep Learning. 12th International Symposium on Chinese Spoken Language Processing (ISCSLP), 978-1-7281-6994-1.
Weizhe Wang and Hongwu Yang propose a framework for converting sign language into emotional speech using deep learning techniques. They combine a gesture recognition model and a facial expression recognition model based on a Deep Convolutional Generative Adversarial Network (DCGAN) with an emotional speech synthesis model based on a hybrid Long Short-Term Memory (LSTM) network. The system achieves 93.96% accuracy in gesture recognition and 96.01% in facial expression recognition. Despite its high performance, the study identifies limitations in adapting to diverse emotional expressions across different users. Future work aims to improve adaptability by expanding the training data with more diverse emotional expressions.
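As a loose illustration of the sequence-modeling idea only (not the DCGAN/hybrid-LSTM architecture of this paper), the following Keras sketch defines a small LSTM classifier over sequences of per-frame hand-keypoint features; the input shape and class count are assumptions.

import tensorflow as tf

# Hypothetical sketch: classify gesture sequences of 30 frames x 63 keypoint values.
num_classes = 20
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(30, 63)),
    tf.keras.layers.LSTM(64),                              # summarize the gesture sequence
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(num_classes, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()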
III. METHODOLOGY
This project aims to develop a system that allows people with speech impairments to communicate effectively with non-signers. The system will recognize hand gestures made by the user and convert them into corresponding audio output, bridging the communication gap. The methodology combines hardware and software components for gesture recognition, machine learning for accuracy, and real-time processing for practical usability. Below is a detailed outline of the methodology:
1) Problem Definition and Objective
2) System Overview
3) Data Collection and Preprocessing
4) Hand Gesture Recognition Model
5) Hardware and Sensors
6) Real-Time Gesture Recognition
7) Text-to-Speech Conversion
8) User Interface
9) Testing and Validation
This methodology outlines the development of a system that converts hand signs into audio output using advanced gesture recognition techniques. By leveraging deep learning, sensor-based hardware, or computer vision, the project aims to create an accessible, real-time communication tool for individuals with speech impairments. The proposed system can significantly improve the ease with which people with mutism communicate with non-signers in daily life.
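A minimal Python sketch of such a capture-recognize-speak pipeline is given below. The model file name, input size, label set, and confidence threshold are assumptions for illustration; the sketch assumes OpenCV for capture, a pre-trained Keras classifier, and pyttsx3 for speech output, not the project's final implementation.

import cv2
import numpy as np
import pyttsx3
import tensorflow as tf

# Hypothetical labels and model; "gesture_cnn.h5" is an assumed pre-trained classifier.
LABELS = ["hello", "thanks", "yes", "no", "help"]
model = tf.keras.models.load_model("gesture_cnn.h5")
tts = pyttsx3.init()

cap = cv2.VideoCapture(0)
last_spoken = None
while True:
    ok, frame = cap.read()
    if not ok:
        break
    roi = cv2.resize(frame, (64, 64)) / 255.0            # match the assumed model input size
    probs = model.predict(roi[None, ...], verbose=0)[0]
    label = LABELS[int(np.argmax(probs))]
    if probs.max() > 0.9 and label != last_spoken:        # speak only confident, new signs
        tts.say(label)
        tts.runAndWait()
        last_spoken = label
    cv2.putText(frame, label, (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
    cv2.imshow("Sign to Speech", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()

In practice, hand detection or segmentation would precede classification, and the confidence threshold and debouncing would be tuned during the testing and validation stage.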
IV. USE CASE DIAGRAM & SYSTEM DIAGRAM
V. FUTURE WORK
Real-Time Performance Optimization: Improving the system's processing speed and efficiency to ensure low latency during recognition and conversion is essential for real-time applications.
Personalization and User Adaptation: Developing user-specific adaptation mechanisms, where the system can be trained or calibrated for an individual user's gesture style, can improve recognition accuracy.
Support for Multiple Sign Languages: Expanding the system to recognize multiple sign languages, including American Sign Language (ASL), British Sign Language (BSL), and Indian Sign Language (ISL), would make it versatile and useful for a wider audience.
Emotion Recognition and Expression in Speech Output: Including facial expression analysis and emotion recognition alongside hand gesture recognition could enable the system to convey not only the words but also the emotional tone of the sign language user, resulting in more expressive and accurate speech synthesis.
VI. CONCLUSION
In this paper, various methods for gesture recognition are discussed, including neural networks, HMMs, and fuzzy c-means clustering, as well as orientation histograms for feature representation. For dynamic gestures, HMM tools are well suited and have shown their efficiency, especially for robot control. Neural networks are used as classifiers and for capturing hand shape. Feature extraction requires specific methods and algorithms even to capture the shape of the hand; for example, fitting a bivariate Gaussian function to the segmented hand can be used to minimize the effect of rotation. The selection of a specific recognition algorithm depends on the intended application. In this work, application areas for gesture systems are presented, gesture recognition issues are explained, recent recognition systems are discussed in detail, and summaries of selected systems are listed.
REFERENCES
[1] Muthu Mariappan H, Dr. Gomathi V. (2019). “Real-Time Recognition of Indian Sign Language,” Second International Conference on Computational Intelligence in Data Science (ICCIDS-2019), 978-1-5386-9471-8.
[2] Zhen Zhang, Ziyi Su, Ge Yang. (2019). “Real-Time Chinese Sign Language Recognition Based on Artificial Neural Networks,” Proceedings of the IEEE International Conference on Robotics and Biomimetics, 978-1-7281-6321-5.
[3] Mansi Patel, Anjali Deshmukh, Arjun Sethi. (2021). “Sign Language Translation Systems for Hearing/Speech Impaired People: A Review,” International Conference on Innovative Practices in Technology and Management (ICIPTM), 978-1-6654-2530-8.
[4] Weizhe Wang, Hongwu Yang. (2021). “Towards Realizing Sign Language to Emotional Speech Conversion by Deep Learning,” 12th International Symposium on Chinese Spoken Language Processing (ISCSLP), 978-1-7281-6994-1.
[5] G. R. S. Murthy, R. S. Jadon. (2009). “A Review of Vision Based Hand Gestures Recognition,” International Journal of Information Technology and Knowledge Management, vol. 2(2), pp. 405-410.
[6] P. Garg, N. Aggarwal and S. Sofat. (2009). “Vision Based Hand Gesture Recognition,” World Academy of Science, Engineering and Technology, Vol. 49, pp. 972-977.
[7] Fakhreddine Karray, Milad Alemzadeh, Jamil Abou Saleh, Mo Nours Arab. (2008). “Human Computer Interaction: Overview on State of the Art,” International Journal on Smart Sensing and Intelligent Systems, Vol. 1(1).