Ijraset Journal For Research in Applied Science and Engineering Technology
Authors: Akash Kamble, Jitendra Musale, Rahul Chalavade, Rahul Dalvi, Shrikar Shriyal
DOI Link: https://doi.org/10.22214/ijraset.2023.51981
Abstract: Sign language is a form of communication that uses hand signs and gestures to convey meaning. We present a new approach to converting sign language into text, designed to enable deaf and mute people to communicate with others in a more accessible and convenient way. The proposed method uses computer vision and deep learning to recognize hand gestures and translate them into the appropriate text: the system combines key point detection with MediaPipe, data pre-processing, label and feature generation, and the training of an LSTM neural network that maps the recognized gestures to text output. This conversion not only helps deaf and hard of hearing individuals communicate with the hearing population, but also serves as an assistive tool for people who are learning sign language. Overall, the proposed solution has the potential to significantly improve communication and reduce barriers between deaf and hard of hearing individuals and the rest of the world.
I. INTRODUCTION
Effective communication is essential in all aspects of life, and it is especially important for individuals who are deaf or hard of hearing. With the rising number of people suffering from hearing loss, it is crucial to find ways to bridge the communication gap between the hearing and non-hearing population. To address this issue, we present a new system for converting sign language into text using computer vision and machine learning techniques, aiming to provide an efficient and accessible way for deaf and hard of hearing individuals to communicate with the hearing population. Communication has a great impact in every domain of life, and the need to convey thoughts and expressions across the gap between hearing and deaf people has long attracted researchers. According to the World Health Organization, by 2050 nearly 2.5 billion people are projected to have some degree of hearing loss and at least 700 million will require hearing rehabilitation; over 1 billion young adults are at risk of permanent, avoidable hearing loss due to unsafe listening practices. Sign languages vary among regions and countries, with Indian, Chinese, American, and Arabic sign languages among the major ones in use today. It is difficult to find a human sign language interpreter at every time and place, but an electronic translation system can be installed almost anywhere. Computer vision is one of the emerging frameworks in object detection and is widely used across artificial intelligence research. This system focuses on Indian Sign Language: it extracts MediaPipe Holistic key points for hand gesture recognition, builds a sign language model using an action detection network powered by LSTM layers, and predicts Indian Sign Language in real time. The use of these technologies and efficient algorithms makes the system a valuable tool for improving communication between deaf and hard of hearing individuals and the rest of the world.
II. LITERATURE REVIEW
The paper by Van Hieu and Nitsuwat [10] describes a new image preprocessing and feature extraction approach for Sign Language Recognition (SLR) based on Hidden Markov Models (HMMs). Gesture videos are split into image sequences and converted into the YCbCr color space, and a multi-layer neural network builds an approximate skin model from the Cb and Cr color components of sample pixels. Using this skin model, the approach can accurately identify and extract the hand area in each image.
III. METHODOLOGY
A. Data Collection
To develop the Sign Language to Text Conversion System, a large and diverse dataset of hand gestures representing Indian Sign Language is required.
This dataset is collected with the help of a webcam and the Media Pipe library. The Media Pipe library provides the tools to track the hand gestures in real-time and place key points on the user's hand. The webcam captures the hand gestures and stores them as data samples for the dataset.
The collected data is used to train and test the machine learning model, which is responsible for recognizing the hand gestures and converting them into text. To ensure the robustness and accuracy of the system, it is important to collect a diverse and representative dataset, covering a wide range of hand gestures and variations in hand movements. The data collection process is ongoing, and the dataset is continually updated to ensure that it accurately reflects the Indian Sign Language. With the help of the Media Pipe library and webcam, we can collect high-quality data samples to build a robust and accurate Sign Language to Text Conversion System.
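As an illustration of this collection step, the following minimal sketch captures webcam frames with OpenCV and passes them through the MediaPipe Holistic model to obtain key points. The 30-frame sample length, the output file name, and the `extract_keypoints` helper are illustrative assumptions, not the authors' exact code.

```python
import cv2
import numpy as np
import mediapipe as mp

mp_holistic = mp.solutions.holistic

def extract_keypoints(results):
    # Flatten pose, face, and both hand landmarks into one feature vector.
    # Missing landmarks become zeros so every frame has the same length
    # (33*4 + 468*3 + 21*3 + 21*3 = 1662 values).
    pose = (np.array([[lm.x, lm.y, lm.z, lm.visibility]
                      for lm in results.pose_landmarks.landmark]).flatten()
            if results.pose_landmarks else np.zeros(33 * 4))
    face = (np.array([[lm.x, lm.y, lm.z]
                      for lm in results.face_landmarks.landmark]).flatten()
            if results.face_landmarks else np.zeros(468 * 3))
    lh = (np.array([[lm.x, lm.y, lm.z]
                    for lm in results.left_hand_landmarks.landmark]).flatten()
          if results.left_hand_landmarks else np.zeros(21 * 3))
    rh = (np.array([[lm.x, lm.y, lm.z]
                    for lm in results.right_hand_landmarks.landmark]).flatten()
          if results.right_hand_landmarks else np.zeros(21 * 3))
    return np.concatenate([pose, face, lh, rh])

cap = cv2.VideoCapture(0)
with mp_holistic.Holistic(min_detection_confidence=0.5,
                          min_tracking_confidence=0.5) as holistic:
    frames = []
    while len(frames) < 30:  # assumed 30 frames per gesture sample
        ok, frame = cap.read()
        if not ok:
            break
        # MediaPipe expects RGB images; OpenCV captures BGR.
        results = holistic.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        frames.append(extract_keypoints(results))
cap.release()
np.save('sample_0.npy', np.array(frames))  # one data sample for the dataset
```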
B. Data Pre-Processing
Pre-processing the hand gesture images is an important step in the development of the Sign Language to Text Conversion System. The purpose of pre-processing the images is to prepare them for the machine learning model, making it easier for the model to recognize the hand gestures and translate them into text.
During the pre-processing step, the images of hand gestures are resized, normalized, and transformed to make them suitable for input into the machine learning model. The images are resized to a consistent size, so that the model can easily process them. The normalization step is performed to remove any inconsistencies in the lighting, background, or colour of the images, which can negatively impact the performance of the model.
In addition to resizing and normalization, the images may also undergo a transformation process, such as cropping or rotation, to ensure that the model has a consistent view of the hand gestures. This helps to reduce the variance in the data and makes it easier for the model to recognize the hand gestures.
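As a concrete illustration, the sketch below applies the resizing, normalization, and optional cropping just described to a single gesture image; the target size and crop margin are assumed values chosen for illustration.

```python
import cv2
import numpy as np

def preprocess_image(image, size=(224, 224), crop_margin=0):
    # Optionally crop a fixed margin so the hand region dominates the frame.
    if crop_margin > 0:
        h, w = image.shape[:2]
        image = image[crop_margin:h - crop_margin, crop_margin:w - crop_margin]
    # Resize to a consistent input size for the model.
    image = cv2.resize(image, size)
    # Normalize pixel intensities to [0, 1] to reduce lighting variance.
    return image.astype(np.float32) / 255.0
```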
Once the pre-processing step is complete, the images are ready to be used for training and testing the machine learning model. The pre-processed images provide the model with the information it needs to learn the relationship between the hand gestures and the corresponding text, allowing it to recognize and translate the hand gestures into text with high accuracy. With the help of pre-processing, the Sign Language to Text Conversion System becomes a powerful tool for helping deaf and hard of hearing people communicate with others.
C. Labelling Text Data
Labelling the hand gestures is an important step in the development of the Sign Language to Text Conversion System. In this step, each hand gesture in the dataset is assigned a label representing the word or phrase it represents. This labelling process is crucial as it provides the machine learning model with the information it needs to recognize and translate the hand gestures into text.
The labels for the hand gestures are based on the Indian Sign Language and are created in accordance with the standard terminology and grammar used in the language. The labels are assigned by an expert in Indian Sign Language, who ensures that the labelling is consistent and accurate. The labelling process is performed manually, but with the help of computer vision techniques, it can also be automated to a certain extent. Once the hand gestures are labelled, they are ready to be used for training and testing the machine learning model.
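A minimal sketch of how such labels can be attached to the collected samples is shown below. The gesture vocabulary, the per-gesture sample count, and the folder layout are hypothetical; in practice the class names come from the Indian Sign Language expert's labelling.

```python
import numpy as np
from tensorflow.keras.utils import to_categorical

# Assumed gesture vocabulary for illustration.
actions = np.array(['hello', 'thanks', 'yes'])
label_map = {label: num for num, label in enumerate(actions)}

sequences, labels = [], []
for action in actions:
    for sample in range(30):  # assumed 30 samples per gesture
        # Hypothetical file layout: one saved key point sequence per sample.
        window = np.load(f'data/{action}/{sample}.npy')
        sequences.append(window)
        labels.append(label_map[action])

X = np.array(sequences)                 # model inputs
y = to_categorical(labels).astype(int)  # one-hot labels for training
```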
The labelled data provides the model with the information it needs to learn the relationship between the hand gestures and the corresponding text, allowing it to recognize and translate the hand gestures into text with high accuracy. With appropriate labelling, the Sign Language to Text Conversion System becomes a powerful tool for helping deaf and hard of hearing people communicate with others.
D. Training and Testing
In the training phase, the model is fed the pre-processed images of hand gestures along with the corresponding text labels. The model uses this information to learn the relationship between the hand gestures and the text, updating its parameters as it processes more data.
The goal of the training phase is to train the model to accurately recognize and translate the hand gestures into text.
In our project on "Conversion of Sign Language to Text," we have implemented a multi-layered LSTM (Long Short-Term Memory) model to convert sign language gestures into textual representations. The model consists of three LSTM layers followed by three Dense layers; the hidden layers use the ReLU activation function and the output layer uses softmax. Each layer contributes to the understanding and interpretation of the sequential nature of sign language.
Stacking multiple LSTM layers allows the model to learn and capture increasingly complex patterns and dependencies present in sign language gestures. Each LSTM layer takes in a sequence of inputs, processes it through its memory cells, and outputs a hidden state that carries information forward to the next layer.
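A minimal Keras sketch of this architecture is given below. The paper specifies only the layer counts and activations, so the layer widths, the 30-frame sequence length, the 1662-value key point vector, and the training settings are assumptions carried over from the earlier sketches.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

num_classes = 3  # number of gesture labels; depends on the dataset

model = Sequential([
    # Three stacked LSTM layers capture increasingly complex temporal patterns.
    LSTM(64, return_sequences=True, activation='relu', input_shape=(30, 1662)),
    LSTM(128, return_sequences=True, activation='relu'),
    LSTM(64, return_sequences=False, activation='relu'),
    # Three Dense layers: two hidden ReLU layers and a softmax output layer.
    Dense(64, activation='relu'),
    Dense(32, activation='relu'),
    Dense(num_classes, activation='softmax'),
])
model.compile(optimizer='Adam', loss='categorical_crossentropy',
              metrics=['categorical_accuracy'])
model.fit(X, y, epochs=200)  # X, y from the labelling sketch above
```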
The conversion of sign language to text involves several steps and technologies working together: a camera, the MediaPipe library, feature extraction of data points, image matching, an RNN algorithm, gesture verification, and gesture classification.
Overall, converting sign language to text requires a combination of computer vision, machine learning, and natural language processing technologies to accurately detect, recognize, and translate sign language gestures into text messages. The developed model was able to detect various hand gestures and signs with an accuracy of 96.66%.
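As an illustration of how these components fit together at inference time, the sketch below keeps a sliding window of the most recent key point frames and classifies it with the trained model. The window length, confidence threshold, and the reuse of `extract_keypoints`, `model`, and `actions` from the earlier sketches are assumptions.

```python
import cv2
import numpy as np
import mediapipe as mp

mp_holistic = mp.solutions.holistic
sequence, threshold = [], 0.8  # assumed confidence threshold

cap = cv2.VideoCapture(0)
with mp_holistic.Holistic(min_detection_confidence=0.5,
                          min_tracking_confidence=0.5) as holistic:
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        results = holistic.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        sequence.append(extract_keypoints(results))  # from the collection sketch
        sequence = sequence[-30:]                    # sliding 30-frame window
        if len(sequence) == 30:
            probs = model.predict(np.expand_dims(sequence, axis=0))[0]
            if probs.max() > threshold:              # gesture verification step
                print(actions[np.argmax(probs)])     # predicted text output
        if cv2.waitKey(10) & 0xFF == ord('q'):
            break
cap.release()
```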
V. ACKNOWLEDGMENT
We are delighted to present the paper on "Conversion of Sign Language to Text." We would like to seize this moment to express our heartfelt gratitude to Prof. Jitendra Musale, our internal guide, for his unwavering assistance and invaluable guidance throughout the project. His support has been instrumental in our progress, and we are truly thankful for his contributions. We would also like to extend our deepest appreciation to Dr. Sunil Thakare, the principal of ABMSP's Anantrao Pawar College of Engineering & Research, for his continuous support and encouragement. Additionally, we are grateful to Prof. Rama Gaikwad, the project head at ABMSP's Anantrao Pawar College of Engineering & Research, for their indispensable guidance, insightful suggestions, and for providing us with the necessary infrastructure to carry out our project effectively.
VI. CONCLUSION
The project focuses on solving a communication problem faced by deaf and hard of hearing people. The system automates the task of recognizing sign language, which is difficult for an untrained person to understand, thereby reducing effort and increasing time efficiency and accuracy. It was developed using various concepts and libraries of image processing together with fundamental properties of images. This paper presented a vision-based system able to interpret hand gestures from sign language and convert them into text. The proposed system was tested in a real-time scenario, where the obtained RNN models proved able to recognize hand gestures. Future work is to keep improving the system and to run experiments with complete language datasets.
REFERENCES
[1] S. M. Mahesh Kumar, "Conversion of Sign Language into Text," International Journal of Applied Engineering Research, ISSN 0973-4562, Volume 13, Number 9, 2018.
[2] Kohsheen Tiku, Jayshree Maloo, Aishwarya Ramesh and Indra R, "Real-time Conversion of Sign Language to Text and Speech," 2020 Second International Conference on Inventive Research in Computing Applications, Coimbatore, India, 2020, pp. 346-351.
[3] C. Uma Bharti, G. Ragavi and K. Karthika, "Signtalk: Sign Language to Text and Speech Conversion," 2021 International Conference on Advancements in Electrical, Electronics, Communication, Computing and Automation (ICAECA), Coimbatore, India, 2021, pp. 1-4, doi: 10.1109/ICAECA52838.2021.9675751.
[4] M. Zamani and H. R. Kanan, "Saliency based alphabet and numbers of American sign language recognition using linear feature extraction," 2014 4th International Conference on Computer and Knowledge Engineering (ICCKE), Mashhad, Iran, 2014, pp. 398-403, doi: 10.1109/ICCKE.2014.6993442.
[5] A. Saxena, D. K. Jain and A. Singhal, "Sign Language Recognition Using Principal Component Analysis," 2014 Fourth International Conference on Communication Systems and Network Technologies, Bhopal, India, 2014, pp. 810-813, doi: 10.1109/CSNT.2014.168.
[6] J. Zhang, W. Zhou, C. Xie, J. Pu and H. Li, "Chinese sign language recognition with adaptive HMM," 2016 IEEE International Conference on Multimedia and Expo (ICME), Seattle, WA, USA, 2016, pp. 1-6, doi: 10.1109/ICME.2016.7552950.
[7] D. Guo, W. Zhou, M. Wang and H. Li, "Sign language recognition based on adaptive HMMs with data augmentation," 2016 IEEE International Conference on Image Processing (ICIP), Phoenix, AZ, USA, 2016, pp. 2876-2880, doi: 10.1109/ICIP.2016.7532885.
[8] K. Grobel and M. Assan, "Isolated sign language recognition using hidden Markov models," 1997 IEEE International Conference on Systems, Man, and Cybernetics. Computational Cybernetics and Simulation, Orlando, FL, USA, 1997, pp. 162-167 vol. 1, doi: 10.1109/ICSMC.1997.625742.
[9] D. Guo, W. Zhou, M. Wang and H. Li, "Sign language recognition based on adaptive HMMs with data augmentation," 2016 IEEE International Conference on Image Processing (ICIP), Phoenix, AZ, USA, 2016, pp. 2876-2880, doi: 10.1109/ICIP.2016.7532885.
[10] D. Van Hieu and S. Nitsuwat, "Image Preprocessing and Trajectory Feature Extraction based on Hidden Markov Models for Sign Language Recognition," 2008 Ninth ACIS International Conference on Software Engineering, Artificial Intelligence, Networking, and Parallel/Distributed Computing, Phuket, Thailand, 2008, pp. 501-506, doi: 10.1109/SNPD.2008.80.
Copyright © 2023 Akash Kamble, Jitendra Musale, Rahul Chalavade, Rahul Dalvi, Shrikar Shriyal. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Paper Id : IJRASET51981
Publish Date : 2023-05-10
ISSN : 2321-9653
Publisher Name : IJRASET