Ijraset Journal For Research in Applied Science and Engineering Technology
Authors: Kuldeep Raghuvanshi, Maan Pratap Singh, Himanshu Yadav, Bhavyajeet Singh, Upinder Kaur
DOI Link: https://doi.org/10.22214/ijraset.2022.42084
Hand gestures are one of the nonverbal modes used in sign language. It is most commonly used by deaf and dumb people, who have hearing or speech problems, to communicate with one another and with hearing people. This work presents a software prototype capable of automatically recognizing sign language, allowing deaf and dumb people to communicate more effectively with each other and with the general public. Dumb people are largely cut off from normal communication with other members of society, and ordinary people find it difficult to understand and communicate with them. Deaf and dumb people must therefore rely on an interpreter or on some form of visual communication. Visual communication is notoriously difficult to learn, and a translator will not always be available. Sign language is the deaf and dumb community's principal mode of communication, but because the average person does not understand the syntax or meaning of many of its gestures, it is mostly used within the families of the deaf and dumb.
I. INTRODUCTION
People with hearing loss may feel alone, which can harm their social and professional lives. It is costly to hire experienced and skilled translators on demand. Furthermore, normal individuals rarely attempt to acquire sign language in order to interact with the deaf and hard of hearing. Deaf people become increasingly isolated as a result of this. The distance between regular people and the deaf community can be lessened if a computer can be programmed to transform sign language into a written format. A correctly functioning Sign Language Recognition (SLR) system can enable a deaf person to communicate with hearing persons without the need for an interpreter. It can be used to generate voice or text, allowing the deaf to become more self-sufficient.
II. OVERVIEW OF INDIAN SIGN LANGUAGE
Natural languages that allow us to express ourselves in a variety of ways in everyday life are known as sign languages. Hearing-impaired people can convert letters, words, and sentences from spoken language into hand gestures and human body motions using sign language.
People need to be able to communicate with one another, and sign language is, in particular, the hearing impaired's only means of doing so. It can therefore serve the deaf and dumb as a substitute for speech. Not only deaf persons, but also hearing parents of deaf children and hearing children of deaf adults, use sign languages.
In terms of phonology, morphology, syntax, and grammar, sign languages are well-structured languages that differ from spoken languages. Several bodily movements are used simultaneously in both spatial and temporal space in sign language. The linguistic features of a sign language differ from those of spoken languages due to the existence of multiple context-altering components, such as the use of facial expressions and head motions in addition to hand gestures.
India is culturally, linguistically, and religiously diverse. Unlike the United States and Europe, India lacks a common sign language. In 1977, Vasishta, Woodward, and Wilson traveled to India, with some funding from the National Science Foundation, to collect signs for linguistic analysis in four major cities (Delhi, Calcutta, Bombay, and Bangalore). According to Vasishta et al. [1], ISL is a separate language with roots in the Indian subcontinent. Between 1977 and 1982, Vasishta et al. [1] published four dictionaries of ISL regional varieties as well as various papers, which were reportedly distributed to programs serving the deaf across India by the All-India Federation of the Deaf. The 26 letters of the English alphabet and the numerals in ISL are shown in Fig. 1.
Indian Sign Language is the primary language of the majority of deaf people in India. Deaf people can use their hands, arms, and faces to communicate their thoughts and ideas. Unlike spoken languages, Indian Sign Language uses gestures to express thoughts rather than words.
III. LITERATURE SURVEY
Tanuj Bohra et al. [2] proposed a real-time two-way sign language communication system based on image processing, deep learning, and computer vision. To improve outcomes, techniques such as hand detection, skin-color segmentation, median blur, and contour detection are applied to the images in the dataset. Trained on a large dataset covering 40 classes, the CNN model was able to correctly predict 17,600 test images in 14 seconds.
A method for detecting Indian Sign Language from live video was proposed by Joyeeta Singha and Karen Das [3]. The system has three stages. Skin filtering and histogram matching form the preprocessing stage. For feature extraction and classification, eigenvalues and eigenvectors, together with an eigenvalue-weighted Euclidean distance, are employed.
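The cited paper does not give code, but the idea can be illustrated with a minimal NumPy sketch: features are projected onto the leading eigenvectors of the training covariance matrix, and each test sample is assigned to the class whose stored projection is closest under an eigenvalue-weighted Euclidean distance. All variable names and the choice of 10 eigenpairs are illustrative assumptions.

```python
import numpy as np

def eigen_features(train_vectors, k=10):
    """Eigenvectors/eigenvalues of the training covariance matrix.
    train_vectors: (n_samples, n_features) flattened, preprocessed hand images."""
    mean = train_vectors.mean(axis=0)
    cov = np.cov(train_vectors - mean, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:k]          # keep the k largest eigenpairs
    return mean, eigvals[order], eigvecs[:, order]

def classify(sample, mean, eigvals, eigvecs, class_projections, labels):
    """Eigenvalue-weighted Euclidean distance to each stored class projection."""
    proj = (sample - mean) @ eigvecs               # project onto the eigenspace
    dists = [np.sqrt(np.sum(eigvals * (proj - c) ** 2)) for c in class_projections]
    return labels[int(np.argmin(dists))]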
Muthu Mariappan H. and Dr. Gomathi V. [4] created a portable real-time sign language recognition system using contour detection and the fuzzy c-means algorithm. Features of the face and of the left and right hands are used for detection. The input data is partitioned into a preset number of clusters using the fuzzy c-means approach. A dataset containing ten signers' videos for a variety of phrases and sentences was used to test the system, which reached a 75 percent accuracy rate.
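As a rough illustration of the clustering step, the following is a minimal NumPy implementation of fuzzy c-means; the fuzzifier m = 2 and the iteration/tolerance settings are assumptions, not values from the cited work.

```python
import numpy as np

def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, tol=1e-5, seed=0):
    """Minimal fuzzy c-means. X: (n_samples, n_features) gesture feature vectors."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)                      # memberships sum to 1
    for _ in range(n_iter):
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]     # weighted centroids
        d = np.linalg.norm(X[:, None, :] - centers[None], axis=2) + 1e-12
        U_new = 1.0 / (d ** (2 / (m - 1)) *
                       np.sum(d ** (-2 / (m - 1)), axis=1, keepdims=True))
        if np.abs(U_new - U).max() < tol:                  # converged
            U = U_new
            break
        U = U_new
    return centers, U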
Inspired by LeNet-5[13], Salma Hayani et al. [5] developed a CNN-based Arab sign language identification system. The dataset contained 7869 pictures of Arabic numerals and letters.
Experiments were conducted with training splits ranging from 50% to 80% of the data; 90 percent accuracy was attained with the 80 percent training split. The authors compared the results with machine learning algorithms such as KNN (k-nearest neighbours) and SVM (support vector machine) to demonstrate the system's performance. The model was designed specifically for image recognition, but it can also be applied to video recognition.
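A KNN/SVM baseline comparison of the kind described above can be sketched with scikit-learn as follows; the feature arrays, hyperparameters, and 80/20 split are assumptions for illustration, not the cited authors' exact setup.

```python
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

def baseline_scores(X, y, train_size=0.8):
    """Compare simple KNN and SVM baselines on flattened sign images X with labels y."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, train_size=train_size, stratify=y, random_state=0)
    scores = {}
    for name, clf in [("knn", KNeighborsClassifier(n_neighbors=5)),
                      ("svm", SVC(kernel="rbf", C=10.0))]:
        clf.fit(X_tr, y_tr)
        scores[name] = accuracy_score(y_te, clf.predict(X_te))
    return scores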
Kshitij Bantupalli and Ying Xie [6] created a video-based American Sign Language recognition system built on CNN, LSTM, and RNN components. The Inception CNN was used to extract spatial features from frames, an LSTM captured longer temporal dependencies, and an RNN handled short-term temporal features. The outputs of the SoftMax layer and the Max-Pooling layer are fed into the RNN architecture to extract temporal information. The dataset includes 100 distinct signs performed by 5 signers, and a maximum accuracy of 93 percent was reached across trials with different sample sizes.
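A CNN-plus-LSTM pipeline of this general shape can be assembled in a few lines of Keras. This is a sketch, not the cited architecture: the frame count, input size, and LSTM width are assumptions, and a frozen ImageNet-pretrained InceptionV3 stands in for the paper's Inception feature extractor.

```python
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import InceptionV3

NUM_FRAMES, NUM_CLASSES = 30, 100            # assumed sequence length / vocabulary size

# Frozen Inception backbone extracts per-frame spatial features.
backbone = InceptionV3(include_top=False, pooling="avg",
                       input_shape=(224, 224, 3), weights="imagenet")
backbone.trainable = False

frames = layers.Input(shape=(NUM_FRAMES, 224, 224, 3))
x = layers.TimeDistributed(backbone)(frames)  # (batch, frames, 2048)
x = layers.LSTM(256)(x)                       # longer temporal dependencies
outputs = layers.Dense(NUM_CLASSES, activation="softmax")(x)

model = models.Model(frames, outputs)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])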
Mahesh Kumar [7] designed a method that can differentiate 26 Indian Sign Language hand gestures using Linear Discriminant Analysis (LDA) [14]. The dataset is preprocessed using techniques such as skin segmentation and morphological operations, with the Otsu algorithm used to segment the skin. Linear discriminant analysis is then used to extract features. During the training phase, each gesture is represented as a column vector and normalized against the average gesture, and the program finds the eigenvectors of the covariance matrix of the normalized gestures. During recognition, the input gesture is likewise normalized against the average gesture and projected into gesture space using the eigenvector matrix, and the Euclidean distance between this projection and all stored projections is calculated.
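For the LDA step, a compact scikit-learn sketch is shown below; the standardization step and the use of sklearn's built-in classifier (rather than the explicit eigen-projection and Euclidean distance described above) are simplifying assumptions.

```python
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def train_lda(X, y):
    """Fit LDA on column-vectorised, skin-segmented gesture images X with
    26 ISL letter labels y; returns a classifier usable via .predict()."""
    clf = make_pipeline(StandardScaler(), LinearDiscriminantAnalysis())
    clf.fit(X, y)
    return clf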
Suharjito et al. [8] employed transfer learning to build a sign language recognition system based on the I3D Inception [16] model. The public LSA64 dataset was used for 10 vocabulary items with 500 videos, divided into 300 videos for training, 100 for validation, and 100 for testing. Although the model's training accuracy is acceptable, its validation accuracy is poor.
Oscar Koller et al. [9] suggested a hybrid CNN-HMM model for sign language recognition. They evaluated it on three datasets: RWTH-PHOENIX-Weather 2012, RWTH-PHOENIX-Weather Multisigner 2014, and SIGNUM single signer, with a 10 to 1 ratio between the training and validation sets. After CNN training is complete, a softmax layer is applied, and its outputs are used as observation probabilities in the HMM.
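The usual coupling in such hybrid models is to turn frame-wise softmax posteriors into scaled likelihoods by dividing by the state priors; the short NumPy sketch below illustrates that conversion under this assumption (it is not the cited authors' code).

```python
import numpy as np

def scaled_likelihoods(cnn_posteriors, state_priors, eps=1e-8):
    """Hybrid CNN-HMM coupling: frame-wise softmax posteriors p(s|x) divided by
    state priors p(s) give scaled likelihoods usable as HMM observation scores.
    cnn_posteriors: (n_frames, n_states), state_priors: (n_states,)."""
    return cnn_posteriors / (state_priors[None, :] + eps)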
A neural network-based static sign gesture recognition system was proposed by Parul and Hardeep [10]. The dataset contained 2524 images divided into 36 categories, and data augmentation was used to bring the total to 17,640 images. After one-hot encoding, the images are converted to CSV format and passed to a ResNet50 network for training. Without data augmentation the model's accuracy is 96.02 percent, while with augmentation it reaches 99.4 percent.
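An augmentation-plus-ResNet50 pipeline of this kind can be sketched in Keras as follows. The input size, augmentation ranges, and use of ImageNet weights are assumptions; input preprocessing is omitted for brevity.

```python
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import ResNet50

NUM_CLASSES = 36   # 26 letters + 10 digits, as in the cited dataset

# Keras preprocessing layers stand in for the paper's data augmentation step.
augment = tf.keras.Sequential([
    layers.RandomRotation(0.1),
    layers.RandomZoom(0.1),
    layers.RandomTranslation(0.1, 0.1),
])

inputs = layers.Input(shape=(224, 224, 3))
x = augment(inputs)
x = ResNet50(include_top=False, pooling="avg", weights="imagenet")(x)
outputs = layers.Dense(NUM_CLASSES, activation="softmax")(x)
model = models.Model(inputs, outputs)

# One-hot encoded labels pair with categorical cross-entropy.
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])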
Muhammad Aminur Rahaman et al. [13] use a novel technique to recognize Bengali hand gestures. The system employs cascaded classifiers to detect the hand in each frame, and records hand regions using the hue and saturation values of the HSI color model. A K-Nearest Neighbours classifier is then used to classify the images.
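A simplified version of this pipeline can be sketched with OpenCV and scikit-learn. HSV thresholding is used here as an approximation of the paper's HSI-based segmentation, and the threshold values and k are illustrative assumptions.

```python
import cv2
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def segment_hand(frame_bgr, lower=(0, 40, 60), upper=(25, 255, 255)):
    """Threshold skin tones in HSV space and return a binary hand mask."""
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, np.array(lower), np.array(upper))
    return cv2.medianBlur(mask, 5)          # remove speckle noise

def train_knn(masks, labels, k=3):
    """Flatten segmented masks and fit a K-Nearest Neighbours classifier."""
    X = np.array([m.flatten() for m in masks], dtype=np.float32)
    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(X, labels)
    return knn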
Fourier descriptors are used by Purva Badhe and colleagues [14] to extract features in a method that translates Indian Sign Language gestures into English. Fourier coefficients representing the boundary points are computed with the Fast Fourier Transform (FFT). Because the extracted data is large, vector quantization is used to compress it, and the result is stored in a codebook. During testing, a code vector is generated from the gesture, compared against the existing codebook, and the gesture is recognized.
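The descriptor-extraction step can be illustrated with OpenCV and NumPy as below; the number of retained coefficients and the normalization choices are assumptions, and the vector-quantization/codebook stage is not shown.

```python
import cv2
import numpy as np

def fourier_descriptors(binary_mask, n_coeffs=32):
    """Fourier descriptors of the largest hand contour: boundary points are
    treated as complex numbers and transformed with the FFT.
    binary_mask: uint8 image with the segmented hand in white."""
    contours, _ = cv2.findContours(binary_mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_NONE)
    boundary = max(contours, key=cv2.contourArea).squeeze()
    z = boundary[:, 0] + 1j * boundary[:, 1]        # x + iy boundary signal
    coeffs = np.fft.fft(z)[1:n_coeffs + 1]          # drop DC term (translation)
    return np.abs(coeffs) / (np.abs(coeffs[0]) + 1e-12)   # scale-normalised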
IV. PROBLEMS IN THE EXISTING SYSTEMS
Sign language and spoken language work in fundamentally different ways. Sign language is built on spatial features and iconicity. Hand elements such as shape, movement, orientation, and location, as well as facial expression and lip motion, are all taken into account when interpreting a sign. These variables occur simultaneously and are articulated in space. Because a single sign can represent an entire sentence of spoken language, a system of grammatical and linguistic rules is necessary.
The ability to track the signer in video with varied background clutter and changing lighting conditions is the major issue encountered by any sign language recognition system.
Sign language itself presents several further issues:
V. PROPOSED SYSTEM
After reviewing the different solutions proposed for this problem, we set out to build a correctly functioning Sign Language Recognition (SLR) system that addresses the communication problems of deaf and dumb people in the most effective manner.
The system is a combination of five phases:
VI. SYSTEM ARCHITECTURE
The TensorFlow framework and the Keras API were used in the implementation. PyQt5 is used to create the entire front end for user convenience. User-friendly messages are shown in response to the user's actions, along with a window indicating which gesture corresponds to which character. Additionally, TTS (Text-To-Speech) support includes an export-to-file module, so that whatever sentence is formed, the user can listen to it, quickly export it, and also see which gestures he or she made while forming the sentence.
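The paper only states that TTS and an export-to-file module exist; as one possible realization, a minimal sketch using the pyttsx3 library (an assumption, not named in the paper) could look like this.

```python
import pyttsx3

def speak_and_export(sentence, out_path="sentence.wav"):
    """Speak the recognised sentence aloud and export it to an audio file."""
    engine = pyttsx3.init()
    engine.say(sentence)                       # read the formed sentence aloud
    engine.save_to_file(sentence, out_path)    # export for later playback
    engine.runAndWait()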
A. TensorFlow
TensorFlow is a well-known open-source software library for numerical computation. The nodes of the computation graph are defined first, and the calculation is then performed inside a session.
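A minimal example of this define-then-run style (the classic TensorFlow 1.x graph/session API, still available through tf.compat.v1 in TensorFlow 2):

```python
import tensorflow as tf

tf.compat.v1.disable_eager_execution()          # use graph mode explicitly
a = tf.compat.v1.placeholder(tf.float32, name="a")
b = tf.compat.v1.placeholder(tf.float32, name="b")
c = a * b                                       # node in the computation graph

with tf.compat.v1.Session() as sess:            # calculation happens in a session
    print(sess.run(c, feed_dict={a: 3.0, b: 4.0}))   # prints 12.0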
B. Keras
Keras is a high-level neural network API for Python that wraps TensorFlow. It comes in handy when we need to quickly develop and test a neural network with only a few lines of code. It provides implementations of common neural network building blocks such as layers, objectives, activation functions, and optimizers, along with tools for working with image and text data.
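As an example of the brevity described above, a small Keras CNN for static gesture classification can be defined in a handful of lines. The input size and class count here are assumptions for illustration, not the exact model used in this work.

```python
from tensorflow.keras import layers, models

NUM_CLASSES = 26    # assumed: one class per letter

model = models.Sequential([
    layers.Input(shape=(64, 64, 1)),            # grayscale gesture image
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])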
The basic system architecture shows how the application functions and which features are available for use.
With this project/application we have tried to reduce some of the major communication issues faced by handicapped persons. We identified what prevents them from fully expressing themselves: the audience on the other side cannot comprehend what these people are trying to communicate or the message they intend to convey. Consequently, everyone interested in learning and communicating in sign languages will benefit from this application. With this app a user can quickly learn different gestures and their meanings according to ASL standards, and can easily determine which letter corresponds to which gesture. In addition to sentence building, there is an add-on to this gesture capability: if users understand a gesture's action, they do not need to be literate; they can simply form the gesture and the assigned character will appear on the screen.
REFERENCES
[1] U. Zeshan, M. Vasishta, and M. Sethna, "Implementation of Indian Sign Language in Educational Settings," Asia Pacific Disability Rehabilitation Journal, vol. 16, no. 1, 2005. Available: https://pure.mpg.de/rest/items/item_58748_2/component/file_58749
[2] T. Bohra, S. Sompura, K. Parekh, and P. Raut, "Real-Time Two Way Communication System for Speech and Hearing Impaired Using Computer Vision and Deep Learning," 2019 International Conference on Smart Systems and Inventive Technology (ICSSIT), Tirunelveli, India, 2019, pp. 734-739.
[3] J. Singha and K. Das, "Recognition of Indian Sign Language in Live Video," Jun. 2013. https://www.researchgate.net/publication/237054175_Recognition_of_Indian_Sign_Language_in_Live_Video
[4] H. Muthu Mariappan and V. Gomathi, "Real-Time Recognition of Indian Sign Language," 2019. https://www.semanticscholar.org/paper/Real-Time-Recognition-of-Indian-Sign-Language-Mariappan-Gomathi/bfd77f11debe0f0d401eb34d013328e039f92f3c
[5] S. Hayani, M. Benaddy, O. El Meslouhi, and M. Kardouchi, "Arab Sign Language Recognition with Convolutional Neural Networks," 2019 International Conference of Computer Science and Renewable Energies (ICCSRE), Jul. 2019, doi: 10.1109/iccsre.2019.8807586.
[6] K. Bantupalli and Y. Xie, "American Sign Language Recognition using Deep Learning and Computer Vision," 2018. https://www.semanticscholar.org/paper/American-Sign-Language-Recognition-using-Deep-and-Bantupalli-Xie/2b0c7196868365fdbeea1742e1731ac43d8a3d6b
[7] M. Kumar, "Conversion of Sign Language into Text," International Journal of Applied Engineering Research, vol. 13, no. 9, pp. 7154-7161, 2018. Available: https://www.ripublication.com/ijaer18/ijaerv13n9_90.pdf
[8] Suharjito, H. Gunawan, N. Thiracitta, and A. Nugroho, "Sign Language Recognition Using Modified Convolutional Neural Network Model," 2018 Indonesian Association for Pattern Recognition International Conference (INAPR), Sep. 2018, doi: 10.1109/inapr.2018.8627014.
[9] O. Koller, S. Zargaran, H. Ney, and R. Bowden, "Deep Sign: Hybrid CNN-HMM for Continuous Sign Language Recognition," Sep. 2016. https://www.researchgate.net/publication/306321553_Deep_Sign_Hybrid_CNN-HMM_for_Continuous_Sign_Language_Recognition
[10] P. Chaudhary and H. Singh, "Neural Network Based Static Sign Gesture Recognition System," International Journal of Innovative Research in Computer and Communication Engineering. Available: https://www.rroij.com/open-access/neural-network-based-static-sign-gesturerecognition-system.pdf
[11] G. Rao, K. Syamala, P. Kishore, and A. Sastry, "Deep Convolutional Neural Networks for Sign Language Recognition," 2018. https://www.semanticscholar.org/paper/Deep-convolutional-neural-networks-for-sign-Rao-Syamala/01aeb61dce9873344fd6353d8c8561b021e011cd
[12] A. Das, S. Gawde, K. Suratwala, and D. Kalbande, "Sign Language Recognition Using Deep Learning on Custom Processed Static Gesture Images," 2018. https://www.semanticscholar.org/paper/Sign-Language-Recognition-Using-Deep-Learning-on-Dasl-Gawde/e510d2312e79bdfabc5c26402022b57286e7a7e9
[13] M. A. Rahaman, M. Jasim, Md. H. Ali, and Md. Hasanuzzaman, "Real-Time Computer Vision-Based Bengali Sign Language Recognition," 2014 17th International Conference on Computer and Information Technology (ICCIT), Dec. 2014, doi: 10.1109/iccitechn.2014.7073150.
[14] P. C. Badhe and V. Kulkarni, "Indian Sign Language Translator Using Gesture Recognition Algorithm," 2015 IEEE International Conference on Computer Graphics, Vision and Information Security (CGVIS), Nov. 2015, doi: 10.1109/cgvis.2015.7449921.
Copyright © 2022 Kuldeep Raghuvanshi, Maan Pratap Singh, Himanshu Yadav, Bhavyajeet Singh, Upinder Kaur. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Paper Id : IJRASET42084
Publish Date : 2022-04-30
ISSN : 2321-9653
Publisher Name : IJRASET