International Journal for Research in Applied Science and Engineering Technology (IJRASET)
Authors: N Sarika, C. Ravi Varma, K. Gayatri, K. Sushma, M. Varun
DOI Link: https://doi.org/10.22214/ijraset.2023.50686
Some people suffer from paralysis, have just undergone surgery, or live with conditions such as locked-in syndrome or quadriplegia, and may be unable to speak or move their body. People in these circumstances find it difficult to communicate, which can affect their health. A system that can inform the caretaker of their basic needs is therefore necessary. We propose a method by which the patient can readily communicate his or her needs through eye gestures. Applying the idea of Morse code, these movements are linked to a letter predefined for each purpose; the letter is then looked up in the dataset to identify the need and raise a visual and auditory alert for the nurse or caretaker.
I. INTRODUCTION
Communication is a major problem for patients, especially those who are paralyzed or have just undergone a surgical operation. Patients in such situations cannot control their muscle movements, making the eyes their only means of communication. Communication technology has advanced dramatically since the 19th century, yet a gap has remained in developing communication aids for such people. Nearly 15 million people worldwide have speech difficulties caused by brain injury, paralysis, and many other illnesses. Several existing systems, such as those that recognize mouth movements and finger gestures, have been implemented as in [4], but they are impractical for patients who cannot move their body parts. Hence, the eyes can be the only way for such patients to communicate their needs. Morse code is one of the earliest forms of communication and is still in use today. It can be transmitted visually using reflections or flashlights, but also covertly by tapping one's fingers or even blinking one's eyes: a short blink represents a dot (".") and a long blink represents a dash ("-"). The acquired Morse code can then be converted into a language that people understand, streamlining the communication process, as sketched below. A workable system implementing this concept therefore needs to be built.
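To make the decoding step concrete, here is a minimal Python sketch of the idea, assuming the blinks have already been classified as dots and dashes; the Morse table is standard International Morse Code, while the needs dictionary is a hypothetical illustration, not taken from any of the surveyed papers.

```python
# Minimal sketch: a predefined Morse table maps blink-derived dot/dash
# sequences to letters, which are then matched against a list of needs.
MORSE_TO_CHAR = {
    ".-": "A", "-...": "B", "-.-.": "C", "-..": "D", ".": "E",
    "..-.": "F", "--.": "G", "....": "H", "..": "I", ".---": "J",
    "-.-": "K", ".-..": "L", "--": "M", "-.": "N", "---": "O",
    ".--.": "P", "--.-": "Q", ".-.": "R", "...": "S", "-": "T",
    "..-": "U", "...-": "V", ".--": "W", "-..-": "X", "-.--": "Y",
    "--..": "Z",
}

# Hypothetical mapping from a decoded letter to a predefined need.
NEEDS = {"W": "Water", "F": "Food", "N": "Call the nurse"}

def decode_blinks(symbols):
    """Translate one dot/dash sequence (e.g. '.--') into a letter."""
    return MORSE_TO_CHAR.get(symbols, "?")

if __name__ == "__main__":
    letter = decode_blinks(".--")  # short, long, long -> 'W'
    print(letter, "->", NEEDS.get(letter, "unknown need"))
```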
II. RELATED WORK
[1] describes an innovative method for generating Morse code that can be translated into English. The Morse code is produced with the aid of the eyes: when a user blinks, an output of dashes and dots is generated. Eye blink detection relies on facial landmark detection with the OpenCV and dlib packages. In India, about 21 million people have some form of disability; with this system, they can communicate with the world more gracefully and instantly, since anyone proficient in human language can readily understand the user.
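A minimal sketch of this landmark pipeline is shown below, assuming the standard 68-point model file shape_predictor_68_face_landmarks.dat is available locally; the eye landmark indices 36-47 are a property of that model, not a detail given in [1].

```python
# Sketch of the OpenCV + dlib landmark pipeline: grab one webcam frame,
# detect the face, then read out the landmarks that outline the eyes.
import cv2
import dlib

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

cap = cv2.VideoCapture(0)  # default webcam
ret, frame = cap.read()
cap.release()

if ret:
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    for face in detector(gray):
        shape = predictor(gray, face)
        # In the 68-point model, indices 36-41 and 42-47 outline the two eyes.
        eye_a = [(shape.part(i).x, shape.part(i).y) for i in range(36, 42)]
        eye_b = [(shape.part(i).x, shape.part(i).y) for i in range(42, 48)]
        print(eye_a, eye_b)
```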
The authors of [2] presented a text entry interface built on visual text entry using computer vision and eye gestures. A webcam recognizes basic eye movements, which are translated into Morse code "dots" and "dashes." Since modern computers are generally equipped with webcams, the authors use one for their project: the camera's live feed is read, and the observed blinks determine what the user is trying to say. The interface recognizes intentional eye blinks and interprets them as instructions to select text. This solution spares users from devices that require additional, pricey hardware.
The authors of [3] utilized a live stream or recorded video of a person blinking their eyelids in a specific order; a Morse code translator then converts the Morse code into text and speech in whatever language the user chooses. The Morse code input is processed using OpenCV, while MediaPipe, a Google API, recognizes faces and maps facial landmarks. These landmarks yield the eye coordinates, from which eye aspect ratios are computed to determine when the eyes blink. A Morse code dictionary loaded beforehand is matched against the code sent via video or camera to find the letter assigned to it.
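The eye aspect ratio itself can be illustrated with a short, self-contained sketch; the 0.2 blink threshold below is an assumed value for illustration, not one reported in [3].

```python
# Sketch of the eye-aspect-ratio (EAR) idea: six landmark points per eye
# (p1..p6) yield a ratio that drops sharply when the eye closes.
import math

def eye_aspect_ratio(p):
    """p is a list of six (x, y) eye landmarks ordered p1..p6."""
    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])
    vertical = dist(p[1], p[5]) + dist(p[2], p[4])   # two vertical spans
    horizontal = dist(p[0], p[3])                    # eye width
    return vertical / (2.0 * horizontal)

EAR_THRESHOLD = 0.2  # assumed value; eyes are treated as closed below it

open_eye = [(0, 3), (2, 5), (4, 5), (6, 3), (4, 1), (2, 1)]
print(eye_aspect_ratio(open_eye) < EAR_THRESHOLD)  # False: the eye is open
```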
In [4], the authors developed a communication model that lets the patient communicate by blinking their eyes in Morse code, thus addressing the communication problem for paralyzed patients. They created a computer vision program that recognizes an eye blink in a video stream and measures its length; based on that length, a module determines whether the blink is a dash or a dot. The dot-and-dash sequence is saved in an array and, after decoding, converted into regular text, which the pyttsx Python module then translates into audio.
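A minimal sketch of this decoding-and-speech step follows, assuming blink durations (in seconds) have already been measured; the 0.4 s dot/dash boundary is an illustrative assumption, and pyttsx3 (the maintained successor of the pyttsx module the paper names) provides the speech output.

```python
# Sketch: classify measured blink durations as dots/dashes, decode the
# resulting Morse sequences, then speak the text aloud.
import pyttsx3

MORSE_TO_CHAR = {"...": "S", "---": "O"}  # excerpt of the full table

def durations_to_symbols(durations, dot_max=0.4):
    """Classify each blink as a dot (short) or dash (long)."""
    return "".join("." if d <= dot_max else "-" for d in durations)

blinks = [[0.2, 0.3, 0.2], [0.9, 1.1, 0.8], [0.2, 0.3, 0.2]]  # S O S
text = "".join(MORSE_TO_CHAR.get(durations_to_symbols(b), "?") for b in blinks)

engine = pyttsx3.init()
engine.say(text)      # speaks "SOS" aloud
engine.runAndWait()
```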
The authors of [5] address illnesses, including Amyotrophic Lateral Sclerosis and locked-in syndrome, that produce paralysis or motor speech disorders and can leave a person unable to communicate. AAC (Augmentative and Alternative Communication) devices come to their aid, yet most people find them pricey and inaccessible. The authors therefore propose free software that uses open-source computer vision to interpret eye blinks, translating messages sent in Morse code dots and dashes into comprehensible English, as a replacement for AAC devices.
A low-cost wearable device called Morse Glasses, introduced in [6], uses IoT technology and a modified form of Morse code to measure a patient's eye blinks and translate them into generated speech. Any Android-compatible smartphone with the Morse Glasses mobile application installed can display and speak sequences of Morse-encoded letters and sentences, in addition to the most frequently used ones. For less than $30, patients with motor neuron diseases such as Amyotrophic Lateral Sclerosis (ALS) can effortlessly speak with others, express their needs, and live a normal life.
The goal of [7] is to make it easier for those who cannot speak or perform motor tasks to engage with others. It proposes a system based on the Histogram of Oriented Gradients (HOG) and a Support Vector Machine (SVM) that predicts a lexicon entry by automatically identifying eye blinks in real time. Voluntary long blinks switch from a counter to a predictive table, while voluntary short blinks halt the counter and select the lexicon entry. The system includes an auxiliary input that lets individuals engage with others with the aid of a gadget. It can calibrate itself as long as the user is close by in the camera's field of view; it needs no prior manual calibration, special lighting, or previous face detection. With 74% accuracy, the proposed interface simplifies word selection through blinking.
The authors in [8] used eye blinks to transmit Morse code. First, they open the webcam for input using OpenCV, then locate the face with dlib's get_frontal_face_detector(). The dlib shape predictor is then used to identify the eye region; this library locates many facial landmarks, such as the tip of the nose, the edges of an eye or mouth, and the corners of an ear. The time limits for detecting dashes and dots are set at 15 and 30 seconds, respectively: to produce a dash, one blinks and keeps the eyes closed for 15 seconds, and to produce a dot, for 30 seconds. The user presses the "q" key to close the webcam after providing input. The dots and dashes are kept in a list that is passed to their tree-based machine learning models for character prediction, and the predicted character is shown on the screen. The Flask framework serves as the user interface for this entire process, bridging the machine learning model and the HTML/CSS front end where the user provides input: the input is sent to a server hosting the model, which fetches the output and displays it on screen.
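A minimal sketch of such a Flask bridge is shown below; the /predict route and the dictionary lookup standing in for the paper's tree-based models are both illustrative assumptions rather than details from [8].

```python
# Sketch: the front end POSTs the collected dot/dash sequence and the
# server returns the predicted character as JSON.
from flask import Flask, jsonify, request

app = Flask(__name__)
MORSE_TO_CHAR = {".-": "A", "-...": "B", "-.-.": "C"}  # excerpt

@app.route("/predict", methods=["POST"])
def predict():
    symbols = request.get_json().get("symbols", "")
    # A trained model would be queried here; a table lookup stands in.
    return jsonify({"character": MORSE_TO_CHAR.get(symbols, "?")})

if __name__ == "__main__":
    app.run()  # the HTML/CSS front end would POST to /predict
```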
In [9], the authors suggest a method that allows paralysis sufferers to communicate using Morse-encoded eye blinks readable by any webcam-equipped device. It offers a unique, practical, and cheap means of communicating the whole English language, with a lower learning curve than other approaches. The detected blink pattern is transformed into text for human comprehension. Results for the deep learning solution on a typical dataset are reported and compared with current AAC (Augmentative and Alternative Communication) devices and conventional blink detection approaches.
The major goal of [10] is to create a communication system that paralyzed persons can use to interact with others. The project generates Morse code from people's eyes: an eye-tracking gadget monitors eye movements such as blinking to produce the Morse code, which an Arduino microcontroller then translates into regular text. The tool enables eye-only communication with other people. Other programs can be combined with the eye-tracking equipment to translate text or symbols into speech; one option offers more than 11,000 pre-programmed symbols and photographs, which can also be used to build new symbols. Text-to-voice software may be used by people who cannot talk but have a greater range of motion.
The general structure of Zhongxu Hu's method in [11] begins with the acquisition of synchronized front-view images and gaze data from the related sensors. Salient temporal and spatial features are extracted from the scene images, and semantic data is also retrieved for augmentation, while the gaze direction is converted into a probability map the same size as the scene image. These feature maps are normalized and fed into the proposed multi-resolution neural network to estimate the attention area. To obtain accurate ground truth, the study builds a 3D virtual scene based on the HTC VIVE PRO EYE, a virtual reality device that can track the gaze.
Manasvi Kotian [12] built a system for covert communication based on face and eye detection, created on OpenCV with Python 3.6 and the dlib module. Using the shape predictor file of the dlib module, the system detects facial and eye features; these points are used to identify blinks and produce the output in Morse code. The system recognizes short and long blinks, compares them with the stored Morse code data, and outputs the corresponding information. Increasing the number of points in the shape predictor would boost the system's accuracy.
[13] introduces a high-quality open-source text-to-speech (TTS) synthesis dataset for Mongolian, a low-resource language spoken by more than 10 million people globally. The dataset, known as MnTTS, consists of 8 hours of transcribed audio recordings made by a 22-year-old professional Mongolian announcer. To demonstrate the dataset's reliability, the authors developed a strong non-autoregressive baseline system based on the FastSpeech2 model and the HiFi-GAN vocoder and evaluated it with the mean opinion score (MOS) and real-time factor (RTF) metrics. The baseline system trained on the dataset achieves an MOS above 4 and an RTF of about 3.30 × 10⁻¹, making it suitable for practical use.
The research study in [14] proposes a real-time approach for eye blink recognition and voice conversion based on video and image processing methods. The Haar cascade classifier is used for face and eye recognition, obtaining information about the eyes and the facial axes, and the same classifier determines the eye's position relative to the face's axis using Haar-like features; the position of the detected face thus drives an efficient eye detection method. Finally, an eye blink detection method based on eyelid movement (open or closed) is developed and used to operate mobile phones. The study produced a relatively inexpensive device that converts eye blinks to audio messages more accurately than existing systems; the detected blinks can drive applications such as basic utilities, S.O.S., and health aid. Test results show the proposed system provides an overall accuracy of 98% and a detection accuracy of 98% at a distance of 35 cm.
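A sketch of such a face-then-eye Haar cascade pipeline is shown below, using the classifier files that ship with OpenCV; restricting the eye search to the detected face region mirrors the paper's face-location-based eye detection idea.

```python
# Sketch: detect faces with a Haar cascade, then search for eyes only
# inside each detected face region.
import cv2

face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
eye_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_eye.xml")

cap = cv2.VideoCapture(0)
ret, frame = cap.read()
cap.release()

if ret:
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    for (x, y, w, h) in face_cascade.detectMultiScale(gray, 1.3, 5):
        roi = gray[y:y + h, x:x + w]  # restrict the eye search to the face
        eyes = eye_cascade.detectMultiScale(roi)
        print(f"face at ({x},{y}), {len(eyes)} eye(s) detected")
```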
Stefan Treue and colleagues in [15] unveiled a deep learning-based strategy that makes use of video frames from inexpensive webcams. They extracted facial cues essential to gaze placement using DeepLabCut (DLC), an open-source toolset for collecting points of interest from videos, then used a shallow neural network to predict the point of gaze on a computer screen. The design achieved a median error of around one degree of visual angle when tested at three extreme positions. The findings lay the groundwork for further research into deep learning approaches to eye tracking by scientists studying psychophysics or neuromarketing.
In [16], the authors created a computer vision-based method that automatically identifies the letters sent, allowing one to interact with a machine or another person using eye gestures that encode Morse code. The method uses a standard camera to recognize the eye movements that are translated into the "dots" and "dashes" representing Morse-encoded words. Blink and pupil detectors based on image processing techniques are used: the blink detector measures blinks and their durations, where a blink lasting two to four seconds is read as a dot and one lasting longer than four seconds as a dash; the pupil detector tracks pupil movement, where a movement to the person's right marks the next letter and a movement to the left marks the next word. In this way, a covert connection between a person and an automatic system is created by decoding the Morse code transmitted via the eyes. Experimental results on an unconstrained visual scene with some initial greeting words show the potential of automatic eye tracking for non-verbal communication.
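The timing rules reported in [16] can be captured in a few lines; the duration thresholds come from the paper's description, while treating sub-two-second blinks as involuntary is an assumption made here for illustration.

```python
# Sketch of the timing rules: blinks of 2-4 s read as dots, longer ones
# as dashes; shorter blinks are ignored as involuntary.
def classify_blink(duration_s):
    if 2.0 <= duration_s <= 4.0:
        return "."   # dot
    if duration_s > 4.0:
        return "-"   # dash
    return None      # too short: assumed involuntary and ignored

print([classify_blink(d) for d in (0.3, 2.5, 5.0)])  # [None, '.', '-']
```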
III. CONCLUSION
The goal of this review was to better understand the patterns in eye gesture detection and Morse code generation. We reviewed papers from previous years to see how the field has changed and continues to change. The objective of this research is to detect eye gestures, generate Morse code from them, and translate it into human-readable language.
REFERENCES
[1] Vijay Jumb, Charles Nalka, Hasan Hussain, Ricky Mathews: "Morse Code Detection Using Eye Blinks", IJTRET, February 2021.
[2] Siddarth, Sanjana, Kavya, Karthik Honwadkar, Puneeth: "Vision Based Text Entry Using Morse code", IRJMETS, July 2022.
[3] Kavitha Reddy Guda, Sainath Cheparthy, Srikar Gangipally, Pranay Goud Iruvuri: "Morse Code Translator using Eye Blinks", IJRASET, June 2022.
[4] S. N. Deshpande, V. A. Deshmukh, G. D. Arjun, H. R. Goskonda, A. R. Butala, D. S. Datar: "Human Computer Interaction through Morse Code", IJRES, July 2021.
[5] Dr. Kranthi Kumar, V. Sai Srikar, Y. Swapnika, V. Sai Sravani, N. Aditya: "A Novel Approach for Morse Code Detection from Eye Blinks and Decoding using OpenCV", IJRASET, May 2020.
[6] Navera Tarek, Mariam Abo Mandour, Nada El-Madah, Reem Ali, Sara Yahia, Bassant Mohamed, Dina Mostafa, Sara El-Metwally: "Morse Glasses: an IoT Communication System Based on Morse Code for Users with Speech Impairments", Springer, June 2021.
[7] Gopal Chaudhary, Puneet Singh Lamba, Harman Singh Jolly, Sakaar Poply, Manju Khari, Elena Verdu: "Predictive Text Analysis Using Eye Blinks", ScienceDirect, December 2021.
[8] G Sumanth Naga Deepak, B Rohit, Ch Akhil, D Sai Surya Chandra Bharath, Kolla Bhanu Prakash: "An Approach for Morse Code Translation from Eye Blinks Using Tree Based Machine Learning Algorithms and OpenCV", ICASSCT, 2021.
[9] Srivatsan Sridharan, Nirmal, Sachin, Soundar, Maheswari: "Assistive Technology to Communicate Through Eye Blinks: A Deep Learning Approach", IJCDS, February 2022.
[10] Mr. G. Chandrashekar, Mohim Munnai, Sanjairamanan, Shanoj: "Morse Code to Text Converter for Paralyzed People", IJARSCT, December 2021.
[11] Zhongxu Hu, Chen Lv, Peng Hang, Chao Huang, Yang Xing: "Data-Driven Estimation of Driver Attention Using Calibration-Free Eye Gaze and Scene Features", IEEE, February 2022.
[12] Manasvi Kotian: "Encrypted Communication using Face Detection and Eye Tracking using Morse Code", IJASRM, March 2022.
[13] Yifan Hu, Pengkai Yin, Rui Liu: "MnTTS: An Open-Source Mongolian Text-to-Speech Synthesis Dataset and Accompanied Baseline", September 2021.
[14] SivaKumar, Ramkumar, Sridhar, Yamuna, Shashi: "Efficient Eye Blink Communication Assistance for Paralyzed Patients", 2021, DOI: 10.47750/cibg.2021.27.03.208.
[15] Niklas Zdarsky, Stefan Treue, Moein Esghaei: "A Deep Learning-Based Approach to Video-Based Eye Tracking for Human Psychophysics", NIH, July 2021.
[16] Krishna Kanth Medichalam, Yaswanth Kumar Vanukuri, V. Vijayarajan, Surya Prasath: "Automatic Morse Code based Communication Recognition with Eye Tracking - A Preliminary Feasibility Study", 2021.
Copyright © 2023 N Sarika, C. Ravi Varma, K. Gayatri, K. Sushma, M. Varun. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Paper Id : IJRASET50686
Publish Date : 2023-04-20
ISSN : 2321-9653
Publisher Name : IJRASET