IJRASET Journal for Research in Applied Science and Engineering Technology
Authors: Raji S
DOI Link: https://doi.org/10.22214/ijraset.2023.52711
Facial paralysis refers to an inability to move the facial muscles on one or both sides of the face. It occurs when the nerve regulating the facial muscles, or the area of the brain that controls this nerve, malfunctions. Some of the medical conditions that cause facial paralysis are transient and treatable, while others can be deadly. Measuring the unevenness of prominent facial features, including the eyes, nose, and mouth, is vital when diagnosing facial paralysis. Traditionally, facial paralysis is determined solely through the physician's judgment, based on visual evaluation alone. Medical professionals manually measure specific locations on the two sides of the patient's face to determine the extent of facial paralysis before advising appropriate therapy, but such methods are highly prone to errors. Assessing the severity level is crucial, since it is used to choose the most effective and appropriate medical care. A quantitative measure is therefore required to assist in medical diagnosis. In this paper, a number of automatic facial paralysis prediction methods are discussed.
I. INTRODUCTION
Muscles are activated by nerve signals transmitted by the brain. This is a normal process that nobody notices; paralysis can occur when it is interrupted. Facial paralysis occurs when something affects the facial nerves [1]. It is a medical emergency, because stroke is one of its possible causes. A physician can determine whether someone has facial paralysis by looking at the unevenness between the two sides of the face. The patient is directed to make different facial expressions, and the physician assigns a score to each of these clinical observations to determine the severity of facial paralysis. This kind of judgment is undesirable in medical practice since it is subjective. As a result, assessing facial paralysis quantitatively is desirable [2]: a quantitative measure is needed to support medical diagnosis, and the severity of FP needs to be classified and quantified in order to assess the condition. Face detection and computer vision methods for facial paralysis detection have been shown to be fast, accurate, and easy to use. This article discusses various facial paralysis detection methods based on computer vision, image processing, and deep learning. Section II reviews the different detection methods, Section III presents a comparison of these approaches, and the conclusion of the review follows.
II. LITERATURE REVIEW
A technique to effectively extract facial key points and identify iris or sclera margins for facial paralysis classification has been proposed in [3]. It is based on an ensemble of regression trees. The image dataset contains patients' face images captured while they were directed to make certain facial expressions. After dimension alignment and image scaling, images are pre-processed to improve brightness and eliminate unwanted interferences. Variations in texture and shape that produce large spatial variations are used to classify the deformity of a person's face. The face detector combines Histogram of Oriented Gradients (HOG) features with a linear classification algorithm, an image pyramid, and sliding-window detection. Here, the face's symmetry is quantified by ratios computed from the iris area and the distance between corresponding locations on the two sides of the face. Two steps are involved in classifying the type of facial paralysis: differentiating healthy samples from ill samples, and correctly classifying the facial palsies.
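As a rough illustration of this kind of pipeline, the sketch below uses dlib, whose frontal face detector is a HOG-plus-linear-SVM sliding-window detector and whose shape predictor is an ensemble of regression trees. The landmark indices and the symmetry ratio are illustrative assumptions, not the exact measurements used in [3].

```python
# Minimal sketch, assuming a dlib-style HOG face detector and an
# ensemble-of-regression-trees landmark predictor (68-point model).
import dlib
import numpy as np

detector = dlib.get_frontal_face_detector()          # HOG + linear classifier, sliding window
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

def symmetry_ratio(gray_image):
    """Return a crude left/right symmetry ratio in [0, 1]; 1.0 = perfectly symmetric."""
    faces = detector(gray_image, 1)                   # 1 = one level of image-pyramid upsampling
    if not faces:
        return None
    shape = predictor(gray_image, faces[0])
    pts = np.array([[p.x, p.y] for p in shape.parts()])
    nose = pts[30]                                    # nose tip as a midline reference (illustrative)
    left_mouth, right_mouth = pts[48], pts[54]        # mouth corners in the 68-point scheme
    d_left = np.linalg.norm(left_mouth - nose)
    d_right = np.linalg.norm(right_mouth - nose)
    return min(d_left, d_right) / max(d_left, d_right)
```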
A deep learning-based model for facial paralysis detection has been implemented in [4]. This method incorporates Inception-v3 and DeepID networks for detection. To remove the impact of external factors and make the face images compliant with the IDFNP-CNN architecture, the face images are cropped and resized to 299 × 299 × 3 pixels. IDFNP concatenates the parameters of the two sections using a concatenation layer in addition to the core elements of Inception-v3 and DeepID, and a softmax layer then completes the identification task. Coupling a face-detection CNN such as DeepID with an image-classifying CNN such as Inception-v3 increases accuracy on the FNP dataset and matches the diagnostic precision of a physician.
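A minimal PyTorch sketch of this two-branch idea is shown below: features from an Inception-v3 backbone and a small DeepID-style branch are concatenated and classified. The layer sizes and the simplified DeepID branch are assumptions for illustration, not the exact IDFNP-CNN configuration.

```python
# Sketch of a two-branch network in the spirit of [4]; dimensions are illustrative.
import torch
import torch.nn as nn
from torchvision import models

class TwoBranchFNPNet(nn.Module):
    def __init__(self, num_classes=2):
        super().__init__()
        inception = models.inception_v3(weights=None, aux_logits=True)
        inception.fc = nn.Identity()                 # keep the 2048-d pooled features
        self.inception = inception
        self.deepid = nn.Sequential(                 # simplified DeepID-like branch
            nn.Conv2d(3, 20, 4), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(20, 40, 3), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
            nn.Flatten(), nn.Linear(40, 160),
        )
        self.classifier = nn.Linear(2048 + 160, num_classes)

    def forward(self, x):                            # x: (N, 3, 299, 299)
        f1 = self.inception(x)
        if isinstance(f1, tuple):                    # training mode returns (main, aux) outputs
            f1 = f1[0]
        f2 = self.deepid(x)
        return self.classifier(torch.cat([f1, f2], dim=1))  # softmax is applied in the loss
```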
For the assessment of facial paralysis, a parallel hierarchy convolutional neural network (PHCNN) combined with a Long Short-Term Memory (LSTM) network architecture has been presented in [5]. This technique makes use of the temporal fluctuation of the image series and region-based asymmetric facial features. The approach reduces the influence of age-related features by automatically learning ROI properties, including low-level contours and shape properties. The framework was trained on discontinuous multi-frame images of people showing various emotions in order to learn the spatial characteristics of nerve weakness. The dataset for the system is taken from the YFP database. Important visual cues for the detector were chosen using the AdaBoost method. The spatial characteristics of face parts, such as their shape, profile, location, and the distances between conspicuous points, are extracted in order to identify facial deformation. The ground truth for training is the assessment score given by a physician according to the House-Brackmann scale (HBS).
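The sketch below shows, in PyTorch, the generic CNN-plus-LSTM pattern this method builds on: a small CNN encodes each frame (or region crop) and an LSTM models the temporal variation across the sequence. The channel and hidden sizes are illustrative, not those of the published PHCNN.

```python
# Minimal CNN + LSTM sketch for grading a sequence of face frames; sizes are illustrative.
import torch
import torch.nn as nn

class FrameSequenceGrader(nn.Module):
    def __init__(self, num_grades=6):                 # e.g. House-Brackmann grades I-VI
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),                              # 32-d feature per frame
        )
        self.lstm = nn.LSTM(input_size=32, hidden_size=64, batch_first=True)
        self.head = nn.Linear(64, num_grades)

    def forward(self, frames):                         # frames: (N, T, 3, H, W)
        n, t = frames.shape[:2]
        feats = self.cnn(frames.flatten(0, 1)).view(n, t, -1)
        _, (h, _) = self.lstm(feats)                   # final hidden state summarizes the sequence
        return self.head(h[-1])
```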
A facial palsy diagnosis technique centered on Active Appearance Models (AAM) and a three-dimensional reconstruction of depth and RGB data from the Kinect 360 has been presented in [6]. The system consists of three steps: face detection, feature extraction, and face recognition. The facial images are recorded using the Kinect, which generates two sorts of data: a depth image and an RGB image. K-means clustering is used for the recognition task. The face detection step creates a face mask using the AAM model. The mask has 121 points, each represented by a 3D coordinate with x, y, and z components. From these coordinates, the distances between points of interest are computed along the Y-axis and the X-axis exclusively. Clustering of the unstructured data was chosen for the analysis in this system.
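A small sketch of the measurement and clustering steps is given below: axis-aligned distances between left/right point pairs of a 121-point mask are collected into a feature vector and clustered with k-means. The point indices, pairings, and the synthetic stand-in data are assumptions for illustration only.

```python
# Sketch: axis-aligned distances from 3D mask points, clustered with k-means.
import numpy as np
from sklearn.cluster import KMeans

def axis_distances(mask_points, pairs):
    """mask_points: (121, 3) array of x, y, z; pairs: list of (left_idx, right_idx)."""
    feats = []
    for l, r in pairs:
        feats.append(abs(mask_points[l, 0] - mask_points[r, 0]))  # X-axis distance
        feats.append(abs(mask_points[l, 1] - mask_points[r, 1]))  # Y-axis distance
    return np.array(feats)

PAIRS = [(36, 45), (48, 54), (17, 26)]            # hypothetical left/right point pairs
rng = np.random.default_rng(0)
all_faces = rng.normal(size=(20, 121, 3))         # stand-in for 20 recorded face masks
features = np.vstack([axis_distances(f, PAIRS) for f in all_faces])
labels = KMeans(n_clusters=2, n_init=10).fit_predict(features)   # e.g. healthy vs. palsy groups
```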
A system based on face mesh learning has been presented in [7]. The suggested approach first detects the face in every picture and then uses Google's Mediapipe to create a face mesh from it. Two strategies have been used for the detection task; both include face recognition, feature extraction, and categorization. The first technique employs a conventional machine learning strategy based on multiple distance measurements between facial landmarks, with classification performed after manual data pre-processing. The second approach is based on deep learning: the created mesh is fed to a MobileNet for classification. Face landmark detection provides all the essential components of an individual's face; after the most significant points are retrieved from an image, different distances between the key points are calculated. Four traditional classifiers were used in this study for performance comparison: Support Vector Machine, XGBoost, K-Nearest Neighbor, and Random Forest. Of the four classifiers, the Random Forest classifier gives the highest accuracy of 94.8%. However, the face mesh-based method obtains an accuracy of 98.93%, which is higher than the performance of all the other classifiers.
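A minimal sketch of the landmark-distance strategy is shown below using Mediapipe's FaceMesh. The specific landmark indices and distance pairs are illustrative assumptions; the paper's exact measurement set is not reproduced.

```python
# Sketch, assuming Mediapipe FaceMesh: extract normalized landmarks and compute
# a few inter-landmark distances as features for a classical classifier.
import cv2
import mediapipe as mp
import numpy as np

mp_face_mesh = mp.solutions.face_mesh

def distance_features(image_bgr, pairs=((33, 263), (61, 291), (70, 300))):
    rgb = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB)
    with mp_face_mesh.FaceMesh(static_image_mode=True, max_num_faces=1) as fm:
        result = fm.process(rgb)
    if not result.multi_face_landmarks:
        return None
    lm = result.multi_face_landmarks[0].landmark      # 468 normalized (x, y, z) points
    pts = np.array([[p.x, p.y] for p in lm])
    return np.array([np.linalg.norm(pts[a] - pts[b]) for a, b in pairs])

# Feature vectors from many labelled images could then feed a classical model,
# e.g. sklearn.ensemble.RandomForestClassifier, as in the paper's first strategy.
```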
Regional information from face photographs has been used to assess facial paralysis in [8]. In this method, the degree of paralysis is evaluated and classified into three levels: healthy, slight palsy, and strong palsy. The method begins by recognizing the face in the input image, then uses a shape estimator to locate the facial cues. Twenty-nine symmetry features are calculated from these facial cues. The approach primarily examines and measures the variations between the two sides of the face, in particular the position and arrangement of the facial features. The multi-layer perceptron (MLP), support vector machine (SVM), k-nearest neighbor (KNN), and multinomial logistic regression (MNLR) approaches were used to set up four classifiers in this research. The SVM provides better performance than the other methods, followed by the multi-layer perceptron. It was found that segmenting the face into distinct areas makes it easier to identify and evaluate the condition with fewer features. Since the human face is not perfectly symmetrical, it is challenging to evaluate facial weakness using symmetry/asymmetry features. In addition, a significantly larger number of healthy participants with various levels of facial asymmetry would be required to accurately assess the effectiveness of this algorithm in clinical practice.
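A minimal scikit-learn sketch of the classification stage is shown below: a vector of symmetry measurements is mapped to one of three levels. The feature matrix here is synthetic; in the paper the 29 features come from landmark positions on the two half-faces.

```python
# Sketch: three-level severity classification from precomputed symmetry features.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 29))                 # 300 samples x 29 symmetry features (synthetic)
y = rng.integers(0, 3, size=300)               # 0 = healthy, 1 = slight, 2 = strong palsy

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
clf.fit(X_tr, y_tr)
print("held-out accuracy:", clf.score(X_te, y_te))
```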
In [9], an adaptive local-global relational network (ALGRNet) assesses the degree of facial paralysis. The system consists of an adaptive region learning module, a skip-BiLSTM module with inhibition modeling, and a feature fusion and refining module. ALGRNet automatically represents the relevant muscle regions in terms of their various presentations and unique traits. Instead of predefined muscle regions based on landmarks, two simple fully connected networks adaptively learn the offsets and scaling factors for all action-unit (AU) regions. After face detection, ResNet- and Transformer-based methods were used as comparison baselines for severity prediction. Two linear networks obtain the positions and scaling parameters for the AU centers, and a straightforward landmark localization network locates the landmarks. The feature values are fed to the multi-branch network, where the skip-BiLSTM module uses several data transfer mechanisms to identify positive and negative relationships between the different AU branches. The performance of the AU recognition module is evaluated using the BP4D and DISFA datasets, whereas the performance of facial paralysis severity detection is evaluated using an FP dataset called Para. The F1 score (in %) is used for performance comparison, and the proposed system obtains an F1 score of about 75% against the ResNet- and Transformer-based methods.
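The sketch below illustrates, in PyTorch, the adaptive region learning idea: two small fully connected networks predict per-AU offsets and scaling factors from a pooled global feature, which are then applied to landmark-based region centers. The dimensions, number of AUs, and overall structure are assumptions for illustration, not the published ALGRNet implementation.

```python
# Sketch: predicting AU-region offsets and scales from a global feature vector.
import torch
import torch.nn as nn

class AdaptiveAURegions(nn.Module):
    def __init__(self, feat_dim=512, num_aus=12):
        super().__init__()
        self.offset_net = nn.Sequential(nn.Linear(feat_dim, 128), nn.ReLU(),
                                        nn.Linear(128, num_aus * 2))   # (dx, dy) per AU
        self.scale_net = nn.Sequential(nn.Linear(feat_dim, 128), nn.ReLU(),
                                       nn.Linear(128, num_aus))        # one scale per AU

    def forward(self, global_feat, base_centers):
        # global_feat: (N, feat_dim); base_centers: (N, num_aus, 2) landmark-based centers
        n, num_aus, _ = base_centers.shape
        offsets = self.offset_net(global_feat).view(n, num_aus, 2)
        scales = torch.sigmoid(self.scale_net(global_feat)).unsqueeze(-1) * 2  # in (0, 2)
        return base_centers + offsets, scales       # adapted centers and region scales
```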
III. COMPARISON TABLE
TABLE I
| No | Name | Techniques/Methods | Dataset | Performance |
|----|------|--------------------|---------|-------------|
| 1 | paraFaceTest: an ensemble of regression tree-based facial features extraction for efficient facial paralysis classification [3] | Ensemble of Regression Trees (ERT) | Facial images from 440 individuals (not available due to patient privacy) | Sensitivity - 97.48%, Specificity - 94.91%, AUC - 97.48% |
| 2 | Neurologist Standard Classification of Facial Nerve Paralysis with Deep Neural Networks [4] | Inception-v3 and DeepID networks | FNP dataset | Accuracy - 97.5% |
| 3 | Region-based parallel hierarchy convolutional neural network for automatic facial nerve paralysis evaluation [5] | Parallel hierarchy convolutional neural network (PHCNN) combined with a Long Short-Term Memory (LSTM) network | YFP database (images captured from YouTube videos) | Accuracy - 94.81%, Precision - 95.6%, Recall - 94.8% |
| 4 | Facial-Paralysis Diagnostic System Based on 3D Reconstruction [6] | Active Appearance Models (AAM) | Images collected by a Kinect sensor | Accuracy not mentioned |
| 5 | Facial Paralysis Recognition Using Face Mesh-Based Learning [7] | Mediapipe, MobileNetV2 | YouTube facial paralysis database | Precision - 99%, Recall - 99%, Accuracy - 98.93% |
| 6 | Automatic Facial Palsy Diagnosis as a Classification Problem Using Regional Information Extracted from a Photograph [8] | Histogram of Oriented Gradients descriptor; multi-layer perceptron (MLP), support vector machine (SVM), k-nearest neighbor (KNN), and multinomial logistic regression (MNLR) | YouTube Facial Palsy (YFP), CK+ | Accuracy - 95.61% |
| 7 | Automatic Facial Paralysis Estimation with Facial Action Units [9] | Adaptive Local-Global Relational Network (ALGRNet), skip-BiLSTM | BP4D and DISFA | F1 score - 75.4% |
IV. CONCLUSION
Facial paralysis (FP) is a neuromuscular condition that results in facial weakness, impairment of facial expressions, and difficulty in moving the face. Measuring the dissimilarity of prominent facial characteristics is an essential aspect of facial paralysis detection. Various automated methods have been developed over the years to give medical professionals a precise and accurate estimation of the paralysis of the facial muscles. All these systems have mainly two phases: a face detection phase and a facial paralysis severity level estimation phase. Most of the systems not only distinguish healthy from unhealthy individuals but also categorize them by the degree of the condition. Knowing the severity level is highly important, as it is required to give appropriate treatment to the patient. Of the reviewed FP detection systems, facial paralysis detection using face mesh-based learning outperforms all other methods, achieving 98.93% accuracy in the detection task.
REFERENCES
[1] https://www.medicalnewstoday.com/articles/facial-paralysis
[2] Ngô, T. H., Seo, M., Matsushiro, N., & Chen, Y. W. (2016). Evaluation of facial paralysis based on spatial features of filtered images.
[3] Barbosa, J., Seo, W. K., & Kang, J. (2019). paraFaceTest: an ensemble of regression tree-based facial features extraction for efficient facial paralysis classification. BMC Medical Imaging, 19(1), 1-14.
[4] Song, A., Wu, Z., Ding, X., Hu, Q., & Di, X. (2018). Neurologist standard classification of facial nerve paralysis with deep neural networks. Future Internet, 10(11), 111.
[5] Liu, X., Xia, Y., Yu, H., Dong, J., Jian, M., & Pham, T. D. (2020). Region-based parallel hierarchy convolutional neural network for automatic facial nerve paralysis evaluation. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 28(10), 2325-2332.
[6] Khairunnisaa, A., Basah, S. N., Yazid, H., Basri, H. H., Yaacob, S., & Chin, L. C. (2015, May). Facial-paralysis diagnostic system based on 3D reconstruction. In AIP Conference Proceedings (Vol. 1660, No. 1, p. 070026). AIP Publishing LLC.
[7] Baig, Z. M., & van der Haar, D. (2023). Facial Paralysis Recognition Using Face Mesh-Based Learning.
[8] Parra-Dominguez, G. S., Garcia-Capulin, C. H., & Sanchez-Yanez, R. E. (2022). Automatic Facial Palsy Diagnosis as a Classification Problem Using Regional Information Extracted from a Photograph. Diagnostics, 12(7), 1528. https://doi.org/10.3390/diagnostics12071528
[9] Ge, X., Jose, J. M., Wang, P., Iyer, A., Liu, X., & Han, H. (2022). Adaptive Local-Global Relational Network for Facial Action Units Recognition and Facial Paralysis Estimation. arXiv preprint arXiv:2203.01800.
[10] Jiang, C., Wu, J., Zhong, W., Wei, M., Tong, J., Yu, H., & Wang, L. (2020). Automatic facial paralysis assessment via computational image analysis. Journal of Healthcare Engineering, 2020.
[11] Tiemstra, J. D., & Khatkhate, N. (2007). Bell's palsy: diagnosis and management. American Family Physician, 76(7), 997-1002.
[12] Wang, T., Zhang, S., Dong, J., Liu, L., & Yu, H. (2016). Automatic evaluation of the degree of facial nerve paralysis. Multimedia Tools and Applications, 75(19), 11893-11908.
[13] Wu, Y., Hassner, T., Kim, K., Medioni, G., & Natarajan, P. (2017). Facial landmark detection with tweaked convolutional neural networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(12), 3067-3074.
Copyright © 2023 Raji S. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Paper Id : IJRASET52711
Publish Date : 2023-05-21
ISSN : 2321-9653
Publisher Name : IJRASET