Ijraset Journal For Research in Applied Science and Engineering Technology
Authors: Ankit Singh, A. Sai Nishwan, N Gautham, K. Sudhakar Reddy
DOI Link: https://doi.org/10.22214/ijraset.2024.59630
A crucial component of communication is emotion conveyed through facial expression. Evaluating human emotions with computer systems is therefore an intriguing problem that has attracted increasing attention over the past few decades, mainly because facial expression recognition can be applied in a variety of fields, including virtual reality, video games, human-computer interaction (HCI), and customer satisfaction analysis. The recognition process consists of three basic steps: face detection, facial feature extraction, and expression classification. Ekman's classification, with six emotional expressions (or seven, if a neutral expression is included), is the most common; other frequently encountered categorization schemes are Russell's circular (circumplex) model, which distinguishes up to 24 emotional expressions, and Plutchik's Wheel of Emotions. Over the past sixty years, not only have the techniques used in the three stages of the recognition process improved, but new techniques and algorithms have also emerged that detect faces with higher precision and a lower computational burden than the classic Viola-Jones detector. As a result, a variety of contemporary options are offered in the form of software development kits (SDKs). In this publication we describe our system for real-time emotion classification and its development. We wanted a system that covers all three stages of the recognition process while operating quickly, stably, and in real time, which is why we chose to use an existing SDK, the Affectiva SDK. With its help, facial landmarks are automatically identified in images captured with a conventional webcam. A geometric, feature-based methodology is employed for feature extraction: the features are distances between landmarks, and a brute-force search is used to select the best subset of features. The proposed system classifies the data using a neural network and can identify six (or seven, counting neutral) facial expressions: anger, disgust, fear, happiness, sadness, surprise, and neutral. We wish to highlight more than just the working part of our solution; we also describe how the measurements were established, the results we obtained, and the ways in which these results have shaped the direction of our future research.
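As a rough illustration of the geometric, distance-based features mentioned above, the sketch below computes all pairwise Euclidean distances between detected landmarks. The function name, the input shape, and the use of NumPy are assumptions made for illustration; this is a minimal sketch, not the exact feature-extraction code behind the system.

```python
import numpy as np
from itertools import combinations

def landmark_distance_features(landmarks):
    """Pairwise Euclidean distances between facial landmarks.

    landmarks: array-like of shape (N, 2) with (x, y) coordinates, e.g. as
    returned by a landmark detector. Returns a vector of length N*(N-1)/2
    that a brute-force search could then prune to the best subset.
    """
    pts = np.asarray(landmarks, dtype=float)
    feats = [np.linalg.norm(pts[i] - pts[j])
             for i, j in combinations(range(len(pts)), 2)]
    return np.array(feats)
```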
I. INTRODUCTION
The relationship between the information that a person's face reflects and conveys and their concurrent emotional state is currently a topic of great interest. Numerous recent studies have reported that facial expressions and cues can offer valuable insight into the categorization and interpretation of emotional states. Many studies have also addressed the identification of human behaviour more broadly: a person's movements can be identified from their skeleton, silhouette, biometric gait, or image, and visual surveillance approaches are employed to recognize group behaviours. To identify human movements, researchers have employed a variety of methods, including multi-modality joint representation and hierarchical probabilistic approaches. Darwin observed long ago that smiles are universal, meaning that most emotions on a human face are conveyed in a comparable way irrespective of cultural or racial background. Darwin argued that when we effectively convey our feelings, thoughts, and intentions to another person, our facial expressions can affect how others communicate with us. Furthermore, as he clearly notes in his research on human behaviour, these expressions also provide details about an individual's cognitive state, covering feelings such as boredom, tension, perplexity, and more. Darwin's work is exceptional because, during his lifetime, he developed comprehensive descriptions of the manifestations of over 40 emotional states, and already at that time he had concluded that emotional expressions are multimodal patterns of behaviour. Although facial expressions do not always convey emotions (consider, for example, the physiological characteristics of the face following a stroke), the majority of authors of professional publications refer to Ekman's classification of facial features when performing face detection. This is because Ekman's expressions are clearly and unmistakably identifiable, in contrast to many of the other emotions Darwin described (over 40 expressions). Ekman's six-emotion classification model is: 1. happiness, 2. sadness, 3. surprise, 4. fear, 5. anger, and 6. disgust. Later, Ekman expanded this list to include a neutral expression.
This system of categorizing feelings has become quite popular. Its primary benefit is that basic emotion-related facial expressions are easily identifiable and explainable, even to non-psychologists. Over the previous forty years, numerous studies on face identification and facial part recognition have been carried out, and this model of emotions is the one most often used for their detection. The degree of universality in identifying these emotive (prototypical) facial states, and the true meaning of these expressions, were questioned in the wake of Russell's critical point of view on the matter (embodied in his circular model) in 1994. According to psychology research, emotions are focused reactions to significant events that incorporate behavioural, physiological, and sensory elements. To improve user experience, a number of methods have been developed to help computers comprehend human emotions and affect, allowing computers to predict human intention more accurately and serve users more effectively. A comparison of individual methods reveals that widely available hardware, the conventional webcam found on the majority of computer systems, allows the extraction of crucial data regarding human facial expression. The widespread use of interactive social networking applications has made webcams a de facto standard device. A human observer can frequently infer, with good accuracy, what a person sitting in front of a camera is feeling. Recent research (over the previous ten years) in machine learning and video processing has shown that human emotions can be identified in webcam footage, particularly through facial features and eye movements. In this paper we describe the solution we developed and put into practice, which uses a webcam to detect the subject's emotions in real time. We encountered a number of problems while designing and building our system, which will be covered in depth.
II. RELATED WORK
A. “Seamless tracing of human behaviour using complementary wearable and house-embedded sensors.”
This article presents a multimodal system for seamlessly monitoring senior citizens in their homes. The system simultaneously uses a wearable sensor network carried by each subject and premise-embedded sensors specific to each environment. The article illustrates the benefits of combining data from two different mobility-sensing modalities: an accelerometer-based wearable network and visual-flow-based image analysis. Results are provided for both outdoor and indoor recognition of complex movements and of a number of basic postures. Rather than a detailed description of the entire system, the authors highlight an automatic danger-detection algorithm driven by two databases (premise-related and subject-related), a polar-histogram-based approach to visual pose recognition, and the complementary use and synchronisation of data from the wearable and premise-embedded networks. The approach is also novel in that it loads real-life recordings of the patient into the databases and uses the dynamic time-warping algorithm to estimate the distance between actions represented as simple poses in behavioural records. The main test results include: 95.5% accuracy of elementary pose recognition by the video system, 96.7% by the accelerometer-based system, 98.9% by the combined accelerometer- and video-based system, and 80% accuracy of complex outdoor activity recognition by the accelerometer-based wearable system.
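For readers unfamiliar with the dynamic time-warping step mentioned above, the following is a minimal, generic DTW sketch. The scalar cost function and the encoding of pose sequences as simple values are assumptions for illustration; this does not reproduce the cited system's implementation.

```python
import numpy as np

def dtw_distance(seq_a, seq_b, dist=lambda a, b: abs(a - b)):
    """Dynamic time-warping distance between two pose sequences.

    seq_a, seq_b: sequences of pose codes or feature values; `dist` is any
    pairwise cost function (absolute difference here for scalar codes).
    """
    n, m = len(seq_a), len(seq_b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(seq_a[i - 1], seq_b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j],      # insertion
                                 cost[i, j - 1],      # deletion
                                 cost[i - 1, j - 1])  # match
    return cost[n, m]
```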
B. “Towards multimodal emotion recognition in e-learning environments”
This research proposes FILTWAM (Framework for Improving Learning Through Webcams And Microphones), a framework for real-time emotion recognition in e-learning using webcams. Based on learners' facial expressions and verbalizations, FILTWAM provides timely and relevant feedback. The facial expression software module of FILTWAM was created and evaluated in a proof-of-concept study whose primary objective was to validate the use of webcam data for accurate and timely analysis of facial expressions into derived emotional states. Ten testers were used to calibrate the software; each was asked to replicate a given facial expression 100 times in computer-based exercises, and every session was videotaped. To validate the facial emotion recognition software, two experts graded and analysed the participants' recorded behaviours. Comparing the software results with the expert conclusions revealed an overall kappa value of 0.77, and the software's total accuracy, based on the required and recognised emotions, is 72%. Whereas existing software only allows for non-real-time, discontinuous, and invasive facial detection, this software is designed to monitor learners' behaviours continuously and unobtrusively, converting them directly into emotional states. Taking the learner's emotional state into account opens the door to improving the effectiveness and quality of e-learning.
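The kappa value of 0.77 reported above is an inter-rater agreement statistic. As a minimal illustration of how such a value is computed (assuming Cohen's kappa over matched label pairs), the sketch below compares two sets of labels; the toy labels are hypothetical and only demonstrate the calculation.

```python
def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    categories = set(labels_a) | set(labels_b)
    p_observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    p_chance = sum((labels_a.count(c) / n) * (labels_b.count(c) / n)
                   for c in categories)
    return (p_observed - p_chance) / (1 - p_chance)

# Toy example: software predictions vs. expert labels for five frames.
software = ["happy", "sad", "happy", "neutral", "angry"]
expert   = ["happy", "sad", "neutral", "neutral", "angry"]
print(round(cohens_kappa(software, expert), 2))  # -> 0.74
```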
C. “Face Expression Recognition and Analysis: The State of the Art.”
Automatic recognition of facial expressions has been studied extensively since the early 1990s. In recent years, significant progress has been made in face detection and tracking, feature extraction procedures, and expression categorization approaches. This paper summarizes some of the work published since 2001. It provides an extensive overview of the state of the art, along with a timeline of developments in the field, applications of automatic facial expression recognizers, the qualities of an ideal system, the databases that have been used, and advances in their standardisation. The paper also covers the latest developments in face detection, tracking, and feature extraction techniques, together with facial parameterization using MPEG-4 Facial Animation Parameters (FAPs) and FACS Action Units (AUs). Notes on emotions, expressions, and facial features are included, along with a discussion of the six prototypic expressions and the most recent research on expression classifiers. A brief note on the remaining challenges completes the paper, which is written in the form of a tutorial to assist researchers and students who are new to the topic.
D. “Development and evaluation of a web 2.0 annotation system as a learning tool in an e-learning environment, Computers and Education”
The advent of Web 2.0 technology offers additional avenues for promoting online collaboration and conversation in e-learning settings. The aim of this research was to create a Web 2.0 annotation system, MyNote, based on the fundamental Web 2.0 principles of accessibility and collaborative sharing, and to gather users' opinions on its usability. In this study, MyNote was used on multimedia learning items both inside and outside a learning management system (LMS). In the evaluation, factor analysis grouped users' perceptions of MyNote into interactivity, usefulness, helpfulness, and willingness for future use, and the variables helpfulness and interactivity were statistically significant predictors of future MyNote use. Finally, learners' opinions of MyNote were also influenced by their note-taking behaviour.
III. METHODOLOGY
In this project we have designed an application with two main modules: Person Registration and Facial Emotion Detection. In addition, when the application is shut down, an Excel sheet is produced that records the detected facial emotions along with their timestamp, person ID, and person name, and a PDF report is generated containing visual insights into the recorded emotions in the form of pie charts and bar plots (a sketch of this step is given below).
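The following is a minimal sketch of the logging and report-generation step described above, assuming a Python stack with pandas, openpyxl, and matplotlib. The function names, column names, and file names are illustrative assumptions, not the application's exact code.

```python
from datetime import datetime
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages

records = []  # filled during the detection session

def log_emotion(person_id, person_name, emotion):
    """Append one detection event with its timestamp."""
    records.append({"timestamp": datetime.now().isoformat(),
                    "person_id": person_id,
                    "person_name": person_name,
                    "emotion": emotion})

def export_reports(xlsx_path="emotions.xlsx", pdf_path="report.pdf"):
    """Write the session log to Excel and a PDF with pie/bar charts."""
    df = pd.DataFrame(records)
    df.to_excel(xlsx_path, index=False)   # requires openpyxl
    counts = df["emotion"].value_counts()
    with PdfPages(pdf_path) as pdf:
        fig, ax = plt.subplots()
        counts.plot.pie(ax=ax, autopct="%1.1f%%")  # share of each emotion
        ax.set_ylabel("")
        pdf.savefig(fig); plt.close(fig)
        fig, ax = plt.subplots()
        counts.plot.bar(ax=ax)                     # frame counts per emotion
        ax.set_xlabel("emotion"); ax.set_ylabel("frames")
        pdf.savefig(fig); plt.close(fig)
```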
Real-time face detection using a webcam, identification of the subject, and description of the subject's emotional state are today important in many areas of our lives. Our emotions are a reflection of our feelings and have an impact on our lives. The classification of emotional states can be applied in a variety of fields, including education (to assess student sentiment), business (to understand staff sentiment), commerce (particularly neuromarketing), and the automobile sector (to moderate driver aggression). In this article we described our proposed solution based on the Affectiva SDK. The solution has been tested extensively. The overall detection rate averages 84.27%; however, we obtain values of almost 100% when the subject is viewed from the front (assuming the subject's head is tilted by less than 15 degrees). When the subject's head tilts by more than 15 degrees, both face detection and emotional state classification are impaired. We are aware of this issue and plan to investigate it further.
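To make the real-time pipeline concrete, the sketch below shows a generic webcam loop using OpenCV's Haar-cascade (Viola-Jones) face detector as a stand-in for the SDK-based detection described above; classify_emotion is a placeholder for the trained neural-network classifier and is not part of the published system.

```python
import cv2

cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def classify_emotion(face_img):
    """Placeholder: the real system would run the trained classifier here."""
    return "neutral"

cap = cv2.VideoCapture(0)  # default webcam
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    for (x, y, w, h) in cascade.detectMultiScale(gray, 1.3, 5):
        emotion = classify_emotion(gray[y:y + h, x:x + w])
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
        cv2.putText(frame, emotion, (x, y - 10),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 255, 0), 2)
    cv2.imshow("emotion", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):  # press q to quit
        break
cap.release()
cv2.destroyAllWindows()
```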
REFERENCES
[1] P. Augustyniak, et al. Seamless tracing of human behaviour using complementary wearable and house-embedded sensors. Sensors, 14(5), 2014, p. 7831-7856.
[2] K. Bahreini, R. Nadolski, & W. Westera. Towards multimodal emotion recognition in e-learning environments. Interactive Learning Environments, ahead-of-print, 2014, p. 1-16.
[3] V. Bettadapura. Face Expression Recognition and Analysis: The State of the Art. arXiv Tech Report, 2012, p. 1-27.
[4] Y. Chen, R. Hwang, & C. Wang. Development and evaluation of a Web 2.0 annotation system as a learning tool in an e-learning environment. Computers and Education, 58(4), 2011, p. 1094-1105.
[5] C. Darwin. The Expression of the Emotions in Man and Animals. Oxford University Press, USA, 1998.
[6] P. Ekman. All Emotions Are Basic. In P. Ekman & R. Davidson (Eds.), The Nature of Emotion. New York: Oxford University Press, 1992, p. 15-19.
[7] P. Ekman. Strong evidence for universals in facial expressions: A reply to Russell's mistaken critique. Psychological Bulletin, 115(2), 1994, p. 268-287.
[8] P. Ekman & W. Friesen. Facial Action Coding System: Investigator's Guide. Palo Alto, CA: Consulting Psychologists Press, 1978.
[9] B. Fasel & J. Luettin. Automatic facial expression analysis: A survey. Pattern Recognition, 36(1), 2003, p. 259-275.
[10] M. Feidakis, T. Daradoumis, & S. Caballe. Emotion Measurement in Intelligent Tutoring Systems: What, When and How to Measure. Third International Conference on Intelligent Networking and Collaborative Systems, 2011, p. 807-812.
[11] G. Giannakakis, M. Pediaditis, D. Manousos, E. Kazantzaki, F. Chiarugi, P.G. Simos, & M. Tsiknakis. Stress and anxiety detection using facial cues from videos. Biomedical Signal Processing and Control.
[12] M.X. Huang, J. Li, G. Ngai, & H.V. Leong. StressClick: Sensing stress from gaze-click patterns. Proceedings of the 2016 ACM Multimedia Conference (MM 2016), 2016, p. 1395-1404. doi:10.1145/2964284.2964318.
[13] J. Ye, S. Dobson, & S. McKeever. Situation identification techniques in pervasive computing: A review. Pervasive and Mobile Computing, 8(1), 2012, p. 36-66.
[14] D. Keltner. Born to Be Good: The Science of a Meaningful Life. New York: W.W. Norton & Company, 2009.
[15] A. Kumar, A. Kumar, S.K. Singh, & R. Kala. Human Activity Recognition in Real-Time Environments using Skeleton Joints. International Journal of Interactive Multimedia and Artificial Intelligence, 3(7), 2016, p. 61-69.
[16] J. Li, G. Ngai, & H.V. Leong. Multimodal Human Attention Detection for Reading from Facial Expression, Eye Gaze, and Mouse Dynamics. Applied Computing Review, 16(3), 2016, p. 37-49.
[17] Y.F. Li, J. Zhang, & W. Wang. Active Sensor Planning for Multiview Vision Tasks. Vol. 1. Heidelberg: Springer, 2008.
[18] A.A. Liu, et al. Coupled hidden conditional random fields for RGB-D human action recognition. Signal Processing, 2015, p. 74-82.
[19] M. Magdin, M. Turcani, & L. Hudec. Evaluating the Emotional State of a User Using a Webcam. International Journal of Interactive Multimedia and Artificial Intelligence, 4(1), Special Issue, 2016, p. 61-68.
[20] D. McDuff, R. El Kaliouby, T. Senechal, M. Amr, J.F. Cohn, & R. Picard. Affectiva-MIT facial expression dataset (AM-FED): Naturalistic and spontaneous facial expressions collected 'in-the-wild'. IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, 2013, p. 881-888. doi:10.1109/CVPRW.2013.130.
Copyright © 2024 Ankit Singh, A. Sai Nishwan, N Gautham, K. Sudhakar Reddy. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Paper Id : IJRASET59630
Publish Date : 2024-03-30
ISSN : 2321-9653
Publisher Name : IJRASET