Facial emotion recognition plays a significant role in human-computer interaction and supports applications in fields such as medicine, mood-based content presentation, and security. It is challenging because of the heterogeneity of human faces and variations in lighting, orientation, pose, and noise. This paper aims to improve the accuracy of facial expression recognition. Much research has been done on the fer2013 dataset using CNNs (Convolutional Neural Networks), and the results are quite impressive. In this work we trained a CNN on the fer2013 dataset after adding images to improve accuracy. To the best of our knowledge, our model achieves an accuracy of 70.6% on the fer2013 dataset after adding images to the training and testing parts of the disgust class.
I. INTRODUCTION
Facial emotion recognition identifies the type of facial expression, such as angry, sad, or happy. It plays an important role in human-computer interaction in fields such as medicine, advertising, security, online gaming, customer feedback, and non-verbal communication [1]-[9].
There are many ways to recognize human emotion, ranging from facial expressions to body posture and voice tone. In this paper we focus on facial expression recognition. Facial Emotion Recognition (FER) is a thriving research area in which industry is making many advances, such as automatic translation systems and machine-to-human interaction. Facial emotion recognition also helps gauge customer feedback, so that advertisements can be served to customers based on their emotions.
In this paper we used the fer2013 dataset and modified it to improve accuracy: we added images to the dataset because it contains very few images of the disgust class. Facial emotion recognition is an active research topic with a great impact on human-computer interaction, and it is challenging because human faces differ with respect to age, environment, culture, fashion, lighting, and more.
II. PROBLEM DEFINITION
Having briefly discussed emotions in the introduction, we now turn to the problem definition. In older research on the same data, the highest reported accuracy was 65.5% across 7 different types of emotion [16]-[21]. Because this dataset has too few images for the disgust class, we added more images to it and then trained a CNN. In image classification, Convolutional Neural Networks (CNNs) have shown great potential due to their computational efficiency and feature extraction capability [13]-[22].
They are the most widely used deep neural networks for FER (facial emotion recognition) [10]-[16]. Facial emotion recognition infers the emotion from an image, and we used a CNN for classification. In a CNN, however, choosing the number of hidden layers, the filter sizes, the learning rate, and the number of iterations is challenging and has to be done manually, since the right choices depend on the type of data.
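As a concrete illustration, the following is a minimal sketch of this kind of CNN in tf.keras. The layer count, filter sizes, dense width, and dropout value here are illustrative placeholders rather than the exact configuration reported in this paper; only the learning rate of 0.01 and the 48x48 grayscale input come from our setup.

```python
# Minimal CNN sketch for 48x48 grayscale FER images (tf.keras).
# Layer counts, filter sizes, and dropout are illustrative, not the
# paper's final configuration; the learning rate of 0.01 matches the
# value used in our experiments.
import tensorflow as tf
from tensorflow.keras import layers, models

def build_fer_cnn(num_classes=7):
    model = models.Sequential([
        layers.Input(shape=(48, 48, 1)),
        layers.Conv2D(32, (3, 3), activation="relu", padding="same"),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), activation="relu", padding="same"),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(128, (3, 3), activation="relu", padding="same"),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(256, activation="relu"),
        layers.Dropout(0.5),  # dropout value was tuned during experiments
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(
        optimizer=tf.keras.optimizers.SGD(learning_rate=0.01),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```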
III. PROPOSED MODEL
We used the fer2013 dataset in our experiments, since it covers seven different classes of emotion and contains a significant number of images. The images are 48x48 grayscale crops containing only faces, so they are already preprocessed. We first measured the accuracy on this dataset with various numbers of hidden layers and achieved a highest accuracy of 66.05% after 50 iterations. We then proposed and evaluated two models.
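For reference, the following sketch loads fer2013, assuming the standard Kaggle CSV release (columns emotion, pixels, and Usage, with each image stored as space-separated pixel values); the file name is an assumption.

```python
# Sketch: load the fer2013 CSV (assumed standard Kaggle release with
# columns emotion, pixels, Usage). Each row encodes a 48x48 grayscale
# face as space-separated pixel values.
import numpy as np
import pandas as pd

def load_fer2013(csv_path="fer2013.csv"):
    df = pd.read_csv(csv_path)
    X = np.stack([
        np.asarray(p.split(), dtype=np.float32).reshape(48, 48, 1)
        for p in df["pixels"]
    ]) / 255.0
    y = df["emotion"].to_numpy()
    train = (df["Usage"] == "Training").to_numpy()
    # PublicTest and PrivateTest rows are lumped together as the test split.
    return (X[train], y[train]), (X[~train], y[~train])
```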
A. By Modification of Dataset
We proposed this model because the dataset has too few images of the disgust class. We assumed that adding images to the disgust class could improve accuracy, and we tested that assumption: we downloaded disgust images, added them to the dataset, and trained the CNN.
B. By using Position and Shape
Faces look different in different situations: the position and shape of the mouth, nose, eyebrows, cheeks, and forehead vary. We proposed a model that recognizes emotion from these factors. We used Google's MediaPipe package, which identifies landmark points on faces, to detect these points and compute the shape and position of the different parts of the face. We then made these measurements relative to the face size, because a model trained on absolute coordinates would fail when a face appears smaller in one image and larger in another. For the relative parameter we calculated the ratio of face height to face width, and using this we scaled the shapes and positions of the face parts.
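A minimal sketch of this landmark step is shown below, using MediaPipe's Face Mesh API. Normalizing each landmark by the face's own bounding box is one plausible implementation of the size-relative scaling described above, not necessarily the exact computation we used.

```python
# Sketch: detect face landmarks with MediaPipe Face Mesh and express
# them relative to the face's own bounding box, so face size in the
# image does not affect the features. Bounding-box normalization is an
# assumption; the paper scales by the face height/width ratio.
import cv2
import mediapipe as mp
import numpy as np

def relative_landmarks(image_bgr):
    with mp.solutions.face_mesh.FaceMesh(static_image_mode=True,
                                         max_num_faces=1) as mesh:
        results = mesh.process(cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB))
    if not results.multi_face_landmarks:
        return None  # no face found
    pts = np.array([(lm.x, lm.y)
                    for lm in results.multi_face_landmarks[0].landmark])
    mins, maxs = pts.min(axis=0), pts.max(axis=0)
    return (pts - mins) / (maxs - mins)  # coordinates in [0, 1] per face
```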
IV. EXPERIMENT SETUP
To perform the experiments we used a Windows laptop with 16 GB of RAM, an i7 processor, and a 4 GB graphics card, which was sufficient for our needs. We installed Python 3.8, the TensorFlow package, CUDA, and the other required packages.
For the first proposed model we downloaded 1,500 images of the disgust class; some of the images contained multiple faces, and the images came in various sizes. We wrote a Python program that picks one image at a time and searches for faces in it; after finding the faces, it crops each face, scales the crop to 48x48, and converts it to grayscale. After this step the images are preprocessed and ready to be merged into the original dataset.
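A sketch of this preprocessing program follows; the paper does not name the face detector, so OpenCV's bundled Haar cascade is assumed here.

```python
# Sketch: find faces in a downloaded image, crop each one, resize to
# 48x48, and save as grayscale to match fer2013. The Haar-cascade
# detector is an assumption; any face detector would do.
import os
import cv2

detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def extract_faces(image_path, out_dir):
    gray = cv2.cvtColor(cv2.imread(image_path), cv2.COLOR_BGR2GRAY)
    stem = os.path.splitext(os.path.basename(image_path))[0]
    for i, (x, y, w, h) in enumerate(detector.detectMultiScale(gray, 1.3, 5)):
        face = cv2.resize(gray[y:y + h, x:x + w], (48, 48))
        cv2.imwrite(os.path.join(out_dir, f"{stem}_{i}.png"), face)
```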
For the second proposed model we downloaded 70,000 high-quality face images and classified them into the 7 classes manually. For this classification we created a small tool with 7 buttons: it displayed the images one by one, we identified the class of each image, and pressing the corresponding button moved that image into that class's folder.
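A minimal sketch of such a labeling tool is given below, assuming tkinter (the paper does not say which GUI toolkit was used), PNG or GIF input images, and illustrative folder names.

```python
# Sketch: show one image at a time with 7 class buttons; pressing a
# button moves the file into that class's folder. tkinter and the
# folder layout are assumptions; images must be PNG/GIF for PhotoImage.
import os
import shutil
import tkinter as tk

CLASSES = ["angry", "disgust", "fear", "happy", "sad", "surprise", "neutral"]
SRC_DIR = "unlabeled"  # downloaded face images (assumed folder name)

root = tk.Tk()
files = [os.path.join(SRC_DIR, f) for f in sorted(os.listdir(SRC_DIR))]
label = tk.Label(root)
label.pack()

def show_next():
    if not files:
        root.destroy()
        return
    img = tk.PhotoImage(file=files[-1])
    label.config(image=img)
    label.image = img  # keep a reference so tkinter does not discard it

def move_to(cls):
    os.makedirs(cls, exist_ok=True)
    shutil.move(files.pop(), cls)  # move the file currently on screen
    show_next()

for cls in CLASSES:
    tk.Button(root, text=cls, command=lambda c=cls: move_to(c)).pack(side="left")

if files:
    show_next()
    root.mainloop()
```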
V. RESULTS AND ANALYSIS
In the first experiment we grew the disgust class to 1,660 images and achieved an accuracy of 70.6% after 50 iterations with 6 hidden layers. During this experiment, however, the model's predictions shifted toward the disgust class. To resolve this issue we capped the disgust class at 1,200 images and repeated the experiment; this time the accuracy was 67.89%. We kept the learning rate at 0.01. Over the course of the experiments we varied the number of layers, the filter sizes, the dropout values, and the max-pooling size.
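The rebalancing step can be sketched as follows; the 1,200-image cap comes from the experiment above, while random undersampling is an assumption about how the cap was applied.

```python
# Sketch: cap one class at a fixed number of images by random
# undersampling. The cap of 1200 matches the experiment above; the
# random selection strategy is an assumption.
import numpy as np

def cap_class(X, y, class_id, max_count=1200, seed=0):
    rng = np.random.default_rng(seed)
    idx = np.flatnonzero(y == class_id)
    drop = rng.choice(idx, size=max(0, len(idx) - max_count), replace=False)
    keep = np.setdiff1d(np.arange(len(y)), drop)
    return X[keep], y[keep]
```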
In the second proposed model, after classifying the dataset, the class sizes were far too imbalanced: the happy class had more than 50,000 images, while the sad and disgust classes had fewer than 300 each. Due to this imbalance we abandoned the experiment.
VI. APPLICATIONS
With the rapid growth of technology, demand for emotion recognition has increased. It is used in many areas:
Medical fields
Security
Advertisement
Feedback system
Online teaching
Mood music
Psychological studies
Filter suggestion
Social media
Advisory system
Food feedback
Child care
Automatic system of advice
VII. CONCLUSION
This paper achieved a highest accuracy of 70.6%, a clear improvement over older research and studies. In the future, the second proposed model may achieve much better accuracy, and the structure of the first proposed model can also be improved further. Emotions arise from brain activity and are expressed through the face and voice. The purpose of this paper has been to improve accuracy and to briefly introduce the methods, implementation, and applications of facial emotion recognition.
REFERENCES
[1] J. L. Andreassi, Psychophysiology: Human Behavior and Physiological Response. New Jersey: Lawrence Erlbaum Associates, 2000.
[2] W. Ark, D. C. Dryer, and D. J. Lu, "The emotion mouse," in Proc. 8th Int. Conf. Human-Computer Interaction, 1999, pp. 818–823.
[3] B.-H. Juang and S. Furui, "Automatic recognition and understanding of spoken language—a first step toward natural human-machine communication," Proc. IEEE, vol. 88, pp. 1142–1165, 2000.
[4] W. Boucsein, Electrodermal Activity. New York: Plenum Press, 1992; P. M. T. Broersen, "Facts and fiction in spectral analysis," IEEE Trans. Instrum. Meas., vol. 49, pp. 766–772, 2000.
[5] E. Sariyanidi, H. Gunes, and A. Cavallaro, "Automatic analysis of facial affect: A survey of registration, representation, and recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 37, no. 6, pp. 1113–1133, 2015.
[6] B. Fasel and J. Luettin, "Automatic facial expression analysis: A survey," Pattern Recognition, vol. 36, no. 1, 2003, doi: 10.1016/S0031-3203(02)00052-3.
[7] M. S. Bartlett, G. Littlewort, I. Fasel, and J. R. Movellan, “Real Time Face Detection and Facial Expression Recognition: Development and Applications to Human Computer Interaction,” in IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, 2003, vol. 5, doi: 10.1109/CVPRW.2003.10057.
[8] B. Fasel and J. Luettin, "Automatic facial expression analysis: A survey," Pattern Recognition, vol. 36, no. 1, 2003, doi: 10.1016/S0031-3203(02)00052-3.
[9] C. C. Chibelushi and F. Bourel, "Facial Expression Recognition: A Brief Tutorial Overview."
[10] M. S. Bartlett, G. Littlewort, I. Fasel, and J. R. Movellan, “Real Time Face Detection and Facial Expression Recognition: Development and Applications to Human Computer Interaction,” in IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, 2003, vol. 5, doi: 10.1109/CVPRW.2003.10057.
[11] F. Abdat, C. Maaoui, and A. Pruski, “Human-computer interaction using emotion recognition from facial expression,” in Proceedings - UKSim 5th European Modeling Symposium on Computer Modeling and Simulation, EMS 2011, 2011, doi: 10.1109/EMS.2011.20.
[12] Int. J. Image, Graphics and Signal Processing, vol. 8, pp. 50–56, Aug. 2012, published online in MECS (http://www.mecs-press.org/), doi: 10.5815/ijigsp.2012.08.07.
[13] International Journal of Engineering and Advanced Technology (IJEAT), ISSN 2249-8958, vol. 8, no. 6S, Aug. 2019.
[14] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," Commun. ACM, vol. 60, no. 6, 2017, doi: 10.1145/3065386.
[15] N. Mehendale, “Facial emotion recognition using convolutional neural networks (FERC),” SN Appl. Sci., vol. 2, no. 3, 2020, doi: 10.1007/s42452-020-2234-1.
[16] V. Tümen, Ö. F. Söylemez, and B. Ergen, “Facial emotion recognition on a dataset using Convolutional Neural Network,” in IDAP 2017 - International Artificial Intelligence and Data Processing Symposium, 2017, doi: 10.1109/IDAP.2017.8090281.
[17] D. K. Jain, P. Shamsolmoali, and P. Sehdev, “Extended deep neural network for facial emotion recognition,” Pattern Recognit. Lett., vol. 120, 2019, doi: 10.1016/j.patrec.2019.01.008.
[18] O. Gervasi, V. Franzoni, M. Riganelli, and S. Tasso, “Automating facial emotion recognition,” Web Intell., vol. 17, no. 1, 2019, doi: 10.3233/WEB-190397.
[19] M. M. Taghi Zadeh, M. Imani, and B. Majidi, “Fast Facial emotion recognition Using Convolutional Neural Networks and Gabor Filters,” in 2019 IEEE 5th Conference on Knowledge Based Engineering and Innovation, KBEI 2019, 2019, doi: 10.1109/KBEI.2019.8734943.
[20] E. Pranav, S. Kamal, C. Satheesh Chandran, and M. H. Supriya, "Facial Emotion Recognition Using Deep Convolutional Neural Network," in 2020 6th International Conference on Advanced Computing and Communication Systems, ICACCS 2020, 2020, doi: 10.1109/ICACCS48705.2020.9074302.