Ijraset Journal For Research in Applied Science and Engineering Technology
Authors: Shalini Singh, Abhishek Singh Chauhan
DOI Link: https://doi.org/10.22214/ijraset.2023.51708
In recent years, researchers have continued to refine deep learning-based approaches to image processing and have explored new areas such as generative adversarial networks (GANs) and reinforcement learning. This paper surveys deep learning-based methods for face recognition, including CNN-based models, autoencoder models, and hybrid models. The reviewed papers demonstrate the effectiveness of deep convolutional neural networks (CNNs) in face detection, recognition, and attendance compilation, reporting state-of-the-art accuracy on several benchmark datasets, including LFW and YouTube Faces (YTF). EfficientNet is a family of CNNs that achieves state-of-the-art accuracy on multiple image recognition benchmarks while being significantly smaller and faster than previous models. The ArcFace loss function is used for deep face recognition, real-time CNNs for emotion and gender classification in facial images, and a multi-scale residual network for facial landmark detection. Multitask cascaded CNNs are used for joint face detection and alignment. The DeepID3 model achieves high accuracy on the LFW dataset at a high computational cost. The survey also covers a lightweight and efficient CNN for mobile face recognition.
I. INTRODUCTION
The field of image processing has a rich history, dating back several decades. In the 1960s, researchers such as Willard S. Boyle and George E. Smith at Bell Labs [1] invented the Charge-Coupled Device (CCD), a type of image sensor that could capture and store electronic images. In the 1970s, researchers such as Nils Aall Barricelli and Kunihiko Fukushima [2] developed early models of neural networks, which would later become important tools in image processing and computer vision. In the 1980s, researchers such as David Marr and Tomaso Poggio [3] proposed a computational theory of vision, which described how the human visual system processes and interprets images. In the 1990s, researchers such as Shree K. Nayar [4] and David G. Lowe [5] developed algorithms for feature detection and matching, which are key techniques in modern computer vision and image processing. In the 2000s, researchers such as Paul Viola and Michael Jones [6] developed the Viola-Jones algorithm for face detection, which uses Haar-like features and a cascade of classifiers to rapidly detect faces in images.
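The Haar-like features and classifier cascade mentioned above can be illustrated with a small sketch. It assumes a grayscale image stored as a nested Python list and uses the integral image (summed-area table) from Viola and Jones [6] so that any rectangle sum costs four lookups. The function names, the toy image, and the stage thresholds are hypothetical, and each cascade stage here is a single thresholded feature, a deliberate simplification of the boosted stages in the original algorithm.

```python
def integral_image(img):
    """Return a (h+1) x (w+1) summed-area table for a 2-D list of numbers."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of the pixels inside a rectangle, using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def two_rect_haar(ii, top, left, height, width):
    """Two-rectangle Haar-like feature: left half minus right half."""
    half = width // 2
    left_sum = rect_sum(ii, top, left, height, half)
    right_sum = rect_sum(ii, top, left + half, height, half)
    return left_sum - right_sum


def cascade_accepts(ii, stages):
    """Evaluate hypothetical stages in order and reject at the first failure."""
    for top, left, height, width, threshold in stages:
        if two_rect_haar(ii, top, left, height, width) < threshold:
            return False  # early rejection keeps the average cost low
    return True


# Toy 4x4 image: bright left half, dark right half (a vertical edge).
image = [
    [9, 9, 1, 1],
    [9, 9, 1, 1],
    [9, 9, 1, 1],
    [9, 9, 1, 1],
]
ii = integral_image(image)
edge_response = two_rect_haar(ii, 0, 0, 4, 4)
accepted = cascade_accepts(ii, [(0, 0, 4, 4, 50), (0, 0, 2, 4, 20)])
```

The early return in `cascade_accepts` is the reason a cascade is fast: most image windows contain no face and are discarded after evaluating only the first few features.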
In the 2010s, deep learning-based approaches to image processing and computer vision became increasingly popular, with researchers such as Alex Krizhevsky, Geoffrey Hinton [7], and Yann LeCun [8] developing deep neural networks for image classification and object detection. In recent years, researchers have continued to refine deep learning-based approaches to image processing and have explored new areas such as generative adversarial networks (GANs) and reinforcement learning.
Image processing has given rise to multidisciplinary applications for user convenience. One of these is compiling student attendance by recognizing students' faces, which saves considerable time in the classroom and builds a software-based database. Figure 1 presents the decade-by-decade progress of image processing described above as a hierarchical tree chart.
"Deep Residual Learning for Image Recognition" by He et al. [32]. This paper introduced the ResNet architecture, a deep CNN with residual connections that achieved state-of-the-art accuracy on several image recognition benchmarks, including ImageNet.
"DeepID-Net: Deformable Deep Convolutional Neural Networks for Object Detection" by Ouyang et al. [34]. This paper proposed the DeepID-Net model, a deep CNN for object detection that achieved state-of-the-art accuracy on the PASCAL VOC and COCO datasets.
"DeepID-Net 2.0: Object Detection with Deformable Part-Based Convolutional Neural Networks" by Ouyang et al. [33]. This paper proposed an improved version of the DeepID-Net model, achieving state-of-the-art accuracy on the PASCAL VOC and COCO datasets.
"Deep Learning for Face Recognition: A Survey" by Wen et al. [35]. This paper provides a comprehensive survey of deep learning-based methods for face recognition, including CNN-based models, autoencoder-based models, and hybrid models.
These papers demonstrate the effectiveness of deep learning-based methods in face detection, recognition, and attendance compilation, achieving state-of-the-art accuracy on several benchmark datasets.
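To make the attendance compilation step mentioned above concrete, the following is a minimal sketch under stated assumptions. It assumes some face recognition model has already mapped each enrolled student and each face detected in the classroom to a fixed-length, non-zero embedding vector. The student IDs, the three-dimensional toy embeddings, the function names, and the 0.8 similarity threshold are all hypothetical and are not taken from any of the surveyed papers.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mark_attendance(enrolled, detected, threshold=0.8):
    """Return the IDs of enrolled students matched by at least one detected face.

    enrolled:  dict mapping student ID -> reference embedding
    detected:  list of embeddings for faces found in the classroom image
    threshold: hypothetical minimum similarity needed to accept a match
    """
    present = set()
    for face in detected:
        best_id, best_score = None, threshold
        for student_id, reference in enrolled.items():
            score = cosine_similarity(face, reference)
            if score >= best_score:
                best_id, best_score = student_id, score
        if best_id is not None:
            present.add(best_id)
    return present


# Toy embeddings; a real system would use vectors produced by a trained CNN.
enrolled = {
    "S001": [1.0, 0.0, 0.0],
    "S002": [0.0, 1.0, 0.0],
    "S003": [0.0, 0.0, 1.0],
}
detected = [
    [0.9, 0.1, 0.0],   # close to S001
    [0.1, 0.0, 0.95],  # close to S003
    [0.5, 0.5, 0.5],   # not close enough to anyone
]
present = mark_attendance(enrolled, detected)
```

Keeping the threshold explicit makes the trade-off visible: a lower value marks more students present but risks matching a face to the wrong student, while a higher value does the opposite.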
E. Section 5: Deep Learning as the State of the Art in Image Processing
"EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks" by Tan and Le [36]. This paper proposes the EfficientNet model, a family of CNNs that achieve state-of-the-art accuracy on multiple image recognition benchmarks while being significantly smaller and faster than previous models.
"ArcFace: Additive Angular Margin Loss for Deep Face Recognition" by Deng et al. [37]. This paper proposes the ArcFace loss function for deep face recognition, which achieved state-of-the-art accuracy on multiple face recognition datasets, including LFW, CFP, and AgeDB.
"Real-time Convolutional Neural Networks for Emotion and Gender Classification" by Pervaiz et al. [38]. This paper proposes a real-time CNN for emotion and gender classification in facial images, achieving high accuracy rates on several benchmark datasets.
"Facial Landmark Detection Using Multi-Scale Residual Network" by Zhang et al. [39]. This paper proposes a multi-scale residual network for facial landmark detection, achieving state-of-the-art accuracy on several benchmark datasets, including 300W, AFLW, and COFW.
"Multi-task Cascaded Convolutional Networks for Joint Face Detection and Alignment" by Zhang et al. [40]. This paper proposes an improved version of the multitask CNN for face detection and alignment, achieving state-of-the-art accuracy on the WIDER FACE and COFW datasets.
"Lightweight and Efficient Convolutional Neural Networks for Mobile Face Recognition" by Zhang et al. [41]. This paper proposes a lightweight and efficient CNN for mobile face recognition, achieving high accuracy rates on the LFW and MegaFace datasets while being significantly smaller and faster than previous models.
These papers demonstrate the continuing development and improvement of deep learning-based methods in face detection, recognition, and attendance compilation, with a focus on achieving higher accuracy while being smaller and more efficient. Table 2 summarizes the key contributions of each work, the methodology adopted, and the datasets used.
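The model scaling idea behind EfficientNet [36] can be sketched in a few lines. The sketch assumes the compound scaling rule from that paper, in which network depth, width, and input resolution are scaled together by a single coefficient phi, using the base coefficients reported there for the baseline network (1.2, 1.1, and 1.15). The function name and return format are hypothetical, and the released EfficientNet variants use rounded settings, so the numbers below are illustrative rather than the exact published configurations.

```python
# Base coefficients as reported for the EfficientNet baseline network;
# they are chosen so that ALPHA * BETA**2 * GAMMA**2 is approximately 2.
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15


def compound_scale(phi):
    """Return (depth, width, resolution, flops) multipliers for a given phi.

    FLOPs of a convolutional network grow roughly linearly with depth and
    quadratically with width and resolution, hence the squared terms.
    """
    depth = ALPHA ** phi
    width = BETA ** phi
    resolution = GAMMA ** phi
    flops = depth * width ** 2 * resolution ** 2
    return depth, width, resolution, flops


for phi in range(4):
    d, w, r, f = compound_scale(phi)
    print(f"phi={phi}: depth x{d:.2f}, width x{w:.2f}, "
          f"resolution x{r:.2f}, FLOPs x{f:.2f}")
```

Because the product of the three base terms is close to 2, each increment of phi roughly doubles the compute budget while keeping the three dimensions in balance.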
Table 2: A review of work on deep learning as the state of the art in image processing

| Paper Title | Authors | Methodology | Key Contribution | Dataset(s) | Results | Year |
| --- | --- | --- | --- | --- | --- | --- |
| EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks | Tan and Le [36] | CNN model scaling | EfficientNet achieves state-of-the-art accuracy on multiple image recognition benchmarks while being smaller and faster than previous models | ImageNet, CIFAR-10, CIFAR-100 | State-of-the-art accuracy | 2019 |
| ArcFace: Additive Angular Margin Loss for Deep Face Recognition | Deng et al. [37] | Face recognition | ArcFace loss function achieves state-of-the-art accuracy on multiple face recognition datasets | LFW, CFP, AgeDB | State-of-the-art accuracy | 2019 |
| Real-time Convolutional Neural Networks for Emotion and Gender Classification | Pervaiz et al. [38] | CNN for emotion and gender classification | Real-time CNN achieves high accuracy rates on several benchmark datasets | AffectNet, FER-2013, CK+, RAF-DB, Adience, CelebA | High accuracy rates | 2020 |
| Facial Landmark Detection Using Multi-Scale Residual Network | Zhang et al. [39] | Multi-scale residual network | Multi-scale residual network achieves state-of-the-art accuracy on several benchmark datasets for facial landmark detection | 300W, AFLW, COFW | State-of-the-art accuracy | 2020 |
| Multi-task Cascaded Convolutional Networks for Joint Face Detection and Alignment | Zhang et al. [40] | Multitask CNN for face detection and alignment | Improved version of multitask CNN achieves state-of-the-art accuracy on several benchmark datasets | WIDER FACE, COFW | State-of-the-art accuracy | 2020 |
| Lightweight and Efficient Convolutional Neural Networks for Mobile Face Recognition | Zhang et al. [41] | Lightweight and efficient CNN for mobile face recognition | CNN achieves high accuracy rates on LFW and MegaFace datasets while being smaller and faster than previous models | LFW, MegaFace | High accuracy rates | 2020 |
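The additive angular margin of ArcFace [37] can be illustrated with a minimal sketch for a single training sample. It assumes the standard formulation: the embedding and the class weight vectors are L2-normalized, the margin is added to the angle between the embedding and its true class weight, and the resulting cosines are scaled and passed through softmax cross-entropy. The function names and toy vectors are hypothetical, the default scale of 64 and margin of 0.5 are treated here as assumptions, and practical implementations add safeguards for angles where adding the margin would exceed pi, which this sketch omits.

```python
import math


def l2_normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def arcface_loss(embedding, class_weights, label, scale=64.0, margin=0.5):
    """Additive angular margin loss for one sample (a simplified sketch).

    embedding:     feature vector of the face image
    class_weights: one weight vector per identity
    label:         index of the true identity
    """
    e = l2_normalize(embedding)
    logits = []
    for j, w in enumerate(class_weights):
        cos_theta = sum(a * b for a, b in zip(e, l2_normalize(w)))
        cos_theta = max(-1.0, min(1.0, cos_theta))  # guard against rounding
        if j == label:
            # Penalize the true class by adding the margin to its angle.
            cos_theta = math.cos(math.acos(cos_theta) + margin)
        logits.append(scale * cos_theta)
    # Numerically stable softmax cross-entropy: logsumexp(logits) - target.
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum_exp - logits[label]


weights = [[1.0, 0.0], [0.0, 1.0]]   # two toy identities
sample = [0.9, 0.3]                  # embedding that mostly matches identity 0
loss_with_margin = arcface_loss(sample, weights, label=0)
loss_without_margin = arcface_loss(sample, weights, label=0, margin=0.0)
```

With `margin=0.0` the function reduces to softmax cross-entropy over scaled cosine similarities, so comparing the two values shows how the margin makes the same sample harder to classify and pushes embeddings of the same identity closer together during training.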
After reviewing all the papers, Table 3 presents a comparative analysis of the merits and limitations of selected papers, which vary with the specific task and methodology of each paper. Overall, the papers that achieve state-of-the-art accuracy in their respective tasks tend to have the best outcomes, while those restricted to limited applications or datasets tend to have the worst outcomes.
Table 3: A comparison of the merits and demerits of selected papers

| Reference | Paper Title | Merits | Demerits |
| --- | --- | --- | --- |
| [42] | Viola-Jones Face Detection Framework | High detection rate | High false positive rate |
| [43] | Histogram of Oriented Gradients for Human Detection | High detection rate | Sensitive to lighting and shadow changes |
| [44] | FaceNet: A Unified Embedding for Face Recognition and Clustering | High accuracy in face recognition and clustering | Limited dataset for training |
| [45] | DeepID3: Face Recognition with Very Deep Neural Networks | High accuracy in face recognition | High computational cost |
| [46] | Deep Learning Face Attributes in the Wild | High accuracy in attribute classification | Limited to a specific set of facial attributes |
| [47] | A Fast and Accurate System for Face Detection, Identification, and Verification | High accuracy in face detection, identification, and verification | Limited to a specific dataset |
| | DeepFace: Closing the Gap to Human-Level Performance in Face Verification | High accuracy in face verification | Requires large amounts of labelled data for training |
| [36] | EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks | State-of-the-art accuracy on multiple image recognition benchmarks while being smaller and faster than previous models | Limited to image recognition tasks |
| [37] | ArcFace: Additive Angular Margin Loss for Deep Face Recognition | State-of-the-art accuracy on multiple face recognition datasets | Limited to face recognition tasks |
| [38] | Real-time Convolutional Neural Networks for Emotion and Gender Classification | High accuracy rates on several benchmark datasets for emotion and gender classification | Limited to emotion and gender classification tasks |
| [39] | Facial Landmark Detection Using Multi-Scale Residual Network | State-of-the-art accuracy on several benchmark datasets for facial landmark detection | Limited to facial landmark detection tasks |
| [40] | Multi-task Cascaded Convolutional Networks for Joint Face Detection and Alignment | State-of-the-art accuracy on several benchmark datasets for face detection and alignment | Limited to face detection and alignment tasks |
| [41] | Lightweight and Efficient Convolutional Neural Networks for Mobile Face Recognition | High accuracy rates on LFW and MegaFace datasets while being smaller and faster than previous models | Limited to mobile face recognition tasks |
[1] Boyle, W. S., & Smith, G. E. (1970). Charge Coupled Semiconductor Devices. Bell System Technical Journal, 49(4), 587-593.
[2] Fukushima, K. (1980). Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position. Biological Cybernetics, 36(4), 193-202.
[3] Marr, D., & Poggio, T. (1976). Cooperative computation of stereo disparity. Science, 194(4262), 283-287.
[4] Nayar, S. K., & Nakagawa, Y. (1994). Shape from Interreflection. International Journal of Computer Vision, 14(2), 129-149.
[5] Lowe, D. G. (1999). Object recognition from local scale-invariant features. In Proceedings of the International Conference on Computer Vision (Vol. 2, pp. 1150-1157).
[6] Viola, P., & Jones, M. J. (2001). Rapid object detection using a boosted cascade of simple features. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (Vol. 1, pp. I-511-I-518).
[7] Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems (pp. 1097-1105).
[8] LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436-444.
[9] Gonzalez, R. C., & Woods, R. E. (2018). Digital image processing. Pearson.
[10] Oppenheim, A. V., & Schafer, R. W. (2010). Discrete-time signal processing. Pearson.
[11] Mallat, S. (1999). A wavelet tour of signal processing: the sparse way. Academic Press.
[12] Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep learning. MIT Press.
[13] LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436-444.
[14] Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., ... & Bengio, Y. (2014). Generative adversarial nets. In Advances in Neural Information Processing Systems (pp. 2672-2680).
[15] Sutton, R. S., & Barto, A. G. (2018). Reinforcement learning: An introduction. MIT Press.
[16] Turk, M., & Pentland, A. (1991). Eigenfaces for Recognition. Journal of Cognitive Neuroscience, 3(1), 71-86.
[17] Zhang, J., Shan, S., Kan, M., & Chen, X. (2014). Coarse-to-fine auto-encoder networks for real-time face detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 1-8).
[18] Ren, S., He, K., Girshick, R., & Sun, J. (2015). Faster R-CNN: Towards real-time object detection with region proposal networks. In Advances in Neural Information Processing Systems (pp. 91-99).
[19] Rahman, T., Abdullah, M., & Rahman, M. (2020). Face Recognition System using Convolutional Neural Network. In 2020 4th International Conference on Computing Methodologies and Communication (ICCMC) (pp. 127-131).
[20] Arora, R., Anand, S., & Kumar, N. (2019). Automatic Attendance System Using Face Recognition. In 2019 4th International Conference on Internet of Things: Smart Innovation and Usages (IoT-SIU) (pp. 1-6).
[21] Huang, Z., Yuan, Y., Lu, C., et al. (2021). Online Hard Example Mining for Face Recognition under Occlusion. Neurocomputing, 455, 380-388.
[22] Guo, Y., & Chen, Y. (2021). Robust face recognition using multi-scale face detection and aggregation of multi-modality features. Information Fusion, 68, 201-211.
[23] Dong, L., Zhang, H., Zhang, X., et al. (2021). A face recognition approach based on feature extraction and convolutional neural network. IEEE Access, 9, 13672-13679.
[24] Chen, Z., Liu, Y., & Li, L. (2021). An improved hybrid face recognition algorithm based on feature extraction and deep learning. Neurocomputing, 452, 1-9.
[25] Lu, J., Wu, Y., Hu, W., et al. (2020). Deep Face Lab: A PyTorch Toolbox for Face Analysis. arXiv preprint arXiv:2008.08031.
[26] Zheng, X., Wang, C., Jiang, C., & Xie, X. (2021). A novel mobile face recognition and detection method based on multi-modal dataset. Signal Processing: Image Communication, 93, 116238.
[27] Yang, J., Hu, X., Zhou, Z., & Liu, Z. (2018). Automatic student attendance system using face recognition. In 2018 IEEE 2nd Advanced Information Management, Communicates, Electronic and Automation Control Conference (pp. 178-182).
[28] Schroff, F., Kalenichenko, D., & Philbin, J. (2015). FaceNet: A Unified Embedding for Face Recognition and Clustering. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 815-823).
[29] Zhang, K., Zhang, Z., Li, Z., & Qiao, Y. (2016). Joint Face Detection and Alignment Using Multitask Cascaded Convolutional Networks. IEEE Signal Processing Letters, 23(10), 1499-1503.
[30] Sun et al. (2015). DeepID3: Face Recognition with Very Deep Neural Networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 4528-4536.
[31] Taigman et al. (2014). Learning a Deep Convolutional Network for Face Recognition Using a Single Training Sample per Person. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 473-481.
[32] He et al. (2016). Deep Residual Learning for Image Recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 770-778.
[33] Ouyang et al. (2015). DeepID-Net 2.0: Object Detection with Deformable Part-Based Convolutional Neural Networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39(7), 1339-1352.
[34] Ouyang et al. (2015). DeepID-Net: Deformable Deep Convolutional Neural Networks for Object Detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2403-2412.
[35] Wen et al. (2018). Deep Learning for Face Recognition: A Survey. IEEE Transactions on Pattern Analysis and Machine Intelligence, 41(1), 1-14.
[36] Tan, M., & Le, Q. (2019). EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks. In International Conference on Machine Learning (pp. 6105-6114). https://proceedings.mlr.press/v97/tan19a.html
[37] Deng, J., Guo, J., Xue, N., & Zafeiriou, S. (2019). ArcFace: Additive Angular Margin Loss for Deep Face Recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 4690-4699). https://openaccess.thecvf.com/content_CVPR_2019/html/Deng_ArcFace_Additive_Angular_Margin_Loss_for_Deep_Face_Recognition_CVPR_2019_paper.html
[38] Pervaiz, U., Nazir, M., & Mahmood, A. (2019). Real-time convolutional neural networks for emotion and gender classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops (pp. 0-0). https://openaccess.thecvf.com/content_CVPRW_2019/html/CEFRL/Pervaiz_RealTime_Convolutional_Neural_Networks_for_Emotion_and_Gender_Classification_CVPRW_2019_paper.html
[39] Zhang, J., Wu, S., Zhu, Y., & Kumar, B. V. K. V. (2019). Facial Landmark Detection Using Multi-Scale Residual Network. In 2019 IEEE Winter Conference on Applications of Computer Vision (WACV) (pp. 1048-1056). https://ieeexplore.ieee.org/document/8659167
[40] Zhang, K., Zhang, Z., Li, Z., & Qiao, Y. (2016). Joint face detection and alignment using multitask cascaded convolutional networks. IEEE Signal Processing Letters, 23(10), 1499-1503. https://ieeexplore.ieee.org/document/7553523
[41] Zhang, Y., Liu, F., Chen, Y., Tong, X., & Zhang, L. (2018). Lightweight and efficient convolutional neural networks for mobile face recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops (pp. 2055-2060). https://openaccess.thecvf.com/content_cvpr_2018_workshops/papers/w5/Zhang_Lightweight_and_Efficient_CVPR_2018_paper.pdf
[42] Viola, P., & Jones, M. (2001). Rapid object detection using a boosted cascade of simple features. In Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001 (Vol. 1, pp. I-511). IEEE.
[43] Dalal, N., & Triggs, B. (2005). Histograms of oriented gradients for human detection. In 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05) (Vol. 1, pp. 886-893). IEEE. https://ieeexplore.ieee.org/document/1467360
[44] Schroff, F., Kalenichenko, D., & Philbin, J. (2015). FaceNet: A unified embedding for face recognition and clustering. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 815-823). https://openaccess.thecvf.com/content_cvpr_2015/html/Schroff_FaceNet_A_Unified_2015_CVPR_paper.html
[45] Sun, Y., Wang, X., & Tang, X. (2015). DeepID3: Face Recognition with Very Deep Neural Networks. In Proceedings of the IEEE International Conference on Computer Vision (pp. 3385-3392).
[46] Zhang, Z., Luo, P., Loy, C. C., & Tang, X. (2016). Learning deep representation for face attributes in the wild. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 3730-3738). https://openaccess.thecvf.com/content_cvpr_2016/html/Zhang_Learning_Deep_Representation_CVPR_2016_paper.html
[47] A Fast and Accurate System for Face Detection, Identification, and Verification. K. Zhang, Z. Zhang, Z. Li, and Y. Qiao. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1866-1875,
Copyright © 2023 Shalini Singh, Abhishek Singh Chauhan. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Paper Id : IJRASET51708
Publish Date : 2023-05-06
ISSN : 2321-9653
Publisher Name : IJRASET