Ijraset Journal For Research in Applied Science and Engineering Technology
Authors: Chandan S, Lohith S, Yamini G B, Nithin Gowda, Shruthi N
DOI Link: https://doi.org/10.22214/ijraset.2022.43727
The COVID-19 pandemic has rapidly affected our day-to-day life, disrupting world trade and movement. Wearing a protective face mask has become the new normal. In the near future, many public service providers will ask customers to wear masks correctly in order to use their services. Face mask detection has therefore become a crucial task to help global society. This paper presents a simplified approach to this task using basic machine learning packages such as TensorFlow, Keras and OpenCV. The application of “machine learning” and “artificial intelligence” has become popular within the last decade. Both terms are frequently used in science and media, sometimes interchangeably, sometimes with different meanings. In this work, we specify the contribution of machine learning to artificial intelligence. We review relevant literature and present a conceptual framework which clarifies the role of machine learning in building (artificial) intelligent agents. The proposed method detects the face in the image correctly and then identifies whether or not it has a mask on it. As a surveillance task performer, it can also detect a face along with a mask in motion. The method attains accuracy of up to 95.77% and 94.58% respectively on two different datasets. We explore optimized parameter values using MobileNetV2, a Convolutional Neural Network architecture, to detect the presence of masks correctly without causing overfitting.
I. INTRODUCTION
Coronavirus disease (COVID-19) is an infectious disease caused by a newly discovered coronavirus.
The COVID-19 virus causes mild to moderate respiratory disease in the majority of people who are infected, and they recover without the need for special treatment. Older people, as well as those with underlying medical conditions such as cardiovascular disease, diabetes, chronic obstructive pulmonary disease, and cancer, are more likely to become seriously ill.
The best way to prevent and slow down transmission is to be well informed about the COVID-19 virus, the disease it causes and how it spreads. Protect yourself and others from infection by washing your hands or using an alcohol-based rub frequently and by not touching your face. The COVID-19 virus spreads predominantly through droplets of saliva or discharge from the nose when an infected individual coughs or sneezes, so respiratory etiquette is particularly vital.
Masks are an important tool for preventing transmission and saving lives. They should be part of a comprehensive approach that also includes physical distancing, avoiding crowded, closed and close-contact settings, sufficient ventilation, cleaning hands, and covering sneezes and coughs.
Masks can be used to protect healthy people or to prevent onward transmission, depending on the type.
The COVID-19 pandemic has rapidly affected our day-to-day life, disrupting world trade and movement. Wearing a protective face mask has become the new normal. In the near future, many public service providers will ask customers to wear masks correctly in order to use their services. Face mask detection has therefore become a crucial task to help global society.
II. CONCEPTS TO BE UNDERSTOOD: MACHINE LEARNING
A. What is Machine Learning?
Machine learning is a branch of artificial intelligence (AI) and computer science which focuses on the use of data and algorithms to imitate the way that humans learn, gradually improving its accuracy. Machine learning is an important component of the growing field of data science.
Through the use of statistical methods, algorithms are trained to make classifications or predictions, uncovering key insights within data mining projects.
These insights subsequently drive decision making within applications and businesses, ideally impacting key growth metrics. As big data continues to expand and grow, the market demand for data scientists will increase, requiring them to assist in the identification of the most relevant business questions and subsequently the data to answer them.
B. Machine Learning vs. Deep Learning
The way in which deep learning and machine learning differ is in how each algorithm learns. Deep learning automates much of the feature extraction piece of the process, eliminating some of the manual human intervention required and enabling the use of larger data sets. You can think of deep learning as "scalable machine learning". Classical, or "non-deep", machine learning is more dependent on human intervention to learn. Deep learning (also called deep machine learning) can leverage labeled datasets, also known as supervised learning, to inform its algorithm, but it doesn’t necessarily require a labeled dataset. It can ingest unstructured data in its raw form (e.g. text, images), and it can automatically determine the set of features which distinguish different categories of data from one another.
Unlike classical machine learning, it does not require human intervention to process data, allowing us to scale machine learning in more interesting ways.
We break out the learning system of a machine learning algorithm into three main parts:
Machine learning classifiers fall into three primary categories:
a. Supervised Machine Learning: Supervised learning, also known as supervised machine learning, is defined by its use of labeled datasets to train algorithms to classify data or predict outcomes accurately.
b. Unsupervised Machine Learning: Unsupervised learning, also known as unsupervised machine learning, uses machine learning algorithms to analyze and cluster unlabeled datasets.
c. Semi-supervised Learning: Semi-supervised learning offers a happy medium between supervised and unsupervised learning.
Here are just a few examples of machine learning you might encounter every day:
a. Speech Recognition: It is also known as automatic speech recognition (ASR), computer speech recognition, or speech-to-text, and it is a capability which uses natural language processing (NLP) to process human speech into a written format. Many mobile devices incorporate speech recognition into their systems to conduct voice search (e.g. Siri) or to provide more accessibility around texting.
b. Customer Service: Online chatbots are replacing human agents along the customer journey. They answer frequently asked questions (FAQs) on topics such as shipping, or provide personalized advice, cross-selling products or suggesting sizes for users, changing the way we think about customer engagement across websites and social media platforms. Examples include messaging bots on e-commerce sites with virtual agents, messaging apps such as Slack and Facebook Messenger, and tasks usually done by virtual assistants and voice assistants.
c. Computer Vision: This AI technology enables computers and systems to derive meaningful information from digital images, videos and other visual inputs, and to take action based on those inputs. This ability to provide recommendations distinguishes it from image recognition tasks. Powered by convolutional neural networks, computer vision has applications in photo tagging on social media and in self-driving cars within the automotive industry.
III. SOFTWARES
A. PyCharm
B. Anaconda
C. Incorporated packages
IV. METHODOLOGY
We use a two-stage detector to detect the face and then apply the model. Two-stage detectors follow a long line of reasoning in computer vision for the prediction and classification of region proposals. They first predict proposals in an image and then apply a classifier to these regions to classify potential detections. Various two-stage region proposal models have been proposed by researchers in the past. The region-based convolutional neural network, abbreviated as R-CNN, was described in 2014 by Ross Girshick et al. It may have been one of the first large-scale applications of CNNs to the problem of object localization and recognition.
The model generated state-of-the-art results when tested on benchmark datasets such as VOC-2012 and ILSVRC-2013. R-CNN first uses a selective search technique to extract a set of object proposals, and then uses an SVM (Support Vector Machine) classifier to predict objects and their associated classes. SPPNet modifies R-CNN with an SPP layer: it collects features from the various region proposals and feeds them into a fully connected layer for classification.
Because SPPNet computes the feature maps of the entire image in a single pass, it performs object detection nearly 20 times faster than R-CNN. Fast R-CNN then builds on both R-CNN and SPPNet. To fine-tune the model, it adds a new layer, the Region of Interest (ROI) pooling layer, between shared convolutional layers. It also allows a detector and a regressor to be trained at the same time without changing the network configuration.
The detection process ends when the q key is pressed.
V. ALGORITHM
A. Input
Dataset including faces with and without masks
Output: Categorized image depicting the presence of a face mask
VI. WORKING
A. Data Processing
Data preprocessing converts data from a given format into a more user-friendly and meaningful format. The data can be in any form, such as tables, images, videos, or graphs. This organized information fits an information model or schema and captures the relationships between different entities. The proposed method deals with image and video data using NumPy and OpenCV.
B. Data Visualization
Data visualization is the process of transforming abstract data into meaningful representations, using encodings for knowledge communication and insight discovery. It is helpful for studying a particular pattern in the dataset.
The total number of images in the dataset is visualized for both categories, ‘with mask’ and ‘without mask’. The categories are obtained by listing the directories in the specified data path. The variable categories now looks like: [‘with mask’, ‘without mask’]
Each category is then mapped to its respective label using zip, which returns an iterator of tuples (a zip object) in which the items of each passed iterator are paired together. The mapped variable looks like: {‘with mask’: 0, ‘without mask’: 1}
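The category-to-label mapping described above can be sketched as follows. This is a minimal illustration, assuming the dataset is stored as one sub-directory per category; the function name build_label_map and the sorting step are assumptions added here and are not taken from the original implementation.

```python
import os
import tempfile


def build_label_map(data_path):
    """Map each category sub-directory of data_path to an integer label."""
    # Sorting is an added assumption: os.listdir returns entries in
    # arbitrary order, and sorting keeps the label assignment reproducible.
    categories = sorted(
        name
        for name in os.listdir(data_path)
        if os.path.isdir(os.path.join(data_path, name))
    )
    labels = range(len(categories))
    # zip pairs each category with its label; dict() consumes the zip object.
    return categories, dict(zip(categories, labels))


if __name__ == "__main__":
    # Build a throwaway directory tree that mimics the dataset layout.
    with tempfile.TemporaryDirectory() as root:
        for name in ("with mask", "without mask"):
            os.mkdir(os.path.join(root, name))
        categories, label_map = build_label_map(root)
        print(categories)
        print(label_map)
```

Only directories are treated as categories, so stray files in the data path do not receive a label.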
C. Image Reshaping
The input during classification of an image is a three-dimensional tensor, where each channel has a prominent unique pixel. All the images must have an identical size corresponding to the 3D feature tensor. However, neither the images nor their corresponding feature tensors are usually of the same size. Most CNNs can only accept images of a fixed size. This creates several problems throughout data collection and implementation of the model. However, resizing the input images before feeding them into the network helps to overcome this constraint.
The images are normalized so that the pixel values lie in the range between 0 and 1. As the final layer of the neural network has 2 outputs, with mask and without mask, i.e. it has a categorical representation, the data is converted to categorical labels.
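The normalization and label conversion steps can be illustrated with NumPy. This is a hedged sketch: it assumes the images are 8-bit and have already been resized to a common shape, and the helper names normalize_images and to_one_hot are hypothetical, not part of the original code.

```python
import numpy as np


def normalize_images(images):
    """Scale 8-bit pixel values from [0, 255] to [0.0, 1.0]."""
    return np.asarray(images, dtype=np.float32) / 255.0


def to_one_hot(labels, num_classes=2):
    """Convert integer labels (0 = with mask, 1 = without mask) to one-hot rows."""
    labels = np.asarray(labels, dtype=np.int64)
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(num_classes, dtype=np.float32)[labels]


if __name__ == "__main__":
    # Two tiny 2x2 single-channel "images" stand in for the real dataset.
    batch = np.array([[[0, 255], [128, 64]], [[255, 255], [0, 0]]], dtype=np.uint8)
    print(normalize_images(batch).min(), normalize_images(batch).max())
    print(to_one_hot([0, 1]))
```

The one-hot rows have one column per output of the final layer, which is why two classes give two columns.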
D. Building the model using CNN architecture
CNNs have become dominant in a variety of computer vision tasks. The current method makes use of a Sequential CNN. The first convolution layer is followed by Rectified Linear Unit (ReLU) and max pooling layers.
The convolution layer learns 128 filters. The kernel size is set to 7 x 7, which specifies the height and width of the 2D convolution window. As the model must know the shape of the expected input, the first layer in the model needs to be given the input shape; the following layers can infer their shapes automatically. The default padding is “valid”, in which the input volume is not zero-padded and the spatial dimensions are allowed to shrink. The activation parameter of the Conv2D class is set to “relu”. ReLU is a nearly linear function that retains the properties that make linear models easy to optimize with gradient-descent methods. In terms of performance and generalization in deep learning, it compares favorably with other activation functions. Max pooling is used to reduce the spatial dimensions of the output volume; pool_size is set to 7 x 7. To reduce overfitting, a Dropout layer with a 50% chance of setting inputs to zero is added to the model. The final (Dense) layer, with two outputs for the two categories, uses the softmax activation function.
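To make the effect of “valid” padding and pooling concrete, the following sketch traces the tensor shape through one convolution and one max pooling layer with the settings given above (128 filters, 7 x 7 kernel, 7 x 7 pool). The 100 x 100 input size is purely an assumption for illustration, as is the pooling stride, which is taken to equal the pool size (the Keras MaxPooling2D default).

```python
def valid_output_size(size, window, stride=1):
    """Output length along one axis for a window applied without padding."""
    return (size - window) // stride + 1


def trace_shapes(height, width, filters=128, kernel=7, pool=7):
    """Return the shapes after the convolution, the pooling, and flattening."""
    # 'valid' convolution with stride 1 shrinks each axis by kernel - 1.
    conv_h = valid_output_size(height, kernel)
    conv_w = valid_output_size(width, kernel)
    # Max pooling is assumed to use a stride equal to the pool size.
    pool_h = valid_output_size(conv_h, pool, stride=pool)
    pool_w = valid_output_size(conv_w, pool, stride=pool)
    return {
        "conv": (conv_h, conv_w, filters),
        "pool": (pool_h, pool_w, filters),
        "flatten": pool_h * pool_w * filters,
    }


if __name__ == "__main__":
    # Hypothetical 100 x 100 input image.
    print(trace_shapes(100, 100))
```

For the assumed 100 x 100 input, the convolution yields 94 x 94 x 128 and the pooling reduces this to 13 x 13 x 128, which shows how quickly a 7 x 7 pool shrinks the feature maps.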
E. Splitting the data and training the CNN model
After setting the blueprint to analyze the data, the model is trained on one dataset and then tested against a different one. A proper model and an optimized train_test_split help to produce accurate predictions. The test_size is set to 0.2, i.e. 80% of the dataset is used for training and the remaining 20% for testing. The validation loss is monitored using ModelCheckpoint. Next, the images in the training set and the test set are fitted to the Sequential model. Here, 20% of the training data is used as validation data. The model is trained for 20 epochs (iterations), which maintains a trade-off between accuracy and the chance of overfitting.
With our deep learning models now in memory, the next step is to load and pre-process an input image. Upon loading the image from disk, we make a copy and grab the frame dimensions for future scaling and display purposes. Pre-processing is handled by OpenCV’s blobFromImage function: we resize the image to 224×224 pixels and perform mean subtraction.
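The resulting split sizes can be worked through with a small stand-alone sketch. It does not reproduce the library routines used for the actual split; split_indices is a hypothetical helper, and the shuffling seed and the choice of holding out the tail of the training portion for validation are assumptions.

```python
import random


def split_indices(num_samples, test_size=0.2, validation_split=0.2, seed=0):
    """Split sample indices into train, validation and test lists."""
    indices = list(range(num_samples))
    random.Random(seed).shuffle(indices)

    # First hold out the test portion (20% of the whole dataset).
    num_test = round(num_samples * test_size)
    test = indices[:num_test]
    train_full = indices[num_test:]

    # Then hold out 20% of the remaining training data for validation.
    num_val = round(len(train_full) * validation_split)
    cut = len(train_full) - num_val
    return train_full[:cut], train_full[cut:], test


if __name__ == "__main__":
    train, val, test = split_indices(1000)
    print(len(train), len(val), len(test))
```

With 1000 images, 200 are reserved for testing, 160 of the remaining 800 are used for validation, and 640 are used to fit the weights, so only 64% of the dataset directly drives training.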
Once we know where each face is predicted to be, we ensure that the detection meets the confidence threshold before we extract the face ROI. We then compute the bounding box for that face and ensure that the box falls within the boundaries of the image.
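The thresholding and bounding box steps can be sketched in plain Python. The sketch assumes the face detector returns a confidence score and box corners expressed as fractions of the frame size; the 0.5 threshold and the function names are assumptions for illustration only.

```python
def scale_and_clip_box(box, frame_width, frame_height):
    """Convert a fractional (start_x, start_y, end_x, end_y) box to pixels."""
    start_x = int(box[0] * frame_width)
    start_y = int(box[1] * frame_height)
    end_x = int(box[2] * frame_width)
    end_y = int(box[3] * frame_height)
    # Clip so that the box stays inside the image boundaries.
    start_x, start_y = max(0, start_x), max(0, start_y)
    end_x = min(frame_width - 1, end_x)
    end_y = min(frame_height - 1, end_y)
    return start_x, start_y, end_x, end_y


def filter_detections(detections, frame_width, frame_height, min_confidence=0.5):
    """Keep detections above the threshold and return their pixel boxes."""
    boxes = []
    for confidence, box in detections:
        if confidence < min_confidence:
            continue  # weak detection, skip it
        boxes.append(scale_and_clip_box(box, frame_width, frame_height))
    return boxes


if __name__ == "__main__":
    detections = [
        (0.98, (0.25, 0.25, 0.5, 0.75)),    # confident, fully inside the frame
        (0.80, (0.75, -0.25, 1.25, 0.5)),   # confident, partly outside the frame
        (0.30, (0.1, 0.1, 0.2, 0.2)),       # below the threshold
    ]
    print(filter_detections(detections, frame_width=400, frame_height=300))
```

Clipping matters because slicing the frame with out-of-range coordinates would produce an empty or misaligned face ROI.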
Next, we run the face ROI through our mask-net model. In this block, we extract the face ROI using NumPy slicing, pre-process the ROI the same way we did during training, and perform mask detection to predict with mask or without mask.
From here, we annotate and display the result. First, we determine the class label based on the probabilities returned by the mask detector model and assign an associated color for the annotation: green for with mask and red for without mask. We then draw the label text (including class and probability), as well as a bounding box rectangle for the face, using OpenCV drawing functions.
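The label and color selection can be sketched as a small function. It assumes the mask detector returns two probabilities, one per class, and that colors are given in the blue-green-red (BGR) channel order used by OpenCV drawing functions; the function name and the label text format are hypothetical.

```python
GREEN = (0, 255, 0)  # BGR order
RED = (0, 0, 255)    # BGR order


def annotate_prediction(mask_prob, without_mask_prob):
    """Return the annotation text and box color for one face."""
    if mask_prob > without_mask_prob:
        label, color = "with mask", GREEN
    else:
        label, color = "without mask", RED
    # Include the probability of the winning class in the label text.
    confidence = max(mask_prob, without_mask_prob) * 100
    return "{}: {:.2f}%".format(label, confidence), color


if __name__ == "__main__":
    print(annotate_prediction(0.75, 0.25))
    print(annotate_prediction(0.25, 0.75))
```

The returned text and color would then be passed to the drawing calls that render the label and the rectangle on the frame.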
VII. EXPECTED RESULT
As the graph obtained from training the model shows, the accuracy of the model increases as the number of epochs increases. When the model is used, it predicts whether or not a person is wearing a mask with an accuracy of up to 95%.
In the figure below, the bounding box and its annotation are visible, and the model reports whether or not the person is wearing a mask together with the accuracy of the prediction.
A. Application
VIII. FUTURE OBJECTIVE
Integrate the proposed model with public CCTV cameras to detect people not wearing masks in public places. This requires a wide-ranging dataset to train the model. Integrate the proposed model with thermal sensor cameras to detect both the temperature of the person in the frame and whether or not the person is wearing a mask.
The pandemic is not yet over, but the government has lifted the lockdown, and schools, colleges and offices will soon reopen. In these times we have to constantly monitor whether everybody is wearing a mask and practicing social distancing. With a few modifications, the proposed project can help battle COVID-19 by monitoring people in public places and thus reduce the spread of this dangerous virus.
Copyright © 2022 Chandan S, Lohith S, Yamini G B, Nithin Gowda, Shruthi N. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Paper Id : IJRASET43727
Publish Date : 2022-06-02
ISSN : 2321-9653
Publisher Name : IJRASET