Ijraset Journal For Research in Applied Science and Engineering Technology
Authors: Ananya Sharma, Nishtha Jain, Akshit Singh, Shanila , Sunil Kumar
DOI Link: https://doi.org/10.22214/ijraset.2022.42291
The outbreak of COVID-19 has caused enormous damage all around the world. The extremely fast spread of the coronavirus has stalled and endangered human development. Many research efforts have focused on the clinical diagnosis of the disease. Detecting COVID-19 before it becomes dangerous is important, but with growing case numbers this becomes difficult to manage. Antigen tests and Reverse Transcription Polymerase Chain Reaction (RT-PCR) tests have played a major role in diagnosing the disease since the rapid rise of COVID-19. The spread of different variants over these two years has raised questions about the accuracy and precision of these standard tests. Consequently, radiological techniques such as Computed Tomography (CT) scans and X-rays are being used to detect the disease because of their more accurate results. With limited resources, many people faced severe consequences and many lost their lives, so there is an urgent need for a fast solution that enables faster detection and hence faster treatment. This paper investigates the use of Artificial Intelligence to predict COVID-19 infection using an appropriate model. The highest accuracy achieved among the artificial intelligence classifiers used in this paper is about 95%. In addition to detecting chest infection from X-rays, people across the world need an application that tells them whether they may have COVID-19 on the basis of their symptoms. Early-stage detection can be done by studying the symptoms, so a second model is required that detects whether a person is infected based on the symptoms the user enters.
I. INTRODUCTION
COVID-19, caused by SARS-CoV-2 (severe acute respiratory syndrome coronavirus 2), is an urgent global concern. It is highly infectious and spreads rapidly. The WHO declared the disease a global pandemic after it had spread across 100 countries by 2020. Although it affects almost the entire population, young people show symptoms such as headache and fever, while older people show symptoms such as diarrhea, dyspnea and pneumonia, and may die. It is a communicable disease that can easily spread from asymptomatic carriers to the vulnerable population. Many countries introduced isolation measures, including social distancing and lockdowns, as one way to stop the spread. The spread of COVID-19 was very rapid, which put tremendous pressure on medical staff, and RT-PCR kits and other resources were limited. RT-PCR is used for the diagnosis of COVID-19. The test can be done only in laboratories, and its cost is high because of the expensive equipment and the required PCR reagents; this is a significant issue for developing and underdeveloped countries. RT-PCR kits are comparatively costly for disadvantaged communities, and confirming whether a patient is infected takes almost 9-10 hours. RT-PCR tests also offer low sensitivity, which gives a high rate of false-negative results, and in health matters accurate results must be obtained efficiently. There is therefore a pressing need for alternative solutions, chiefly radiological imaging techniques such as X-rays and Computed Tomography (CT), to diagnose COVID-19. In this paper, chest X-rays are preferred over CT scans because X-ray machines are widely available in hospitals and are cheaper than CT scanners.
COVID-19 has characteristic radiological signatures that allow detection via X-rays, specifically of the chest; radiologists need to study the signatures in these X-rays. This is a time-consuming task, however, and in the worst situations, when millions of people were infected, consulting a radiologist was difficult; this could worsen in the coming years if the situation is not controlled. The manual approach to chest X-rays is therefore limited. This paper builds a deep-learning-based model that automatically detects whether a received X-ray belongs to an infected person. Comparative analyses show that the proposed model performs significantly better than the existing models. There is thus a need for a method that can predict whether a person is infected and that is affordable to people worldwide. Detecting COVID-19 from chest X-rays will be especially helpful for underdeveloped and developing countries, where hospitals have few resources and may be unable to purchase costly laboratory kits for a large population or to afford a large number of CT scanners.
This is crucial because there is currently no effective and accurate treatment, and hence an effective diagnosis is required. Along with detecting chest infection from X-rays, this paper includes an additional symptom model, which detects whether a patient is infected based only on the symptoms entered by the user.
II. LITERATURE REVIEW
Researchers have carried out extensive studies on predicting the novel coronavirus disease using Artificial Intelligence. A recent study by Wanng et al. in this field showed that the Semantic Network component used on Inception Net was 82.9% accurate. Another study on detecting the disease used multi-stage classification on X-ray images of COVID patients, people with healthy lungs, and patients recovering from a respiratory disease (pneumonia); this study by Sang et al. used a pre-trained network and reported an accuracy of 93%. A base model named Xray was trained and initially reported an accuracy of 82%; the accuracy was later boosted to 93.5% by combining the base architecture with a pre-trained network. Apart from predicting coronavirus cases, such networks have consistently been used for predicting other health problems; for example, a study by Grawal et al. used r-net, a dense network, which gave an accuracy of 81% in diagnosing brain hemorrhage from computed tomography images. Among the three artificial neural networks used in applications addressing problems faced by medical units, the Convolutional Neural Network has shown the most accurate results. In China, Zho et al. examined approximately 100 cases in detail to find the association between X-ray images and pneumonia caused by COVID. Xu carried out deep-learning-based research to differentiate between COVID, pneumonia and influenza. Amyr carried out a study on diagnosing lung cancer, pneumonia and COVID-infected patients using image classification and segmentation. Using a compressed network model, Paul Cinelli et al. reported a sensitivity of 85%.
Besides X-ray images, J. et al. used ultrasound images of the lungs to explore every possible method of diagnosing and investigating the disease caused by the virus; they used P-Net combined with VG-16 and reported a sensitivity of around 0.96. The sensitivity observed by A et al. was around 96% using a 3-layer neural network and Unet on metadata from 630 computed tomography scans. T.A. et al. used a cyclic learning rate to observe the differences between healthy and infected lung X-ray images. Another study by J. et al. used Net for COVID-19 classification. The study by M.J et al. on diagnosing coronavirus infection from X-ray images showed an accuracy of around 84% using a modified version of the pre-trained VGG-19 model. Researchers have contributed a great deal of work, data and information on COVID-19 over these two years. Another study, by X.H et al., on distinguishing corona-infected patients from healthy patients using a self-adaptive transfer model showed an accuracy of 93%. In predicting the disease, radiological imaging techniques were found to be common. A comparison of X-ray images with computed tomography scans for diagnosing the infection showed that accurate results can be achieved by examining X-ray images, so that the disease may be recognized at an early stage or even without any symptoms.
III. PROPOSED METHODOLOGY
This paper comprises two models: one classifies whether a person has COVID-19 on the basis of symptoms, and the other detects chest infection in a provided X-ray image.
A. Model 1
The model's goal is to classify a given chest X-ray image as normal or COVID-19. This involves two important stages: pre-processing and classification using our own trained model. Because medical applications require correct results, we also train pre-trained CNN architectures and select the best model for prediction in order to improve accuracy. COVID-19 is identified by analyzing the X-ray images. The images are first converted from RGB to grayscale, and the region of interest (ROI) is then identified by removing unwanted parts. Distorted, poor-quality X-ray images as well as good-quality images were used in our testing. If training and testing were carried out only with selected good-quality radiographs, the output accuracy would be higher. This would not represent a realistic scenario, however, where the images are a mixture of good and bad quality, and our model should be robust enough to handle images of any quality. Using images of varying quality therefore allows the performance of the model to be tested in realistic scenarios.
The model for detecting COVID-19 from chest X-rays consists of the following essential steps: data collection, data pre-processing, dataset categorization, model training, and model evaluation and analysis.
1. Data Collection
The dataset required to train and verify the model is collected and categorized; the source of the data is [1]. The data was arranged into train and test sets, each with subfolders for Covid and Normal images. 225 images were taken for each of the Covid and Normal classes. The patients' chest X-rays were obtained and stored in a database, divided into two categories: COVID-19 positive and negative. The database contains a variety of image sizes ranging from 511 × 511 pixels to 658 × 658 pixels. The images are in grayscale and RGB format, and each RGB image is converted into a grayscale image. To convert an RGB image to grayscale, the following relationship is used:
$$I = W_r R + W_g G + W_b B$$
where $W_r$, $W_g$ and $W_b$ are the weights of the red, green and blue channels, with values of approximately 0.3, 0.59 and 0.11 respectively, so that their sum is equal to 1.
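The weighted sum above can be sketched in a few lines of Python. This is only an illustration: it assumes the standard luminance weights 0.299, 0.587 and 0.114 (which sum to 1) and represents an image as nested lists of (R, G, B) tuples; the function name `rgb_to_gray` is hypothetical, and a real pipeline would more likely use an image library.

```python
# Assumed luminance weights (ITU-R BT.601 values); they sum to 1.
W_R, W_G, W_B = 0.299, 0.587, 0.114


def rgb_to_gray(image):
    """Convert an RGB image (rows of (R, G, B) tuples, 0-255) to grayscale."""
    return [
        [round(W_R * r + W_G * g + W_B * b) for (r, g, b) in row]
        for row in image
    ]


# Tiny 1x3 example image: pure red, pure green, white.
sample = [[(255, 0, 0), (0, 255, 0), (255, 255, 255)]]
gray = rgb_to_gray(sample)
```

Green contributes most to the perceived brightness, which is why its weight is the largest of the three.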
Moreover, the images were saved in PNG and JPG formats, with 8-bit (grayscale) and 24-bit (color) bit depths. Because the images had different sizes, formats and bit depths, they were all converted to 224 × 224 pixel, 8-bit grayscale images and saved in PNG format.
This data is used for training. When users upload their own X-rays, these may be in another format and have to be converted to the required size and format.
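As a rough illustration of the size normalization, the sketch below resizes a grayscale image to 224 × 224 pixels with nearest-neighbour sampling. The interpolation method is an assumption (the paper does not state which one was used), and the function name `resize_nearest` is hypothetical.

```python
def resize_nearest(image, out_h=224, out_w=224):
    """Resize a grayscale image (list of rows) with nearest-neighbour sampling."""
    in_h, in_w = len(image), len(image[0])
    return [
        # Map every output coordinate back to the closest source pixel.
        [image[(y * in_h) // out_h][(x * in_w) // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


# A 2x2 image enlarged to 4x4: every source pixel becomes a 2x2 block.
small = [[1, 2], [3, 4]]
enlarged = resize_nearest(small, out_h=4, out_w=4)
```

The same function shrinks larger uploads, for example a 658 × 658 image, to the 224 × 224 input size.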
2. Data Pre-processing
To ensure uniformity, the collected data is resized, converted into a standardized format and normalized to a constant shape. Noisy or deformed pixels are eliminated from every image; image processing is a key step in extracting useful information and achieving correct classification. Areas of interest (AOI), also called regions of interest (ROI), were extracted for training and testing in order to remove text and automatic captions around the images. The AOI on a chest radiograph is the area that mostly encompasses the lung region, so that the necessary information is retained. First, a rectangle is used to define the AOI, and a mask is constructed from the rectangle. The area outside the AOI is then set to 0 using logical indexing, and the extracted region is displayed. The figure below depicts images at various stages of pre-processing.
In the original images, for example, irrelevant symbols (the checkmark in the image of the normal chest X-ray) or text (the letter B in the COVID-19 image) were removed during the ROI step. Because the images used in this study were gathered from various sources, their quality, size and inherent noise vary. The pre-processing methods therefore normalize all images so that they are independent of their source, and the size of the images has no effect on system performance.
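The rectangle-and-mask step can be sketched as follows. This is a minimal illustration, assuming a grayscale image stored as nested lists and a rectangle given by its top-left corner and size; the function name `apply_roi` and its parameters are hypothetical.

```python
def apply_roi(image, top, left, height, width):
    """Keep pixels inside the rectangle and set everything outside it to 0."""
    # Boolean mask: True inside the area of interest, False elsewhere.
    mask = [
        [top <= y < top + height and left <= x < left + width
         for x in range(len(image[0]))]
        for y in range(len(image))
    ]
    # Logical indexing: keep the pixel where the mask is True, else write 0.
    return [
        [pixel if keep else 0 for pixel, keep in zip(row, mask_row)]
        for row, mask_row in zip(image, mask)
    ]


image = [[9, 9, 9], [9, 9, 9], [9, 9, 9]]
roi = apply_roi(image, top=1, left=1, height=1, width=1)
```

Text or markers that lie outside the chosen rectangle are zeroed out and can no longer influence the classifier.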
3. Categorization of the Data
After pre-processing, the data is categorized according to the classes of the model. The dataset used in our experiment consists of two categories: COVID-19 positive chest X-rays and chest X-rays of normal patients. First, the data was divided into folders for training and testing, along with a validation set, which was further divided.
4. Training the Model
In the next step, all models are trained and validated in the same environment and on the same dataset. Pre-trained models were tested first; after validating against these other models, our own CNN model gave better accuracy. The model is described below.
CNN Modeling
CNNs have played a large role in image classification, and they have had a particularly big impact in the medical field. This has opened up new opportunities and made the identification of medical issues much easier, not only through image classification but also through classification of other features. CNNs have also detected the recent COVID-19 disease with high precision.
A Convolutional Neural Network (CNN) is a deep learning network design. It is a deep neural network mostly used to classify images, group them by similarity, and recognize objects in a scene. A CNN is made up of one or more convolutional layers followed by one or more fully connected layers, as in a typical multilayer neural network, and it learns directly from images. A CNN can be trained for image analysis tasks such as classification, target identification, segmentation and image processing.
The whole process consists of three types of layers:
a. Convolutional Layer: A convolutional layer [3] is made up of a number of filters whose parameters must be learned so that the model can be trained effectively. Each filter is convolved with the input volume to generate a neuron activation map. The height and width of the filters are smaller than those of the input volume. The output volume of the convolutional layer is obtained by stacking the activation maps of all filters along the depth dimension. The main objective of the operation is to extract characteristic features, such as edges, from the input image. The number of filters is set to 32 for the first convolutional layer and then increased to 64 and to 128 in the following layers.
b. Pooling Layer: The pooling layer [4], similar to the convolutional layer, is responsible for reducing the spatial size of the convolved feature. This dimensionality reduction lowers the computing power required to process the data. There are two main types of pooling: max pooling and average pooling. Max pooling returns the maximum value from the portion of the image covered by the kernel, whereas average pooling returns the average of all values from that portion. Together, a convolutional layer and a pooling layer form the i-th layer of a CNN. Three max pooling layers are added.
c. Fully Connected Layer: The final layers of a CNN are fully connected layers [4], in which every input from one layer is connected to every activation unit of the next layer.
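To make the convolution and pooling operations concrete, the following sketch applies them to a small 4 × 4 matrix. It is a plain-Python illustration, not the training code: the convolution uses no padding (the model itself is described as using "same" padding), and the names `conv2d_valid` and `pool2d` are hypothetical.

```python
def conv2d_valid(image, kernel):
    """Slide a kernel over the image (no padding) and return the activation map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[y + i][x + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for x in range(out_w)
        ]
        for y in range(out_h)
    ]


def pool2d(feature_map, size=2, mode="max"):
    """Non-overlapping pooling; mode is 'max' or 'average'."""
    reduce = max if mode == "max" else (lambda values: sum(values) / len(values))
    return [
        [
            reduce([feature_map[y + i][x + j]
                    for i in range(size) for j in range(size)])
            for x in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for y in range(0, len(feature_map) - size + 1, size)
    ]


fm = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
max_pooled = pool2d(fm, mode="max")
avg_pooled = pool2d(fm, mode="average")
diagonal_diff = conv2d_valid(fm, [[1, 0], [0, -1]])
```

Both pooling variants halve the height and width of the 4 × 4 map; they differ only in how each 2 × 2 window is summarized.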
5. Hyper Parameters Tuning
In the convolutional layers, the kernel size is 3 × 3, the activation function is ReLU [5], and the padding is "same". The pool size in the max pooling layers is 2 × 2. A dropout layer randomly drops a certain number of neurons in a layer; the percentage of neurons to drop is set by the dropout rate. The first three dropout layers have a rate of 0.25, and the rate for the contraction block is 0.5. Transposed convolutional layers use a kernel size of 3 × 3. Table 1 below shows the layers.
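The effect of these hyperparameters on the feature-map sizes can be traced with a short calculation. The sketch below is an illustration under stated assumptions: a single-channel 224 × 224 input, one convolutional layer per block with 32, 64 and 128 filters, "same" padding, and a 2 × 2 max pooling layer after each convolution. The function name `trace_shapes` is hypothetical, and the actual layer list is the one given in Table 1.

```python
def trace_shapes(height=224, width=224, channels=1, filters=(32, 64, 128),
                 kernel=3, pool=2):
    """Return (layer name, output shape, trainable parameters) per layer."""
    rows = []
    for n_filters in filters:
        # 'same' padding keeps height and width; weights plus one bias per filter.
        params = (kernel * kernel * channels + 1) * n_filters
        rows.append(("conv", (height, width, n_filters), params))
        # 2x2 max pooling halves each spatial dimension and has no parameters.
        height, width, channels = height // pool, width // pool, n_filters
        rows.append(("max_pool", (height, width, channels), 0))
    return rows


layers = trace_shapes()
for name, shape, params in layers:
    print(f"{name:9s} output={shape} params={params}")
```

Under these assumptions the spatial size shrinks from 224 to 112, 56 and finally 28 while the depth grows to 128 channels.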
6. Evaluation on Validation Data
Lastly, we evaluate the trained models based on a few key metrics such as accuracy, recall, precision.
$$\text{Classification accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
Here TP denotes true positives, FP false positives, FN false negatives, and TN true negatives.
In the confusion matrix, COVID-19 positive cases that the model classifies correctly are true positives, and positive cases incorrectly classified as COVID negative are false negatives. Likewise, COVID negative subjects that are classified correctly are true negatives, and negative subjects incorrectly classified as COVID positive are false positives.
An accuracy of 95% is obtained on the training dataset.
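The text names precision and recall as evaluation metrics but gives a formula only for accuracy. For reference, their standard definitions in the same notation are:

```latex
\text{Precision} = \frac{TP}{TP + FP},
\qquad
\text{Recall} = \frac{TP}{TP + FN}
```

Precision is the share of predicted COVID-19 cases that are truly positive, and recall is the share of truly positive cases that the model finds.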
B. Model 2
This model takes symptoms as input from the user and detects whether the user has COVID-19 or not. It complements the X-ray-based detection in the application and will be very helpful in diagnosis. For users with early-stage symptoms and those who have not yet had an X-ray, this can be really beneficial for early detection. The usual symptoms of COVID-19 are high temperature, coughing, cold and scratchiness in the throat. In addition to these, diarrhea, hearing problems, loss of smell, chest pain and a blocked nose may also occur. A major symptom is difficulty in breathing.
1. Data Collection
We created a symptom database in which rules are generated and used as input. This is then used as raw data. Feature selection takes place as part of data pre-processing. The data is split into 70% training data and 30% testing data, commonly known as the train-test split.
Source of data[6]
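A 70/30 train-test split can be sketched with the standard library alone. This is an illustration, not the authors' code: the function name `train_test_split_rows` and the fixed seed are assumptions, and scikit-learn provides an equivalent `train_test_split` helper.

```python
import random


def train_test_split_rows(rows, test_fraction=0.3, seed=42):
    """Shuffle a copy of the rows and split them into (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


rows = list(range(10))
train, test = train_test_split_rows(rows)
```

Shuffling before splitting avoids a test set that contains only the last rows of the file, which may all belong to one class.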
2. Pre-processing
Raw data is converted into processed data using libraries such as pandas, NumPy, matplotlib and seaborn. Pandas and NumPy are used to bring the data into the required tabular format, whereas matplotlib is used to plot the prepared data. The raw data is converted into CSV format and then processed by dropping columns that are not required. The itertools module is imported and its product function is used to generate the final data, which is converted into a dataframe. The symptoms are then divided into separate columns, with the last column holding the COVID result. Unwanted columns are dropped, and duplicate and missing values are removed. The symptoms used are: breathing problem, high temperature, dry cough, scratchiness in the throat, hypertension, recent travel abroad, proximity to a COVID patient, attendance at a large gathering, visits to exposed public places, and family members visiting exposed public places.
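The use of `itertools.product` and the removal of duplicate and missing rows can be sketched as follows. This is a simplified illustration that uses plain dictionaries instead of a pandas dataframe; the three column names are a hypothetical subset of the symptoms, and the helper names `symptom_combinations` and `clean` are assumptions.

```python
from itertools import product

# Hypothetical subset of the symptom columns, kept short for brevity.
SYMPTOMS = ["breathing_problem", "high_temperature", "dry_cough"]


def symptom_combinations(symptoms):
    """One dict per yes/no (1/0) combination of the given symptoms."""
    return [dict(zip(symptoms, values))
            for values in product([0, 1], repeat=len(symptoms))]


def clean(rows):
    """Drop rows with missing values, then drop exact duplicates (order kept)."""
    seen, cleaned = set(), []
    for row in rows:
        key = tuple(row.items())
        if None in row.values() or key in seen:
            continue
        seen.add(key)
        cleaned.append(row)
    return cleaned


combos = symptom_combinations(SYMPTOMS)
```

With n binary symptoms the product yields 2^n rows, so the ten symptoms listed above would give 1024 combinations.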
3. Data Pre-processing
Data processing is the most essential part of any machine learning application. The data must be properly processed before it is passed to the system.
Python's machine learning ecosystem was used for the data processing phase of the proposed model.
The Python libraries and modules used are pandas, NumPy, sklearn, matplotlib and seaborn.
Before training, it is important to understand the data. This can be done in various ways; a good one is to use matplotlib and seaborn to explore the data graphically. The example below shows a graphical representation of the breathing problem among COVID patients and normal subjects. This helps us understand the data and improve it, because we do not want the model to over-learn any one class and overfit or underfit.
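The counts behind such a plot can also be inspected directly. The sketch below tallies how often a symptom co-occurs with each outcome; the rows are made-up illustrative values, not data from the paper, and the column names and the function name `crosstab` are hypothetical.

```python
from collections import Counter


def crosstab(rows, feature, target="covid"):
    """Count rows for every (feature value, target value) pair."""
    return Counter((row[feature], row[target]) for row in rows)


# Made-up rows for illustration only.
rows = [
    {"breathing_problem": 1, "covid": 1},
    {"breathing_problem": 1, "covid": 1},
    {"breathing_problem": 0, "covid": 0},
    {"breathing_problem": 1, "covid": 0},
]
counts = crosstab(rows, "breathing_problem")
```

A strongly unbalanced table is an early warning that the model may simply learn the majority class.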
4. Training the Model
After pre-processing and splitting the dataset into a training set and a testing set, the model is trained.
Logistic Regression Model [7]: Logistic regression is a supervised learning model. The target variable in logistic regression is categorical, for example Yes or No, 0 or 1, True or False. The values may be binary or discrete depending on the input provided.
It is a predictive analysis technique based on the notion of probability.
Logistic regression uses the sigmoid function, also known as the logistic function, to model the data. The function is represented as:
Fun(x) = 1 / (1 + e^(-x))
where Fun(x) = output value between 0 and 1
x = input to the function
e = base of natural logarithm
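The function can be written directly in Python. The version below is a small sketch that splits the computation by the sign of x so that `math.exp` never receives a large positive argument; the function name `sigmoid` is chosen for illustration.

```python
import math


def sigmoid(x):
    """Logistic function 1 / (1 + e^(-x)), written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)  # x < 0, so z lies in (0, 1) and cannot overflow
    return z / (1.0 + z)
```

An input of 0 maps to 0.5, which is the usual decision threshold between the two classes.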
Other models were also tested; with cross-validation, the best results were given by logistic regression.
The system then predicts COVID-19 on the basis of logistic regression. The logistic regression model from sklearn was used. The symptoms were used as independent variables and the COVID column as the dependent variable; the model was then trained on the training data and evaluated on the test data.
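To show what fitting such a model involves, the sketch below trains a logistic regression by plain batch gradient descent on a toy dataset. It is not the sklearn implementation used in the paper, whose solver differs; the data, the learning rate, the number of epochs and the function names are assumptions made for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        # Prediction error (probability minus label) for every sample.
        errors = [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
                  for xi, yi in zip(X, y)]
        w = [wj - lr * sum(e * xi[j] for e, xi in zip(errors, X)) / n
             for j, wj in enumerate(w)]
        b -= lr * sum(errors) / n
    return w, b


def predict(X, w, b, threshold=0.5):
    """Return 1 where the predicted probability reaches the threshold, else 0."""
    return [int(sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= threshold)
            for xi in X]


# Hypothetical toy data: columns are [high_temperature, breathing_problem].
X = [[0, 0], [1, 0], [0, 1], [1, 1]]
y = [0, 0, 1, 1]
w, b = fit_logistic(X, y)
predictions = predict(X, w, b)
```

In this toy set the label follows the second column, so the fitted weight for that column ends up positive while the bias is negative.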
5. Result and Evaluation
On this dataset, the model gave 94% accuracy. A confusion matrix is calculated to analyze the model.
Output:
array([[ 90,  12],
       [ 15, 427]], dtype=int64)
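Calculating accuracy
Applying the accuracy definition from Model 1 to this matrix gives (90 + 427) / (90 + 12 + 15 + 427) × 100 ≈ 95.0%.
The 94% figure quoted here follows from the expression ((90 + 427) − (15 + 12)) / (90 + 427) × 100 ≈ 94.8%, which divides by the number of correct predictions instead of all 544 samples.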
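The metrics can be recomputed from the confusion matrix with a few lines of Python. The sketch assumes the scikit-learn layout, in which rows are true labels and columns are predicted labels in the order [negative, positive], so the matrix reads [[TN, FP], [FN, TP]]; the function name `metrics_from_confusion` is hypothetical.

```python
def metrics_from_confusion(matrix):
    """Accuracy, precision and recall from a 2x2 confusion matrix.

    Assumes rows are true labels and columns are predicted labels,
    ordered [negative, positive], i.e. [[TN, FP], [FN, TP]].
    """
    (tn, fp), (fn, tp) = matrix
    total = tn + fp + fn + tp
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
    }


scores = metrics_from_confusion([[90, 12], [15, 427]])
```

Accuracy depends only on the diagonal of the matrix, so it evaluates to 517 / 544 (about 0.950) regardless of which class is treated as positive; precision and recall do depend on the layout assumption.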
The model was then tested on the test data, and predictions were made on that basis.
6. Web Services
This gives our model a practical purpose. Because the model has to be accessible to many people, it must be easy to use.
A full-fledged website was built with the Django framework to make the model usable by end users.
IV. RESULTS AND DISCUSSION
In this paper we have tried to address the problem of detecting COVID-19 in an effective and affordable way by proposing a deep learning model and a machine learning model to detect and classify COVID-19 cases from X-ray images and symptoms, respectively. The models are automated: no manual feature extraction is needed, features are extracted from the image automatically, and image formats are adjusted automatically. The system performs binary classification with an accuracy of 95% for Model 1 and 94% for Model 2. The performance and accuracy of the model are assessed by expert radiologists and doctors, and the model is ready to be tested on a much larger and broader database with a variety of images from different sources. The application can be used from anywhere and is especially useful in countries badly affected by COVID-19, where it can help overcome the growing shortage of radiologists and, with the symptom checker, support early diagnosis and thus early-stage treatment. Besides detecting COVID-19, the model can also be used in the diagnosis of other chest-related problems, including pneumonia and TB.
A major limitation of this study is the limited number of COVID-19 X-ray images; the more types of images are available, the more capable the system becomes. To make our model more accurate, more data is being collected from local hospitals so that further training can be done with better-quality and more varied images. More study of the symptoms is also needed to train the model better, since understanding the major dependencies is important for accurate results. RT-PCR and antigen tests are used directly to determine whether or not the disease is present, but it is sometimes hard to obtain a reliable result from them. Hence, diagnosis of the disease is time-consuming and sometimes even inaccurate. Various strategies and evaluations are being carried out, and must continue to be carried out, by researchers, academicians, specialists and others around the world for this pandemic-related problem, as shown in the literature review in Section II. Introducing various algorithms for predicting the disease can save time and help healthcare workers save more lives. Predicting the disease using X-ray scans and the algorithms described in this paper is more reliable. For future work on detecting the presence of the disease, the model is accurate and precise when compared with antigen tests and Reverse Transcription Polymerase Chain Reaction (RT-PCR) tests. Difficulties and challenges in biological and medicine-related fields can be addressed with this model using computer data, saving a great deal of laborious work and time.
[1] Joseph Paul Cohen, https://github.com/ieee8023/COVID-chestxray-dataset
[2] Laith Alzubaidi, Jinglan Zhang, Amjad J. Humaidi, Ayad Al-Dujaili, Ye Duan, Omran Al-Shamma, J. Santamaría, Mohammed A. Fadhel, Muthana Al-Amidie & Laith Farhan, Review of deep learning: concepts, CNN architectures, challenges, applications, future directions, https://journalofbigdata.springeropen.com/articles/10.1186/s40537-021-00444-8, Published on 31 March 2021.
[3] Jason Brownlee, A Gentle Introduction to Pooling Layers for Convolutional Neural Networks, https://machinelearningmastery.com/rectified-linear-activation-function-for-deep-learning-neural-networks, Published on 9 January 2019.
[4] Hemant Hari, symptoms and covid presence, Published on May 2022.
5. Fan Wu, Su Zhao, Bin Yu, Yan-Mei Chen, Wen Wang, Zhi-Gang Song, Yi Hu, Zhao-Wu Tao, Jun-Hua Tian, Yuan-Yuan Pei, Ming-Li Yuan, Yu-Ling Zhang, Fa-Hui Dai, Yi Liu, Qi-Min Wang, Jiao-Jiao Zheng, Lin Xu, Edward C. Holmes & Yong-Zhen Zhang, A new coronavirus associated with human respiratory disease in China, https://www.nature.com/articles/s41586-020-2008-3, Published on 03 February 2020.
[5] Dongsheng Ji, Zhujun Zhang, Yanzhong Zhao, and Qianchuan Zhao, Research on Classification of COVID-19 Chest X-Ray Image Modal Feature Fusion Based on Deep Learning, https://www.hindawi.com/journals/jhe/2021/6799202, Published on 25 Aug 2021.
[6] X. Xu, X. Jiang, C. Ma et al., "Deep learning system to screen coronavirus disease 2019 pneumonia," 2020, https://arxiv.org/abs/2002.09334. Academic Editor: Saverio Maietta, Deep Learning in the Detection and Diagnosis of COVID-19, Published on 15 Mar 2021.
[7] Mominul Ahsan, COVID-19 Detection from Chest X-ray Images, Published on 20 February 2021.
[8] Gunther Correia Bacellar, Mallikarjuna Chandrappa, Rajlakshman Kulkarni, COVID-19 Chest X-Ray Image Classification Using Deep Learning, Published on 1 February 2021.
[9] Rachna Jain, Meenu Gupta, Deep learning based detection and analysis of COVID-19 on chest X-ray images, Published on 09 October 2021.
[10] Rabat Yasin, Walaa Gouda, Chest X-ray findings monitoring COVID-19 disease course and severity, Published on 22 September 2020.
[11] Ai T., Yang Z., Hou H., Zhan C., Chen C., Lv W., Tao Q., Sun Z., Xia L., Correlation of chest CT and RT-PCR testing for coronavirus disease 2019 (COVID-19) in China, Published on 26 February 2020.
[12] Daniel Kermany A.S., Goldbaum M., Cai W., Anthony Lewis M., Xia H., Zhang K., Identifying medical diagnoses and treatable diseases by image-based deep learning.
[13] Jain G., Mittal D., Thakur D., Mittal M.K., A deep learning approach to detect Covid-19 coronavirus with X-Ray images, Biocybernetics and Biomedical Engineering, 40 (4) (2020), pp. 1391-1405, Published on 7 September 2020.
[14] Novel Coronavirus – China, World Health Organization, (2020), https://www.who.int/csr/don/12-january-2020-novel-coronavirus-china/en/.
[15] Pillalamarry Mahesh, Yakkala Gnana Prathyusha, Botlagunta Sahithi and S Nagendram, Covid-19 Detection from Chest X-Ray using Convolution Neural Networks, Published on February 2021.
[16] Boran Sekeroglu, Llker Ozsahin, Detection of COVID-19 from Chest X-Ray using Convolution Neural Networks, Published on 18 September 2020.
Copyright © 2022 Ananya Sharma, Nishtha Jain, Akshit Singh, Shanila , Sunil Kumar. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Paper Id : IJRASET42291
Publish Date : 2022-05-06
ISSN : 2321-9653
Publisher Name : IJRASET