Ijraset Journal For Research in Applied Science and Engineering Technology
Authors: Tanushka Bansal
DOI Link: https://doi.org/10.22214/ijraset.2022.42925
Artificial intelligence (AI) is widely used in sectors such as agriculture and farming, autonomous flying, security and surveillance, and clinical medicine. Healthcare is one such important sector, where AI-driven innovation is increasing rapidly. Medical facilities need to be advanced so that better decisions can be made about patient diagnosis and treatment options. Machine learning in healthcare helps people process huge and complex medical datasets and analyse them into clinical insights, which physicians can later use to provide suitable medical care. Hence, machine learning and artificial intelligence, when implemented in healthcare, can lead to increased patient satisfaction. Disease prediction using AI is a system that predicts diseases from the symptoms given by patients or any other user. The system processes the input provided by the user, whether an image or symptom details, and outputs the disease with the highest probability. With the growth of biomedical and healthcare data, accurate analysis of medical data benefits early disease detection and patient care. Using this system, we predict diseases such as diabetes, malaria, heart disease and many more.
I. INTRODUCTION
Predicting a disease from a patient's treatment history and medical profile using machine learning techniques has been an ongoing challenge for several decades. Some approaches attempt to predict disease control and progression. Several investigations have selected features automatically from large amounts of data, rather than relying on previously selected features, to improve the accuracy of risk classification. The primary focus here is on the use of artificial intelligence techniques in healthcare to complement patient care and achieve better results. Artificial intelligence models have made it easier to recognise various diseases and diagnose them accurately. The healthcare industry produces large amounts of data every day; combined with treatment history, this data can be used to predict diseases that a patient may experience in the future. The information hidden in health data can then support effective decision-making about the patient's health. Healthcare organisations need to evolve to make better decisions about patient diagnoses and treatment options. Machine learning in healthcare helps people process huge and complex medical datasets and analyse them into clinical insights, which doctors can also use in medical care. Therefore, machine learning, when used in healthcare, can lead to higher patient satisfaction. Machine learning is the field in which a model learns from previous data and experience in order to make predictions. Implemented with suitable predictive machine learning algorithms, healthcare can be made intelligent. In some cases an early diagnosis of a disease is difficult or not possible at all; disease prediction can be applied effectively in such cases.
As the saying goes, "prevention is better than cure": disease prediction should ultimately lead to early prevention of disease. This paper focuses on the development of a system, in effect an immediate medical care platform, that takes the symptoms and other medical data collected from the user and predicts the most probable disease. The model is implemented using the random forest algorithm together with the naive Bayes algorithm, which act on the data entered by the user to produce the result as accurately as possible. Furthermore, a convolutional neural network is used whenever an image or report is supplied.
II. RELATED WORK
The existing system predicts chronic diseases for a selected region and a particular community. Only particular diseases are predicted by this system. In this system, big data and a CNN algorithm are used for disease risk prediction. The authors experiment with modified prediction models over real-life hospital data collected from different medical sites. They propose a convolutional neural network-based multimodal disease risk prediction (CNN-MDRP) algorithm using structured and unstructured data from the hospital.
[2] Ursula Schmidt-Erfurth, Sebastian M. Waldstein, Sophie Klimscha, Amir Sadeghipour, Xiaofeng Hu, Bianca S. Gerenda, Aaron Osborne and Hrvoje Bogunović predicted a chronic disease of the eye called age-related macular degeneration (AMD). They built an AI model that predicts the disease from optical coherence tomography images of the patient. With this model they were able to automate the analysis of imaging biomarkers to predict the disease.
[5] Naresh Kumar, Nripendra Narayan Das, Deepali Gupta, Kamali Gupta and Jatin Bindra built an effective model to predict diseases such as heart disease, diabetes and coronavirus by applying machine learning and artificial intelligence. They built an Android app in which the user enters data, backed by a real-time database, and the results are shown in the app. The analysis was done using logistic regression.
[4] Palak Agarwal, Navisha Shetty, Kavita Jhajharia, Gaurav Aggarwal and Neha V Sharma found linear regression to be the best machine learning technique for predicting life expectancy for various diseases, and for predicting which disease was more prominent in various continents.
III. METHODOLOGY
The first step for any ML model is to find the datasets, which are the backbone of an ML project. With the datasets in place, the next target is to identify the best possible ML algorithms. The algorithms, and their boosted versions, are compared on the basis of their accuracy score and of measures such as the ROC curve. For any ML model, the accuracy score plays a key role in depicting the efficiency of the model.
The goal of the project is to predict nine chronic diseases (coronavirus, coronary heart disease, liver disease, kidney disease, malaria, pneumonia, Parkinson's disease, lung cancer and diabetes) in a person, based on the data provided or the images uploaded by the user. If the user is suffering from a common disease, he/she can enter his/her previous health history in the web application (built with Flask). The output is then one of the common diseases other than the nine chronic diseases, predicted using ML techniques such as random forest and naive Bayes. The ML and deep learning techniques were implemented with Python 3 and Flask and run in Jupyter Notebook 5.5.0 on an Intel(R) Core(TM) i3-2310M CPU @ 2.10 GHz with 8 GB RAM.
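As an illustration of how an ROC curve can be used to compare algorithms, the following is a minimal sketch in plain Python. The function names `roc_points` and `auc`, and the labels and scores in the example, are assumptions made for illustration; they are not taken from the paper's code or data.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) points obtained by sweeping the decision threshold."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    # Sort by descending score so that each step lowers the threshold.
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last sample sharing this score (ties).
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve, computed with the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical labels (1 = diseased) and model scores for six patients.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
curve = roc_points(y_true, scores)
print(curve)
print(auc(curve))
```

An area close to 1 indicates that the model ranks diseased patients above healthy ones almost always, while 0.5 corresponds to random guessing, so the area can serve as a single number for comparing candidate algorithms.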
IV. PROPOSED SYSTEM
In this paper, we use statistical data to determine the major chronic diseases in a person based on his/her symptoms. Most chronic diseases are predicted by our system. It accepts structured data as input to the machine learning model. The system is used by end users, i.e. patients or any other user. The user enters all the symptoms from which he or she is suffering. These symptoms are then given to the artificial intelligence model, the algorithm that gives the best accuracy is applied, and the system predicts the disease on the basis of the symptoms. The random forest classification algorithm is used for disease prediction from symptoms, and a CNN is used for image classification. The end result of this system is the disease predicted by the model.
V. MODEL AND SYSTEM ARCHITECTURE
VI. ML AND DEEP LEARNING TECHNIQUES USED
A. Random Forest Classifier Algorithm
Random forest is a popular supervised machine learning algorithm. It can be used for both classification and regression problems. It is based on the concept of ensemble learning, which combines multiple classifiers to solve complex problems and improve model performance. As the name implies, "Random forest is a classifier that takes a set of decision trees for different subsets of a particular dataset and then takes the average to improve the predictive accuracy of that dataset." Instead of relying on a single decision tree, a random forest collects a prediction from each tree and predicts the final result by majority vote. A greater number of trees generally gives more stable accuracy and reduces the risk of overfitting.
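The bootstrap-and-vote idea can be sketched in a few lines of plain Python. This is a deliberately simplified illustration, not the paper's implementation: each "tree" here is a single-feature decision stump rather than a full decision tree, and the function names and the tiny symptom dataset are hypothetical.

```python
import random
from collections import Counter


def train_stump(rows, labels, feature):
    """Fit a one-feature rule: predict the majority label for each feature value."""
    by_value = {}
    for row, label in zip(rows, labels):
        by_value.setdefault(row[feature], []).append(label)
    default = Counter(labels).most_common(1)[0][0]
    table = {v: Counter(ls).most_common(1)[0][0] for v, ls in by_value.items()}
    # Values never seen in this bootstrap sample fall back to the majority label.
    return lambda row: table.get(row[feature], default)


def train_forest(rows, labels, n_trees=25, seed=0):
    """Train each stump on a bootstrap sample and a randomly chosen feature."""
    rng = random.Random(seed)
    n_features = len(rows[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # sample with replacement
        feature = rng.randrange(n_features)
        forest.append(
            train_stump([rows[i] for i in idx], [labels[i] for i in idx], feature)
        )
    return forest


def predict(forest, row):
    """Final prediction is the label that receives the most votes."""
    votes = Counter(tree(row) for tree in forest)
    return votes.most_common(1)[0][0]


# Hypothetical data: (fever, cough, rash) flags and a made-up disease label.
rows = [(1, 1, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 0, 1), (0, 0, 0)]
labels = ["flu", "flu", "cold", "measles", "measles", "healthy"]
forest = train_forest(rows, labels)
print(predict(forest, (1, 1, 0)))
```

In practice a library implementation with full decision trees would be used; the sketch only shows why averaging many weak, differently trained models gives a more stable prediction than any single one.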
B. Convolutional Neural Network
Convolutional neural networks are one of the main categories of neural network used for image classification and image recognition. Scene labelling, object recognition and face recognition are some of the areas where they are widely used. A CNN takes an image as input and classifies it into a category such as dog, cat, lion or tiger. The computer sees the image as an array of pixels whose size depends on the resolution of the image, represented as h * w * d, where h = height, w = width and d = dimension (number of channels). For example, an RGB image may be a 6 * 6 * 3 array, and a grayscale image a 4 * 4 * 1 array. In a CNN, each input image passes through a series of convolution layers with filters (also known as kernels), pooling layers and fully connected layers. The softmax function is then applied to classify the object with probability values between 0 and 1.
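Two of the building blocks named above, the convolution filter and the softmax function, can be sketched in plain Python. This is an illustrative sketch only: the function names, the 4 * 4 grayscale image, the edge-detecting kernel and the class scores are all assumptions, and a real CNN would learn its kernels and use a deep learning library.

```python
import math


def conv2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and sum products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def softmax(logits):
    """Map raw class scores to probabilities between 0 and 1 that sum to 1."""
    peak = max(logits)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical 4 * 4 * 1 grayscale image and a 3 * 3 vertical-edge kernel.
image = [
    [1, 0, 0, 1],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [1, 0, 0, 1],
]
kernel = [
    [1, 0, -1],
    [1, 0, -1],
    [1, 0, -1],
]
feature_map = conv2d(image, kernel)  # a 2 * 2 feature map
print(feature_map)

# Hypothetical raw scores for three classes from the last layer.
print(softmax([2.0, 1.0, 0.1]))
```

A 3 * 3 kernel applied to a 4 * 4 input without padding yields a 2 * 2 feature map, which is why pooling and further layers operate on progressively smaller arrays.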
C. Naive Bayes Algorithm
Naive Bayes is a supervised learning algorithm based on Bayes' theorem and used for solving classification problems. It is a simple but surprisingly effective algorithm for predictive modelling. The independence assumption, which allows the joint probability to be decomposed into a product of per-feature likelihoods, is what makes it 'naive': the classifier assumes that the presence of a particular feature in a class is unrelated to the presence of any other feature. It is easy to build and useful for large datasets. It is mainly used in text classification, which involves high-dimensional training datasets. The naive Bayes classifier is one of the simplest and most effective classification algorithms, and helps in building fast machine learning models that can make quick predictions. It is a probabilistic classifier, meaning that it predicts on the basis of the probability of an object. Common applications of naive Bayes include spam filtering, sentiment analysis and classifying articles.
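The independence assumption can be made concrete with a small sketch for binary symptom features. The code below is an illustration under stated assumptions, not the paper's implementation: the function names, the use of Laplace (add-one) smoothing and the four-row training set are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Count classes and, per class, how often each binary feature equals 1."""
    class_counts = Counter(labels)
    feature_counts = defaultdict(lambda: [0] * len(rows[0]))
    for row, label in zip(rows, labels):
        for j, value in enumerate(row):
            feature_counts[label][j] += value
    return class_counts, feature_counts


def posterior(model, row):
    """Return P(class | row), multiplying per-feature likelihoods independently."""
    class_counts, feature_counts = model
    total = sum(class_counts.values())
    log_scores = {}
    for label, count in class_counts.items():
        score = math.log(count / total)  # log prior
        for j, value in enumerate(row):
            # Laplace smoothing keeps unseen feature values from giving zero.
            p_one = (feature_counts[label][j] + 1) / (count + 2)
            score += math.log(p_one if value == 1 else 1 - p_one)
        log_scores[label] = score
    peak = max(log_scores.values())
    exp_scores = {label: math.exp(s - peak) for label, s in log_scores.items()}
    norm = sum(exp_scores.values())
    return {label: v / norm for label, v in exp_scores.items()}


# Hypothetical training data: (fever, cough) flags and a made-up label.
rows = [(1, 1), (1, 1), (0, 0), (0, 0)]
labels = ["flu", "flu", "healthy", "healthy"]
model = train_naive_bayes(rows, labels)
print(posterior(model, (1, 1)))
```

Working in log space and normalising at the end avoids numerical underflow when many features are multiplied together, which matters for the high-dimensional datasets mentioned above.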
VII. RESULTS
A. Graph on Accuracy
The following expression is used to calculate the accuracy of each machine learning algorithm, so that the best-performing algorithm can be used to predict the diseases. Here TP, TN, FP and FN are the numbers of true positives, true negatives, false positives and false negatives, and P and N are the total numbers of actual positive and negative cases.
\[ \text{Accuracy} = \frac{TP + TN}{P + N} \tag{i} \]
The following expression is used to calculate the F measure:
\[ \text{F measure} = \frac{2\,TP}{2\,TP + FP + FN} \tag{ii} \]
Further, we also calculate the G mean to measure quality, using the following expression:
\[ \text{G mean} = \sqrt{TPR \times TNR} \tag{iii} \]
The TPR and TNR values are calculated as:
\[ TPR = \frac{TP}{TP + FN} \tag{iv} \]
\[ TNR = \frac{TN}{TN + FP} \tag{v} \]
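The evaluation measures can be computed directly from the four confusion-matrix counts. The sketch below assumes the usual definitions: P = TP + FN, N = TN + FP, and the G mean taken as the geometric mean (square root of the product) of TPR and TNR. The function name and the example counts are hypothetical and do not come from the paper's experiments.

```python
import math


def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, F measure, TPR, TNR and G mean from confusion counts."""
    p = tp + fn  # all actual positives
    n = tn + fp  # all actual negatives
    tpr = tp / p
    tnr = tn / n
    return {
        "accuracy": (tp + tn) / (p + n),
        "f_measure": 2 * tp / (2 * tp + fp + fn),
        "tpr": tpr,
        "tnr": tnr,
        "g_mean": math.sqrt(tpr * tnr),
    }


# Hypothetical confusion-matrix counts for a binary disease classifier.
metrics = classification_metrics(tp=40, tn=45, fp=5, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Unlike accuracy, the G mean drops sharply when either class is predicted poorly, which makes it a useful companion measure on imbalanced medical datasets.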
The results of the CNN model shown in the above graphs indicate that X-ray images and MRI scans can be used to predict whether or not a patient is suffering from a chronic disease. The training accuracy indicates that the model is effective at predicting disease across different datasets.
VIII. FUTURE SCOPE
The main aim of this automated disease analyser using AI is to predict disease on the basis of symptoms or test report images. The system works in two ways: first, it takes the symptoms from which the user suffers as input and outputs a disease prediction; second, it takes test report images, such as MRI reports, X-rays or CT scan reports, as input and outputs a disease prediction. An average prediction accuracy of 100% is obtained. The system provides a user-friendly environment and is easy to use. As it is a web application, the user can use it from anywhere and at any time. In conclusion, for disease risk modelling, the accuracy of risk prediction depends on the diversity of the hospital data. If a more recent dataset is used, the results might change depending on various internal and external factors. Furthermore, the dataset might be restricted to one nation or area only.
REFERENCES
[1] V. Sindhu, S. A. S. Prabha, S. Veni, and M. Hemalatha, “Thoracic surgery analysis using data mining techniques”, International Journal of Computer Technology & Applications, vol. 5, pp. 578-586, May 2014.
[2] Ursula Schmidt-Erfurth, Sebastian M. Waldstein, Sophie Klimscha, Amir Sadeghipour, Xiaofeng Hu, Bianca S. Gerenda, Aaron Osborne and Hrvoje Bogunović, “Prediction of Individual Disease Conversion in Early AMD Using Artificial Intelligence”, Investigative Ophthalmology & Visual Science, vol. 59, pp. 3199-3208, July 2018.
[3] F. Jiang, Y. Jiang, H. Zhi et al., “Artificial intelligence in healthcare: past, present and future”, Stroke and Vascular Neurology, vol. 2, no. 4, pp. 230-243, 2017.
[4] Palak Agarwal, Navisha Shetty, Kavita Jhajharia, Gaurav Aggarwal, Neha V Sharma, “Machine Learning For Prognosis of Life Expectancy and Diseases”, International Journal of Innovative Technology and Exploring Engineering, vol. 8, August 2019.
[5] Naresh Kumar, Nripendra Narayan Das, Deepali Gupta, Kamali Gupta and Jatin Bindra, “Efficient Automated Disease Diagnosis Using Machine Learning Models”, Journal of Healthcare Engineering, vol. 2021, May 2021.
[6] Lavesh S, P Sree Lekha, Naveen Kumar R, Manoj T V, Sushila Shidnal, “Lung Cancer Detection and Life Expectancy Post Thoracic Surgery Using CNN and Supervised Machine Learning Algorithms”, International Research Journal of Engineering and Technology, vol. 07, Apr. 2020.
[7] Abeer S. Desuky and Lamiaa M. El Bakrawy, “Improved Prediction of Post-operative Life Expectancy after Thoracic Surgery”, Advances in Systems Science and Application, vol. 16, Jan. 2016.
[8] M. Zięba, J. M. Tomczak, M. Lubicz, and J. Świątek, “Boosted SVM for extracting rules from imbalanced data in application to prediction of the post-operative life expectancy in the lung cancer patients”, Applied Soft Computing, vol. 14, pp. 99-108, Jan. 2014.
[9] Md. Ahasan Uddin Harun and Md. Nure Alam, “Predicting Outcome of Thoracic Surgery by Data Mining Techniques”, IJARCSSE, vol. 5, no. 1, pp. 7-10, 2015.
[10] Mark A. Hall, “Correlation-based Feature Selection for Discrete and Numeric Class Machine Learning”, International Conference on Machine Learning, pp. 359-366, 2000.
[11] Vishal Gupta, et al., “Performance of Various Feature Selection Techniques under Loaded Networks”, International Journal of Computer Applications, vol. 78, no. 2, September 2013.
[12] S. Vijayarani, S. Dhayanand, “Liver disease prediction using svm and naïve bayes algorithms”, International Journal of Science, Engineering and Technology Research, vol. 4, April 2015.
[13] Y. Khourdifi, M. Bahaj, “Heart disease prediction and classification using machine learning algorithms optimized by particle swarm optimization and ant colony optimization”, Int. J. Intell. Eng. Syst., vol. 12, Oct. 2019.
[14] S. Chae, S. Kwon, D. Lee, “Predicting infectious disease using deep learning and big data”, International Journal of Environmental Research and Public Health, vol. 15, Jul. 2018.
[15] S. Mohan, C. Thirumalai, G. Srivastava, “Effective heart disease prediction using hybrid machine learning techniques”, IEEE Access, Jun. 2019.
[16] T.V. Sriram, M.V. Rao, G.S. Narayana, D. Kaladhar, T.P.R. Vital, “Intelligent parkinson disease prediction using machine learning algorithms”, International Journal of Engineering and Innovative Technology, vol. 3, Sep. 2013.
[17] Kumar, C. Sunil and R. J. Sree, “Application of ranking based attribute selection filters to perform automated evaluation of descriptive answers through sequential minimal optimization models”, ICTACT Journal on Soft Computing: Special Issue on Distributed Intelligent Systems and Applications, vol. 05, Oct. 2014.
[18] Jasmina Novaković, Perica Strbac and Dusan Bulatović, “Toward optimal feature selection using ranking methods and classification algorithms”, Yugoslav Journal of Operations Research, vol. 21, no. 1, pp. 119-135, Mar. 2011.
[19] R.D.H.D.P. Sreevalli, K.P.M. Asia, “Prediction of diseases using random forest classification algorithm”, Zeichen Journal, vol. 6, Apr. 2020.
Copyright © 2022 Tanushka Bansal. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Paper Id : IJRASET42925
Publish Date : 2022-05-19
ISSN : 2321-9653
Publisher Name : IJRASET