Ijraset Journal For Research in Applied Science and Engineering Technology
Authors: Shahid Gulzar, Shorya Dalal, Shivam Toetia, Manoj Kumar Dixit
DOI Link: https://doi.org/10.22214/ijraset.2024.61575
Abstract: In recent times, there has been a surge of interest in employing machine learning techniques for the early and accurate detection of various diseases. This research introduces a holistic approach to constructing a robust multi-disease detection system that utilizes advanced machine learning algorithms. Our proposed system integrates diverse datasets encompassing a range of medical conditions, enabling the simultaneous detection of multiple diseases within a unified framework. We leverage cutting-edge machine learning models, including, but not limited to, [specify the models used], to analyze and interpret intricate patterns within the data. The methodology involves meticulous feature engineering, model training, and validation on a diverse dataset acquired from [describe the data sources]. The system exhibits outstanding accuracy in discerning between different diseases, underscoring its potential as a versatile diagnostic tool. Furthermore, we delve into the interpretability of the model predictions, offering insights into the decision-making process. Validation results showcase a high level of sensitivity and specificity across various diseases, emphasizing the effectiveness of the proposed approach. The system’s performance is systematically compared with existing methods, revealing superior results in terms of both accuracy and efficiency. This research contributes to the ongoing endeavors in developing advanced healthcare solutions by integrating machine learning for the early detection and diagnosis of multiple diseases. The proposed system shows promise in improving clinical decision-making and enhancing patient outcomes.
I. INTRODUCTION
In the dynamic realm of healthcare, the incorporation of cutting-edge technologies has become indispensable for refining diagnostic accuracy and expediting the identification of various medical conditions. Among these technological strides, machine learning stands out as a formidable tool, holding the promise to revolutionize disease detection. This study embarks on the development of a comprehensive multi-disease detection system, leveraging machine learning algorithms to support early and precise diagnoses. Traditional approaches to disease detection often rely on disparate diagnostic tools and methodologies tailored for specific medical conditions. However, the evolving intricacies of healthcare necessitate a more unified and efficient approach. This research addresses this challenge by proposing a holistic system adept at detecting multiple diseases within a single framework. Such an approach not only streamlines the diagnostic process but also fosters a more comprehensive understanding of a patient’s overall health. The advent of machine learning has unlocked unprecedented opportunities to analyze vast and intricate datasets, offering nuanced insights into disease patterns. Harnessing this potential, our multi-disease detection system amalgamates diverse datasets, covering a spectrum of medical ailments. This inclusion of diverse data enables the model to develop a nuanced understanding of the complexities associated with various diseases, ultimately contributing to heightened diagnostic accuracy. The selection of machine learning models plays a pivotal role in the success of the proposed system. In this study, we employ state-of-the-art algorithms, including [list specific models], known for their efficacy in handling complex medical data. By integrating supervised and unsupervised learning techniques, our system endeavors to uncover hidden patterns and correlations, enabling robust disease detection across different domains.
A crucial facet of our approach is the meticulous process of feature engineering. This step involves extracting pertinent information from extensive datasets, facilitating the model in discerning subtle yet crucial indicators of disease. The emphasis on feature engineering underscores our commitment to refining input variables, ensuring that the model is equipped to make informed and accurate predictions. As we navigate through the complexities of developing a multi-disease detection system, interpretability emerges as a paramount concern. Understanding the decisions made by the model is crucial for gaining the trust of healthcare practitioners and ensuring the seamless integration of our system into clinical workflows. This study delves into the interpretability of machine learning models, providing transparency into the decision-making process and fostering a deeper comprehension of diagnostic outcomes.
II. LITERATURE SURVEY
Table 1: Literature Survey Overview

| S. no | Title | Author Name | Year | Key points |
|---|---|---|---|---|
| 1 | Disease Prediction System using naïve bayes. | Sameer Meshram | 2022 | In this paper, he developed a diagnosis system with machine learning algorithms for the prediction of any disease that can help in a much more accurate diagnosis than the traditional method. |
| 2 | An Empirical Evaluation of Machine Learning Techniques for Chronic Kidney Disease Prophecy. | Bilal Khan | 2020 | In this paper, the author employed experiential analysis of ML techniques for classifying the kidney patient dataset as CKD or NOTCKD. |
| 3 | Disease Prediction using Naïve Bayes – Machine Learning Algorithm. | Chandrasekhar Rao Jetti | 2021 | The author developed this work mainly to make doctors’ jobs easier by using a machine to examine a patient at a basic level and recommend diseases that may be present. The system will show the result based on the available accumulated data. |
| 4 | Prediction Support System for Multiple Disease Prediction Using Naïve Bayes Classifier. | Selvaraj | 2021 | In this paper, the author extracted personal data such as user health conditions from day-to-day life. Further, Natural Language was used. |
| 5 | Prediction Of Diabetes Using Machine Learning Classification Algorithms. | A Naveen Kishore G, V. Rajesh, A. Vamsi Akki Reddy, K. Sumedh, T. Rajesh Sai Reddy | 2020 | They have reported the highest accuracy as 74.4% for the classification algorithm Random Forest, and the lowest accuracy in this work is attained by the KNN, reported as 71.3%. |
| 6 | Understanding the Lifestyle of people to identify the reasons of Diabetes using data mining. | Gavin Pinto, Radhika Desai, and Sunil Jangid | 2022 | The authors used Naïve Bayes and SVM classification algorithms on the dataset collected by a survey using Google Forms and reported the accuracy of 64.92 for SVM and 60.44 for Naïve Bayes. |
| 7 | Cardiotocographic diagnosis of fetal health based on multiclass morphologic pattern predictions using deep learning classification. | Miao J.H., Miao K.H. et al. | 2018 | The created model is used to differentiate and categorize the morphologic pattern of individuals suffering from pregnancy complications. |
| 8 | An empirical study of a simple naive bayes classifier based on ranking functions. | Chhogyal and Nayak | 2016 | They have obtained poor accuracy in disease prediction; also, they are not using the standard dataset for training. |
III. OBJECTIVES
IV. PROPOSED WORK
V. METHODOLOGY
First, we define the problem clearly at the outset of the project so that appropriate machine learning models can be constructed. We then gather data from reputable open sources such as Kaggle and the UCI Machine Learning Repository, since data quality and quantity strongly shape the efficacy of the models. Data collection is followed by preprocessing, in which we ensure that the gathered data is in the correct format and analyze it to identify and address duplicate entries, missing values, and outliers. Visualization techniques are used to reveal relationships between variables, extract insights, and address skewness. The data is then divided into training and testing datasets: 80% is allocated for training and the remaining 20% is held out for testing, so that the models are evaluated on data they did not see during training. This process lays the foundation for accurate predictive modeling and lets us draw meaningful conclusions from the available data.
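The preprocessing and splitting steps described above can be sketched as follows. This is a minimal illustration in standard-library Python, not the authors' implementation. The column names (`glucose`, `bmi`, `label`), the sample records, the median imputation, the 1.5×IQR clipping rule, and the helper names `clean_records` and `split_train_test` are all assumptions made for the example, since the paper does not state which cleaning rules were applied. Only the 80/20 ratio comes from the text.

```python
import random
import statistics


def clean_records(rows, numeric_keys):
    """Drop exact duplicates, impute missing numeric values with the
    column median, and clip outliers to the 1.5*IQR fences (assumed rules)."""
    # Remove exact duplicate rows while preserving the original order.
    seen = set()
    unique = []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(dict(row))  # copy so the input is not mutated

    for k in numeric_keys:
        present = [r[k] for r in unique if r[k] is not None]
        median = statistics.median(present)
        q1, _, q3 = statistics.quantiles(present, n=4)
        iqr = q3 - q1
        low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
        for r in unique:
            value = median if r[k] is None else r[k]  # impute missing
            r[k] = min(max(value, low), high)         # clip outliers
    return unique


def split_train_test(rows, test_fraction=0.2, seed=42):
    """Shuffle reproducibly and hold out `test_fraction` of the rows."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical sample data: one duplicate, one missing value, one outlier.
records = [
    {"glucose": 85.0, "bmi": 22.1, "label": 0},
    {"glucose": 90.0, "bmi": 24.3, "label": 0},
    {"glucose": 90.0, "bmi": 24.3, "label": 0},   # duplicate entry
    {"glucose": None, "bmi": 27.5, "label": 1},   # missing value
    {"glucose": 150.0, "bmi": 31.0, "label": 1},
    {"glucose": 95.0, "bmi": 23.8, "label": 0},
    {"glucose": 140.0, "bmi": 29.9, "label": 1},
    {"glucose": 900.0, "bmi": 30.2, "label": 1},  # implausible outlier
    {"glucose": 100.0, "bmi": 25.0, "label": 0},
    {"glucose": 135.0, "bmi": 28.4, "label": 1},
    {"glucose": 88.0, "bmi": 21.7, "label": 0},
]

cleaned = clean_records(records, numeric_keys=["glucose", "bmi"])
train_rows, test_rows = split_train_test(cleaned, test_fraction=0.2)
print(len(cleaned), len(train_rows), len(test_rows))
```

Fixing the shuffle seed keeps the split reproducible between runs. Where pandas and scikit-learn are available, the same steps are usually written with `DataFrame.drop_duplicates`, `DataFrame.fillna`, and `sklearn.model_selection.train_test_split`.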
The project's core objective is to advance the early prediction of diseases and thereby improve patient health outcomes. The focus is on predicting multiple diseases from an analysis of symptoms: the system accepts patient symptoms as input and outputs the predicted disease. This predictive model holds promise for reducing the financial burden of disease management and for expediting recovery, since patients can reduce treatment costs and save valuable time. The emphasis on early detection reflects a proactive approach to healthcare, ultimately contributing to improved patient well-being and resource optimization.
1) Competing Interests: Not Applicable.
2) Funding Information: Not Applicable.
3) Author Contribution:
a) Conceptualization, methodology, software development.
b) Data Collection.
c) Project supervision, writing-review & editing.
d) Literature review, model evaluation, manuscript preparation.
4) Data Availability Statement: The datasets utilized in this study are publicly available on Google and Kaggle platforms.
5) Research Involving Human/Animals: Not Applicable.
6) Informed Consent: Not Applicable.
7) Conflict of Interest Statement: On behalf of all authors, the corresponding author states that there is no conflict of interest.
[1] Sameer Meshram, Shital Dongre, Triveni Fole. "Disease Prediction System using naïve bayes". International Journal for Research in Applied Science & Engineering Technology, Volume 10, Issue XII, Dec 2022.
[2] Bilal Khan, Rashid Naseem, Fazal Muhammad, Ghulam Abbas, and Sung Hwan Kim. "An Empirical Evaluation of Machine Learning Techniques for Chronic Kidney Disease Prophecy". March 30, 2020.
[3] Chandrasekhar Rao Jetti, Rehamatulla Shaik, Sadhik Shaik, Sowmya Sanagapalli. "Disease Prediction using Naïve Bayes – Machine Learning Algorithm". December 2021.
[4] Selvaraj, Mithra MK, Keerthana S, Deepika M. "Prediction Support System for Multiple Disease Prediction Using Naïve Bayes Classifier". International Journal of Engineering and Techniques, Volume 4, Issue 2, Mar-Apr 2021.
[5] Naveen Kishore G, V. Rajesh, A. Vamsi Akki Reddy, K. Sumedh, T. Rajesh Sai Reddy. "Prediction Of Diabetes Using Machine Learning Classification Algorithms".
[6] Gavin Pinto, Sunil Jangid, Radhika Desai. "Understanding the Lifestyle of people to identify the reasons of Diabetes using data mining".
[7] Miao J.H., Miao K.H. Cardiotocographic diagnosis of fetal health based on multiclass morphologic pattern predictions using deep learning classification. Int. J. Adv. Computer Science Appl. 2018; 9:1–11. Doi: 10.14569/IJACSA.2018.090501.
[8] M. Marimuthu, S. Deivarani, R. Gayatri. "Analysis of Heart Disease Prediction using Machine Learning Techniques".
[9] Purushottam, Richa Sharma, Dr. Kanak Saxena. "Efficient Heart Disease Prediction System".
[10] Sriram T.V., Rao M.V., Narayana G.S., Kaladhar D., Vital T.P.R. Intelligent Parkinson disease prediction using machine learning algorithms. Int. J. Eng. Innov. Technol. (IJEIT) 2013; 3:1568–1572.
[11] Amandeep Kaur, Jyothi Arora. "Heart Disease Prediction using data mining Techniques: A survey".
[12] Noreen Fatima, Li Liu, Sha Hong, Haroon Ahmed. "Prediction of Breast Cancer, Comparative Review Of Machine Learning Algorithms and their analysis".
[13] Ch. Shravya, K. Pravallika, Shaik Subhani. "Prediction of Cancer using supervised machine learning Algorithms".
[14] Nikita Rane, Jean Sunny, Rucha Kanade, Sulochana Devi. "Breast Cancer classification and prediction using machine learning".
[15] Kandhasamy J.P., Balamurali S. Performance analysis of classifier models to predict diabetes mellitus. Procedia Computer Sci. 2015; 47:45–51. Doi: 10.1016/j.procs.2015.03.182.
[16] Yahyaoui A., Jamil A., Rasheed J., Yesiltepe M. A decision support system for diabetes prediction using machine learning and deep learning techniques; Proceedings of the 2019 1st International Informatics and Software Engineering Conference (UBMYK); Ankara, Turkey. 6–7 November 2019; pp. 1–4.
[17] Dai X., Spasić I., Meyer B., Chapman S., Andres F. Machine learning on mobile: An on-device inference app for skin cancer detection; Proceedings of the 2019 Fourth International Conference on Fog and Mobile Edge Computing (FMEC); Rome, Italy. 10–13 June 2019; Manhattan, NY, USA: IEEE; 2019. pp. 301–305.
[18] Daghrir J., Tlig L., Bouchouicha M., Sayadi M. Melanoma skin cancer detection using deep learning and classical machine learning techniques: A hybrid approach; Proceedings of the 2020 5th International Conference on Advanced Technologies for Signal and Image Processing (ATSIP); Sfax, Tunisia. 2–5 September 2020; Manhattan, NY, USA: IEEE; 2020. pp. 1–5.
[19] Vidushi A.R., Shrivastava A.K. Diagnosis of Alzheimer disease using machine learning approaches. Int. J. Adv. Sci. Technol. 2019; 29:7062–7073.
[20] Hemdan E.E.D., Shouman M.A., Karar M.E. Covidx-net: A framework of deep learning classifiers to diagnose COVID-19 in X-ray images. arXiv. 2020; 2003.11055.
[21] Sultana Z., Khan M.R., Jahan N. Early breast cancer detection utilizing artificial neural network. WSEAS Trans. Biol. Biomed. 2021; 18:32–42. Doi: 10.37394/23208.2021.18.4.
[22] Mohammed S.A., Darrab S., Noaman S.A., Saake G. Analysis of breast cancer detection using different machine learning techniques; Proceedings of the International Conference on Data Mining and Big Data; Belgrade, Serbia. 14–20 July 2020; Berlin/Heidelberg, Germany: Springer; 2020. pp. 108–117.
[23] Rubin J., Abreu R., Ganguli A., Nelaturi S., Matei I., Sricharan K. Recognizing abnormal heart sounds using deep learning. arXiv. 2017; 1707.04642.
[24] Pingale, K., Surwase, S., Kulkarni, V., Sarage, S. and Karve, A., "Disease prediction using machine learning", International Research Journal of Engineering and Technology (IRJET), Vol. 6, (2019), 831-833. Doi: 10.1126/science.1065467.
[25] Cao, J., Wang, M., Li, Y. and Zhang, Q., "Improved support vector machine classification algorithm based on adaptive feature weight updating in the Hadoop cluster environment", PloS One, Vol. 14, No. 4, (2019), e0215136. https://doi.org/10.1371/journal.pone.0215136.
[26] Chhogyal, K. and Nayak, A., "An empirical study of a simple naive bayes classifier based on ranking functions", in AI 2016: Advances in Artificial Intelligence: 29th Australasian Joint Conference, Hobart, TAS, Australia, December 5-8, 2016, Proceedings 29, Springer, (2016), 324-331.
[27] Kumar, A., Bharti, R., Gupta, D. and Saha, A.K., "Improvement in boosting method by using rustboost technique for class imbalanced data", in Recent Developments in Machine Learning and Data Analytics: IC3 2018, Springer, (2019), 51-66.
Copyright © 2024 Shahid Gulzar, Shorya Dalal, Shivam Toetia, Manoj Kumar Dixit. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Paper Id : IJRASET61575
Publish Date : 2024-05-04
ISSN : 2321-9653
Publisher Name : IJRASET