Ijraset Journal For Research in Applied Science and Engineering Technology
Authors: Prof. M. S. Namose, Payal Gaikwad, Isha Makhija, Pratik Shelke, Gaurav Solankar, Shlok Salvi
DOI Link: https://doi.org/10.22214/ijraset.2023.56612
In this study, we focus on cardiovascular disease, a major worldwide cause of mortality. Researchers use machine learning and data analysis techniques to improve the diagnosis of this disease. We introduce a new model, the Quine McCluskey Binary Classifier (QMBC), which combines seven different models to effectively identify patients with heart disease. To enhance performance, we employ feature selection and extraction methods. First, we identify the top 10 relevant features from the dataset using Chi-square and ANOVA approaches. We then reduce the dimensionality of the data with principal component analysis, retaining nine components. The QMBC model combines the outputs of the seven models to create a straightforward rule for predicting heart disease. The outputs of the seven models are treated as independent features, while the target attribute depends on those outputs. Our proposed QMBC model outperforms existing methods, establishing its effectiveness in heart disease prediction.
I. INTRODUCTION
Heart disease encompasses various disorders affecting the heart and blood vessels, resulting in millions of deaths annually. Diagnosis traditionally relies on medical history and tests. Machine learning (ML) has become important for early disease detection, offering a way to identify patterns in data. Hybrid models combine feature selection/extraction with classifiers to improve accuracy. Heart disease datasets often contain irrelevant or redundant features that reduce system performance, so dimensionality reduction techniques are required. Ensemble methods, which combine predictions from multiple classifiers, have proven effective. This study proposes an ML model that predicts heart disease diagnosis using logistic regression (LR), decision tree (DT), random forest (RF), k-nearest neighbors (KNN), naive Bayes (NB), support vector classifier (SVC), and multilayer perceptron (MLP) models. Feature selection/extraction techniques (Chi-square, ANOVA, and PCA) are used to optimize model efficiency. The Quine McCluskey Binary Classifier (QMBC), an ensemble approach, combines the individual model outputs to make predictions. The effectiveness of this approach is evaluated on three datasets and compared with existing techniques. The main contributions are: preprocessing the data and optimizing computation time using feature selection (FS) and feature extraction (FE) techniques (Chi-square, ANOVA); applying the PCA FE technique to extract principal components; introducing the QMBC ensemble method for heart disease prediction; and evaluating QMBC on benchmark datasets, where it outperforms existing models in accuracy, precision, recall, specificity, and F1-score.
In addition to the aforementioned improvements in heart disease prediction using the Quine McCluskey Binary Classifier (QMBC), this study also investigates the practical implications and potential clinical applications of the developed model. By achieving higher accuracy, precision, recall, specificity, and F1-score, the QMBC model presents an opportunity for early and accurate heart disease diagnosis. The impact of such a predictive model in the medical field extends beyond research, as it can be integrated into clinical practice for risk assessment and timely intervention. Moreover, the study delves into the interpretability of the QMBC model, shedding light on the key features and factors contributing to heart disease prediction. This interpretability can empower healthcare practitioners with valuable insights, enabling them to better understand the factors driving diagnoses and facilitating more personalized patient care. By enhancing transparency and interpretability, the QMBC model can bridge the gap between data-driven insights and medical decision-making.
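The ANOVA-based feature selection mentioned in this introduction can be illustrated with a small sketch. The code below ranks numeric features by their one-way ANOVA F statistic against the class label and keeps the top k. It is only an illustration under stated assumptions: the function names `anova_f_score` and `top_k_features`, the feature names, and the toy values are hypothetical and are not taken from the study or its datasets.

```python
from statistics import mean

def anova_f_score(values, labels):
    """One-way ANOVA F statistic for one numeric feature against class labels."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    n, k = len(values), len(groups)
    grand = mean(values)
    # Between-group sum of squares: distance of each class mean from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())
    # Within-group sum of squares: spread of samples around their own class mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))

def top_k_features(columns, labels, k):
    """Return the names of the k features with the highest F score."""
    scores = {name: anova_f_score(vals, labels) for name, vals in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Hypothetical toy data, not taken from the Cleveland, CVD, or HD datasets.
columns = {
    "feature_a": [1.0, 1.2, 0.9, 3.1, 3.0, 2.8],  # separates the two classes well
    "feature_b": [5.0, 4.0, 6.0, 5.0, 4.0, 6.0],  # identical in both classes
}
labels = [0, 0, 0, 1, 1, 1]
selected = top_k_features(columns, labels, k=1)
```

With the study's setting, k would be 10. The sketch assumes at least two classes and more samples than classes; a production pipeline would more likely rely on an established statistics library than on hand-written scoring.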
II. OBJECTIVES
III. LITERATURE SURVEY
In our manuscript, we introduce a novel method for heart anomaly detection from ECG data. Our approach comprises signal pre-processing, feature extraction, model training, and calibration. We use 110 features to train five models on three datasets, achieving strong performance and generalizability across different conditions and patient characteristics. We can detect various heart abnormalities, calibrate our models for reliability, and provide real-time predictions, making them a valuable tool for patient care.[1]
2. Disadvantages
A. Algorithm used:- XGBoost Algorithm
This study predicts heart failure using machine learning on a dataset of 1025 patient records. A new PCHF feature selection method enhances performance by choosing the top eight features. Various machine learning techniques were compared, and a decision tree achieved 100% accuracy with a very short runtime of 0.005 seconds. Cross-validation confirmed the model's performance, showing that this method outperforms existing studies and can be applied broadly for heart failure detection.[2]
1. Advantage
2. Disadvantages
B. Algorithm used:- cross-validation technique
Deep learning is highly effective for heart disease diagnosis and prediction, outperforming other methods. We plan to improve our approach by incorporating image data from patient exams and applying Convolutional Neural Networks (CNN) for automatic feature detection. We will also use performance metrics such as the confusion matrix and PR/ROC curves for evaluation. Additionally, we will explore combining structured and unstructured data to enhance the CNN model's accuracy in predicting heart disease.[3]
2. Disadvantages
C. Algorithm used:- Keras-based deep learning model to compute results with a dense neural network.
This paper develops a hybrid intelligent machine learning approach for predicting mortality during follow-up in heart disease cases. Various algorithms are tested, with Random Forest Classifier and Decision Tree Classifier showing high accuracy (100% and 99.76%) for specific datasets when used with feature selection (SFS). The study emphasizes the importance of feature selection in enhancing accuracy and reducing computation time. It aims to create a framework for predicting disease occurrences, improving heart disease analysis, and enhancing decision support systems.[4]
2. Disadvantages
D. Algorithm used:- LDA, RF, GBC, DT, SVM, and KNN
The study introduces a new feature selection algorithm, IHDSSO, which, when combined with a random forest classifier, achieves over 98.38% accuracy in predicting Ischemic heart disease using key features such as 'Cp,' 'restecg,' 'old peak,' 'Ca,' and 'thal.' It has promising potential for healthcare applications, but there's room for improvement in convergence accuracy and speed.[5]
2. Disadvantages
E. Algorithm used:- Random Forest Classifier
The research focuses on improving the accuracy of heart disease prediction using machine learning. It uses the Relief feature selection algorithm and a large dataset, achieving a 99.05% accuracy with 10 features. Future goals include generalizing the model, making it robust against missing data, and exploring Deep Learning algorithms for application. The primary aim is to create an easily implementable model for practical settings.[6]
2. Disadvantages
F. Algorithm used:- Decision Tree Bagging Method (DTBM), Random Forest Bagging Method (RFBM), K-Nearest Neighbors Bagging Method (KNNBM), AdaBoost Boosting Method (ABBM), and Gradient Boosting Boosting Method (GBBM)
This study explores early-stage risk prediction of non-communicable diseases (NCDs) through wearable technology in healthcare. We reduce ML pre-processing by using verified medical training data. We introduce a novel method for creating dynamic test datasets from IoT sensor data. This enables machine learning algorithms to achieve over 94% accuracy. The framework is particularly effective for predicting diabetes and can be extended to other NCDs like stroke or thyroid with proper epidemiological data.[7]
1. Advantages
2. Disadvantages
G. Algorithm used:- Random Forest
The passage discusses the need for electronic health records (EHRs) to improve patient care, emphasizes the importance of data sharing and integration, and introduces a privacy-preserving model for predicting heart disease from distributed patient data.[8]
2. Disadvantages
H. Algorithm used:- Naïve Bayes classification
The study assesses machine learning models on three datasets, and a novel Quine McCluskey Binary Classifier (QMBC) outperforms existing methods. QMBC, combined with ANOVA and PCA FE, achieves high accuracy, precision, recall, and F1-score on all datasets. Future work includes addressing imbalanced datasets and exploring deep learning methods for heart disease prediction.[9]
2. Disadvantages
I. Algorithm used:- Quine McCluskey Binary Classifier (QMBC)
The study presents the MaLCaDD framework for early cardiovascular disease prediction. It uses K-NN due to its simplicity, handles non-normally distributed data with non-parametric tests, and maintains manageable computational complexity. The framework consists of four phases, achieving high accuracy with reduced features.[10]
2. Disadvantages
J. Algorithm used:- Logistic Regression and K-Nearest Neighbor (KNN) classifiers
III. IMPLEMENTATION DETAILS OF MODULE
The Quine McCluskey Binary Classifier (QMBC) is an ensemble method built on the Quine-McCluskey technique for minimizing Boolean functions. It treats the binary outputs of the seven base models as independent input features and derives a simplified rule that maps those outputs to the final heart disease prediction. Feature selection and extraction are applied beforehand so that the base models are trained on a reduced set of features.
Here is a high-level overview of the system architecture for predicting heart disease using the QMBC algorithm:
2. Data Preprocessing
3. Feature Selection
4. Data Splitting
5. Model Building
6. Hyperparameter Optimization
7. Model Training
8. Model Evaluation
9. Model Testing
10. Deployment
11. Continuous Monitoring and Maintenance
QMBC is a comparatively new ensemble technique, so its use in medical predictive modeling should be validated carefully. Traditional machine learning algorithms and techniques, along with a solid understanding of medical data and domain knowledge, should also be integrated into the system architecture to ensure robust and reliable heart disease prediction. Additionally, all relevant privacy and regulatory requirements must be met when working with medical data.
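The abstract describes QMBC as combining the outputs of the seven models into a single rule, with those outputs acting as input features. A minimal sketch of that idea, using Quine-McCluskey merging on base-model outputs, might look as follows. It is not the authors' implementation. It makes several assumptions: three hypothetical base models are used instead of seven to keep the example short, an output combination counts as positive when most training rows with that combination are positive, combinations never seen in training are predicted as 0, and the final minimal-cover selection of Quine-McCluskey is skipped (the OR of all prime implicants is logically equivalent when no don't-care terms are used).

```python
from itertools import combinations

def combine(a, b):
    """Merge two implicants that differ in exactly one fixed bit, else return None."""
    diff = [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
    if len(diff) == 1 and "-" not in (a[diff[0]], b[diff[0]]):
        i = diff[0]
        return a[:i] + "-" + a[i + 1:]
    return None

def prime_implicants(minterms):
    """Quine-McCluskey merging: combine implicants until no pair can be merged."""
    current, primes = set(minterms), set()
    while current:
        merged, used = set(), set()
        for a, b in combinations(sorted(current), 2):
            c = combine(a, b)
            if c is not None:
                merged.add(c)
                used.update((a, b))
        primes |= current - used  # implicants that merged with nothing are prime
        current = merged
    return primes

def fit_rule(model_outputs, targets):
    """Learn a Boolean rule from rows of base-model predictions (0/1) and labels."""
    votes = {}
    for row, y in zip(model_outputs, targets):
        key = "".join(str(bit) for bit in row)
        pos, total = votes.get(key, (0, 0))
        votes[key] = (pos + y, total + 1)
    # Assumption: a combination is a minterm when most rows with it are positive.
    minterms = [key for key, (pos, total) in votes.items() if 2 * pos > total]
    return prime_implicants(minterms)

def predict(rule, row):
    """Predict 1 when any implicant in the rule matches the base-model outputs."""
    return int(any(all(p in ("-", str(bit)) for p, bit in zip(imp, row))
                   for imp in rule))

# Hypothetical outputs of three base models (one column per model) and true labels.
outputs = [(1, 1, 0), (1, 1, 1), (1, 0, 1), (0, 1, 1),
           (0, 0, 1), (0, 1, 0), (1, 0, 0), (0, 0, 0)]
targets = [1, 1, 1, 1, 0, 0, 0, 0]
rule = fit_rule(outputs, targets)
```

On this toy data the learned rule reduces to "at least two of the three models predict disease", which shows how the minimized expression can remain readable. How the original QMBC handles unseen or conflicting combinations is not stated in this text, so the choices above should be read as placeholders.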
A. Algorithm
IV. RESULT
After some data preparation, the researchers used the preprocessed data to train and test seven different machine learning models on three different datasets: Cleveland, CVD, and HD (comprehensive), with an 80:20 ratio, meaning that 80% of the data was used for training and 20% for testing. They then presented the results in tables and bar charts.
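The 80:20 split described above can be sketched in a few lines. The helper name `train_test_split`, the fixed seed, and the integer stand-in records are assumptions made for illustration; the text does not say whether the original split was shuffled or stratified by class.

```python
import random

def train_test_split(rows, test_ratio=0.2, seed=42):
    """Shuffle a copy of the rows and split them into training and test sets."""
    shuffled = list(rows)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical stand-in for patient records.
records = list(range(100))
train, test = train_test_split(records)
```

For imbalanced medical data, a stratified split that preserves the class ratio in both parts is often preferable to the plain shuffle shown here.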
For the Cleveland dataset, when they did not apply feature selection or feature extraction (FS & FE), the Logistic Regression (LR) model had the highest accuracy at 86.88%, and it also had the highest precision at 82.92%. On the other hand, the Naive Bayes (NB) model achieved the highest recall at 100%. However, the newly proposed QMBC method performed best in terms of specificity and F1-score, achieving 78.59% and 78.68%, respectively.
For the CVD dataset, the QMBC method achieved an accuracy of 80.59%, precision of 96.19%, recall of 63.95%, specificity of 97.44%, and an F1-score of 76.82%.
For the HD dataset (comprehensive), the QMBC method achieved an accuracy of 91.52%, precision of 100%, recall of 84.44%, specificity of 100%, and an F1-score of 91.59%.
To summarize, the QMBC method outperformed the other models on specific performance metrics for these datasets, including accuracy, precision, recall, specificity, and F1-score. In simple terms, it did well at correctly identifying cases of interest and at providing accurate predictions for the given datasets.
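For reference, the five metrics reported above all follow from the confusion-matrix counts. The sketch below computes them and, as a consistency check, recomputes the F1-score from the precision and recall reported for the CVD dataset. The function names and the example counts are hypothetical; only the 96.19% precision and 63.95% recall come from the text.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision, recall, specificity, and F1 from counts.

    Assumes every denominator is non-zero (both classes occur and are predicted).
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "specificity": tn / (tn + fp),
        "f1": f1_from(precision, recall),
    }

# Hypothetical confusion-matrix counts, not taken from the study.
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)

# Consistency check using the precision and recall reported for the CVD dataset.
cvd_f1 = f1_from(0.9619, 0.6395)
```

Rounded to two decimals, `cvd_f1 * 100` gives 76.82, which matches the F1-score reported for the CVD dataset.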
1) Objective: The study aims to assess the performance of seven standalone machine learning models and a voting classifier on three datasets: the Cleveland dataset, cardiovascular dataset, and HD dataset (Comprehensive).
2) Data Preprocessing: All datasets are preprocessed to make them suitable for machine learning models. This preprocessing involves reducing dimensions, improving computational speed, and eliminating irrelevant and duplicate features from the data.
3) Techniques Used: The study employs Chi-Square and ANOVA techniques for dimension reduction and Principal Component Analysis Feature Extraction (PCA FE) to optimize the data.
4) Novel Ensemble Model - QMBC: The study introduces a novel Quine McCluskey Binary Classifier (QMBC) that combines the predictions of seven standalone machine learning models to predict the presence of heart disease.
5) Outstanding Performance: The QMBC model, particularly when fused with ANOVA and PCA FE techniques, outperforms existing models and methodologies. It achieves high accuracy, precision, recall, and F1-score on the Cleveland dataset, cardiovascular dataset, and HD dataset (Comprehensive).
6) Results: For the Cleveland dataset, the QMBC model achieves an accuracy of 98.36%, precision of 100%, recall of 97.22%, specificity of 100%, and an F1-score of 98.59%. On the CVD dataset, it reaches an accuracy of 99.95%, precision of 100%, recall of 99.91%, specificity of 99.98%, and an F1-score of 99.95%. On the HD dataset (Comprehensive), it attains an accuracy of 98.31%, precision of 96.89%, recall of 100%, specificity of 97.96%, and an F1-score of 98.42%.
7) Future Work: The authors plan to address imbalanced datasets and explore deep learning approaches for predicting heart disease, with the ultimate goal of saving lives.
In summary, the study demonstrates the superior performance of the QMBC model when combined with specific techniques, achieving excellent results in heart disease prediction across various datasets, and outlines future research directions.
[1] Dimitris Bertsimas, Luca Mingardi, and Bartolomeo Stellato, "Machine Learning for Real-Time Heart Disease Prediction," IEEE, 2021, vol. 8, pp. 133034–133050.
[2] Azam Mehmood Qadri, Ali Raza, Kashif Munir, and Mubarak S. Almutair, "Effective Feature Engineering Technique for Heart Disease Prediction With Machine Learning," IEEE, 2023, vol. 8, pp. 184087–184108.
[3] Abdulwahab Ali Almazroi, Eman A. Aldhahri, Saba Bashir, and Sufyan Ashfa, "A Clinical Decision Support System for Heart Disease Prediction Using Deep Learning," IEEE, 2023, J. Pers. Med., vol. 12, no. 8, p. 1208.
[4] Ghulab Nabi Ahmad and Shafiullah, "Comparative Study of Optimum Medical Diagnosis of Human Heart Disease Using Machine Learning Technique With and Without Sequential Feature Selection," IEEE, 2022, IEEE Access, vol. 9, pp. 106575–106588.
[5] D. Cenitta, R. Vijaya Arjunan, and K. V. Prema, "Ischemic Heart Disease Prediction Using Optimized Squirrel Search Feature Selection Algorithm," IEEE, 2022, Stat. Softw., vol. 36, no. 11, pp. 1–13.
[6] Pronab Ghosh, Sami Azam, Mirjam Jonkman, and Asif Karim, "Efficient Prediction of Cardiovascular Disease Using Machine Learning Algorithms With Relief and LASSO Feature Selection Techniques," IEEE, 2021, Conf. Inventive Comput. Technol. (ICICT), pp. 1329–1333.
[7] Rahatara Ferdousi, M. Anwar Hossain, and Abdulmotaleb El Saddik, "Early-Stage Risk Prediction of Non-Communicable Disease Using Machine Learning in Health CPS," IEEE, 2021, Biomed. Signal Process. Control, Art. no. 103318.
[8] Ahmed M. Khedr, Zaher Al Aghbari, Amal Al Ali, and Mariam Eljamil, "An Efficient Association Rule Mining From Distributed Medical Databases for Predicting Heart Diseases," IEEE, 2021, Comput. Biol. Med., Art. no. 105624.
[9] Ramdas Kapila, Thirumalaisamy Ragunathan, Sumalatha Saleti, T. Jaya Lakshmi, and Mohd Wazih Ahmad, "Heart Disease Prediction Using Novel Quine McCluskey Binary Classifier (QMBC)," IEEE, 2023, Appl. Sci., vol. 13, no. 1, p. 118.
[10] Aqsa Rahim, Yawar Rasheed, Farooque Azam, and Muhammad Waseem Anwar, "An Integrated Machine Learning Framework for Effective Prediction of Cardiovascular Diseases," IEEE, 2021, Expert Syst. Appl., vol. 207, Nov. 2022, Art. no. 117882.
[11] World Health Organization. (2009). Cardiovascular Diseases (CVDS). [Online]. Available: http://www.who.int/mediacentre/factsheets/fs317/en/index.html
[12] M. Ozcan and S. Peker, "A classification and regression tree algorithm for heart disease modeling and prediction," Healthcare Anal., vol. 3, Nov. 2023, Art. no. 100130.
Copyright © 2023 Prof. M. S. Namose, Payal Gaikwad , Isha Makhija , Pratik Shelke , Gaurav Solankar , Shlok Salvi. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Paper Id : IJRASET56612
Publish Date : 2023-11-10
ISSN : 2321-9653
Publisher Name : IJRASET