Ijraset Journal For Research in Applied Science and Engineering Technology
Authors: Sumit Maurya, Vibhur Garg, Kaushlendra Sharma, Vedant Shukla, Deepika Tyagi
DOI Link: https://doi.org/10.22214/ijraset.2023.50500
Symptoms Diagnosis is a prediction-based system that determines a patient's likely disease from the symptoms the user provides. Health is of utmost importance to every living being, and we should do our best to stay healthy. When a patient experiences early symptoms, the system analyzes them and predicts which disease the person may be suffering from. Random Forest is used as the prediction model. Such a model could be key to increasing the detection rate of diseases at early stages and helping patients take preventive measures in time.
I. INTRODUCTION
Machine learning is a branch of computer programming that makes a system intelligent enough to take decisions and produce results based on its experience and data. Symptoms Diagnosis is one such program: it uses a machine learning algorithm to predict the patient's disease. According to recently conducted surveys, India’s doctor-to-patient ratio is low enough to be concerning.
An intelligent system of this kind can help address the issue by letting patients know which disease might be affecting their bodies.
Random Forest is a supervised learning algorithm that handles complex problems efficiently; here it predicts the disease from the symptoms provided by the user. The system benefits the health industry because it saves time for both doctors and patients. It helps identify diseases whose symptoms, if left unchecked for a long time, can prove fatal. It also gives users precautions and information about the predicted disease. We built the system using these machine learning techniques.
II. LITERATURE REVIEW
Various studies on the prediction of different types of diseases were considered. Some of them are given below:
III. PROPOSED SYSTEM
After evaluating various methods, we implemented the machine learning algorithm ‘Random Forest’ in Python, developed the UI in the same environment, and named the system “Symptoms Diagnosis”. The system starts from collected data (diseases and their symptoms). The collected data is cleaned and processed before further use, and is then divided into two segments: training and testing. We tried different algorithms for predicting diseases and found that ‘Random Forest’ gave the highest accuracy among them.
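The data preparation described above can be sketched as follows. This is a minimal illustration, not the authors' code: the records, the `SYMPTOMS` vocabulary, the `encode` helper, and the 50/50 split are all assumptions, since the paper does not publish its dataset layout or split ratio. It assumes scikit-learn is installed.

```python
from sklearn.model_selection import train_test_split

# Hypothetical records (disease label, reported symptoms); not the paper's data.
RECORDS = [
    ("Common Cold", ["cough", "sneezing"]),
    ("Common Cold", ["cough", "runny nose"]),
    ("Malaria", ["fever", "chills"]),
    ("Malaria", ["fever", "sweating"]),
    ("Typhoid", ["fever", "headache"]),
    ("Typhoid", ["headache", "nausea"]),
]

# Fixed, sorted vocabulary so every record maps to a vector of the same length.
SYMPTOMS = sorted({name for _, symptoms in RECORDS for name in symptoms})


def encode(symptoms):
    """Turn a list of symptom names into a 0/1 vector over SYMPTOMS."""
    present = set(symptoms)
    return [1 if name in present else 0 for name in SYMPTOMS]


X = [encode(symptoms) for _, symptoms in RECORDS]
y = [disease for disease, _ in RECORDS]

# Split into training and testing segments; the ratio here is an assumption.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, stratify=y, random_state=0
)
print(len(X_train), "training rows,", len(X_test), "testing rows")
```

Encoding each symptom as a 0/1 column is one common choice for this kind of data; the paper does not state which encoding was used.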
IV. SCOPE OF PROJECT
Disease diagnosis has received great attention in medical science and plays a major role in the health industry. A disease that is not diagnosed at the right time can cause major harm to an individual's health. Diagnosis can be based on symptoms and signs. Our project helps patients identify, at home, which disease they may be suffering from based on their symptoms. The system targets people who cannot afford doctors’ consultation fees. It also saves time for both individuals and doctors.
V. METHODOLOGY
A. Dataset Testing And Training
The health industry generates a large amount of data every year. To build the system, we required datasets on which to implement our model; such historical data lets a machine learning algorithm make predictions. We used the Python library ‘Pandas’ to read our CSV files and make them usable by our machine learning algorithm. Pandas reads datasets, helps analyze large amounts of data, and helps clean and manipulate the data into a usable form.
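As a rough sketch of this step, the snippet below reads a small CSV with Pandas and applies light cleaning. The column names, the sample rows, and the `load_symptom_table` helper are hypothetical; the paper does not describe its CSV files, so an in-memory string stands in for them.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for the paper's (unpublished) files.
RAW_CSV = """disease,symptom_1,symptom_2,symptom_3
Common Cold, cough ,sneezing,
Common Cold, cough ,sneezing,
Malaria,fever,chills,sweating
"""


def load_symptom_table(source) -> pd.DataFrame:
    """Read a disease/symptom CSV and apply light cleaning."""
    # Read every column as text so symptom names are never parsed as numbers.
    df = pd.read_csv(source, dtype=str)
    # Trim stray whitespace around each value; missing cells stay missing.
    for column in df.columns:
        df[column] = df[column].str.strip()
    # Drop exact duplicate rows and renumber the index.
    return df.drop_duplicates().reset_index(drop=True)


table = load_symptom_table(io.StringIO(RAW_CSV))
print(table)
```

`load_symptom_table` accepts a file path as well as a file-like object, because `pd.read_csv` handles both.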
B. Model Building
After importing the dataset using Pandas, we implemented machine learning algorithms, namely Naïve Bayes, Random Forest, and AdaBoost, to predict the disease from the user’s input. The system takes the user’s input and predicts the disease using the model trained and tested on the dataset.
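A comparison of the three algorithms might look like the sketch below, assuming scikit-learn. The data is synthetic and the labelling rule is invented purely for illustration, so the printed scores say nothing about the paper's results; the hyperparameters and the choice of `BernoulliNB` as the Naïve Bayes variant are also assumptions.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import BernoulliNB

# Synthetic stand-in data: 300 "patients", 6 binary symptom columns.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(300, 6))
# Hypothetical labelling rule: the first two columns determine the class.
y = X[:, 0] * 2 + X[:, 1]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Hyperparameters are illustrative defaults, not values from the paper.
models = {
    "Naive Bayes": BernoulliNB(),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "AdaBoost": AdaBoostClassifier(n_estimators=50, random_state=0),
}

# Fit each model on the training segment and score it on the testing segment.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))

for name, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {score:.3f}")
```

Scoring all models on the same held-out segment keeps the comparison fair, which is the property the accuracy comparison in this paper relies on.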
3. AdaBoost: The AdaBoost algorithm was proposed by Robert Schapire and Yoav Freund in 1996. AdaBoost stands for adaptive boosting. It is a boosting technique that combines multiple weak classifiers into a single strong classifier. In layman's terms, weak learners are converted into a strong learner. The idea is to assign weights to the training samples and to each weak classifier: in every iteration the samples misclassified so far receive larger weights, so the next weak classifier concentrates on these hard cases.
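To make the reweighting idea concrete, here is a minimal sketch of a single round of the standard binary AdaBoost update, with labels in {-1, +1}. The function name `adaboost_round` and the toy labels are illustrative assumptions; the paper does not show its own implementation, and in practice a library implementation would normally be used.

```python
import math


def adaboost_round(weights, y_true, y_pred):
    """One AdaBoost reweighting step for labels in {-1, +1}.

    Returns the weak classifier's vote weight (alpha) and the
    normalised sample weights for the next round.
    """
    # Weighted error: total weight of the misclassified samples.
    error = sum(w for w, t, p in zip(weights, y_true, y_pred) if t != p)
    if not 0.0 < error < 0.5:
        raise ValueError("weak learner must satisfy 0 < weighted error < 0.5")
    alpha = 0.5 * math.log((1.0 - error) / error)
    # Misclassified samples (t * p == -1) are scaled up, correct ones down.
    updated = [
        w * math.exp(-alpha * t * p) for w, t, p in zip(weights, y_true, y_pred)
    ]
    total = sum(updated)
    return alpha, [w / total for w in updated]


y_true = [1, 1, -1, -1, 1]
y_pred = [1, 1, -1, -1, -1]  # the weak learner gets only the last sample wrong
alpha, new_weights = adaboost_round([0.2] * 5, y_true, y_pred)

print(round(alpha, 3))                      # 0.693
print([round(w, 3) for w in new_weights])   # [0.125, 0.125, 0.125, 0.125, 0.5]
```

After one round the single misclassified sample carries half of the total weight, which is why the next weak classifier is pushed to get it right.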
C. Model Deployment
After training, testing, and implementing the model, we required an interface through which a user can interact with the system. We used the Python library ‘Tkinter’ to design it. Tkinter is Python's library for creating graphical user interfaces simply and easily, and it provides built-in widgets for building application interfaces.
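A Tkinter front end for such a system could be sketched as below. The layout, the symptom list, and the `describe_selection` helper are hypothetical; the paper does not describe its actual interface, and the helper only echoes the selection where a real application would call the trained model.

```python
# Hypothetical symptom list; the real system would take it from the dataset.
SYMPTOM_CHOICES = ["cough", "fever", "headache", "sneezing"]


def describe_selection(selected):
    """Placeholder for the model call: just reports what was selected."""
    if not selected:
        return "Please select at least one symptom."
    return "Selected symptoms: " + ", ".join(sorted(selected))


def build_window():
    """Create the main window with one checkbox per symptom."""
    # Imported here so describe_selection stays usable on machines without Tk.
    import tkinter as tk

    root = tk.Tk()
    root.title("Symptoms Diagnosis")
    tk.Label(root, text="Select your symptoms:").pack(anchor="w")

    variables = {}
    for symptom in SYMPTOM_CHOICES:
        var = tk.BooleanVar(master=root, value=False)
        tk.Checkbutton(root, text=symptom, variable=var).pack(anchor="w")
        variables[symptom] = var

    result = tk.StringVar(master=root, value="")

    def on_predict():
        # Collect the ticked symptoms and show the (placeholder) result.
        chosen = [name for name, var in variables.items() if var.get()]
        result.set(describe_selection(chosen))

    tk.Button(root, text="Predict", command=on_predict).pack(pady=4)
    tk.Label(root, textvariable=result).pack(anchor="w")
    return root
```

Calling `build_window().mainloop()` on a machine with a display opens the window. Keeping the prediction logic in a plain function, separate from the widgets, lets it be tested without starting the GUI.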
D. Visual Studio
We used Visual Studio as the tool for deploying all the modules. Visual Studio is a development environment in which we can edit, debug, and build code. It provides features such as code completion and graphical design tools that enhance the development process, and it supports many types of applications. Visual Studio thus gave us a platform on which to do our work and develop an application that can later be published.
E. Weka
Weka is a collection of machine learning algorithms for solving real-world problems. It supports tasks such as data preprocessing and clustering, and includes a visualization tool for inspecting and analyzing data. Weka speeds up the development of machine learning models. With Weka, we can measure a model's performance through metrics such as accuracy and the confusion matrix.
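For readers without Weka, the two metrics named above can be computed in a few lines of plain Python. This is an illustrative sketch with made-up labels, not output from the paper's experiments; the function names are our own.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels):
    """Rows are actual classes, columns are predicted classes."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(actual, predicted)] for predicted in labels] for actual in labels]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the actual label."""
    correct = sum(1 for actual, predicted in zip(y_true, y_pred) if actual == predicted)
    return correct / len(y_true)


# Made-up labels for illustration only.
labels = ["Common Cold", "Malaria", "Typhoid"]
y_true = ["Common Cold", "Common Cold", "Malaria", "Malaria", "Typhoid", "Typhoid"]
y_pred = ["Common Cold", "Malaria", "Malaria", "Malaria", "Typhoid", "Common Cold"]

matrix = confusion_matrix(y_true, y_pred, labels)
score = accuracy(y_true, y_pred)

for label, row in zip(labels, matrix):
    print(f"{label:12s} {row}")
print(f"accuracy = {score:.4f}")
```

The diagonal of the matrix counts correct predictions, so accuracy equals the diagonal sum divided by the number of samples (here 4 of 6).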
VI. SYSTEM REQUIREMENT
A. Hardware Requirements
Processor: Any recent processor
RAM: Min 4 GB
Hard Disk: Min 100 GB
B. Software Requirements
Operating System: Windows family
Technology: Python 3.7
IDE: Jupyter Notebook / Visual Studio
VII. RESULT
Our system, “Symptoms Diagnosis”, successfully predicts the disease from the symptoms provided by the user. By comparing the accuracy of the different algorithms, we concluded that Random Forest is the best algorithm for prediction, with an accuracy score of 95.13%. The following figure shows the result of the proposed system in WEKA 3.8 in terms of performance measures such as accuracy.
After evaluating the accuracy of the different algorithms, we arranged the results as an accuracy table and an accuracy graph. The table summarizes the accuracy that each algorithm obtained on the dataset instances, and the graph gives a clear visualization: its Y-axis shows accuracy values and its X-axis shows the names of the algorithms.
VIII. CONCLUSION
In this paper, we have presented a system that analyzes the symptoms provided by the patient and predicts the disease(s). Its main purpose is to give early context on what sort of disease the patient might be afflicted with, and potentially to save lives through early detection of a fatal disease. The system also gives the user a description of the disease along with precautions. As a further goal, we aim to deploy the system on a cloud server, which would vastly improve its reach and help many more people in need.
REFERENCES
[1] Vincy Cherian and Bindu M.S, “Heart Disease Prediction Using Naïve Bayes Algorithm and Laplace Smoothing Technique”, International Journal of Computer Science Trends and Technology (IJCST), Vol. 5, No. 2, Mar-Apr 2017.
[2] M Preethi and Dr. J Selvakumar, “A Survey of Predicting Heart Disease”, International Journal on Informatics Visualization, Vol. 4, No. 2, 2020.
[3] Sonam Nikhar and A.M. Karandikar, “Prediction of Heart Disease Using Machine Learning Algorithms”, International Journal of Advanced Engineering, Management and Science (IJAEMS), Vol. 2, No. 6, June 2016.
[4] Dr. S. Vijayarani and Mr. S. Dhayanand, “Liver Disease Prediction using SVM and Naïve Bayes Algorithms”, International Journal of Science, Engineering and Technology Research (IJSETR), Vol. 4, No. 4, April 2015.
[5] Mangesh Limbitote, Dnyaneshwari Mahajan, Pushkar Patil Pimpri, and Kedar Damkondwar, “A Survey on Prediction Techniques of Heart Disease using Machine Learning”, International Journal of Engineering Research & Technology (IJERT), Vol. 9, No. 06, June 2020.
[6] Chaimaa Boukhatem, Heba Yahia Youssef, and Ali Bou Nassif, “Heart Disease Prediction Using Machine Learning”, Institute of Electrical and Electronics Engineers (IEEE), March 2022.
[7] Sohel Rana, Md. Julker Nayeem, Farjana Alam, and Md. Ataur Rahman, “Prediction of Hepatitis Disease using K-Nearest neighbor, Naïve Bayes, Support Vector Machine, Multi-Layer perceptron, and Random Forest”, Institute of Electrical and Electronics Engineers (IEEE), April 2021.
[8] Muhamad Huzaimi Bin Abdul Ghafar, Nurul Aleena Binti Abdullah, Abdul Hadi Abdul Razak, Megat Syahirul Bin Megat Ali, and Syed Abdul Mutalib Al-Junid, “Chronic Disease Prediction based on data mining method and Support vector machine”, Institute of Electrical and Electronics Engineers (IEEE), 17 December 2022.
[9] Vinayak Singh, Mahendra Kumar Gaourisaria, and Himansu Das, “Performance Analysis of Machine Learning Algorithm for Prediction of Liver Disease”, Institute of Electrical and Electronics Engineers (IEEE), September 2021.
Copyright © 2023 Sumit Maurya, Vibhur Garg, Kaushlendra Sharma, Vedant Shukla, Deepika Tyagi. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Paper Id : IJRASET50500
Publish Date : 2023-04-16
ISSN : 2321-9653
Publisher Name : IJRASET