Ijraset Journal For Research in Applied Science and Engineering Technology
Authors: Mohesh T, Gowtham K, Vijeesh P, Arun Kumar S
DOI Link: https://doi.org/10.22214/ijraset.2022.44075
Diagnosis of Parkinson's disease (PD) is commonly based on medical observations and assessment of clinical signs, including the characterization of a variety of motor symptoms. However, traditional diagnostic approaches may suffer from subjectivity as they rely on the evaluation of movements that are sometimes subtle to human eyes and therefore difficult to classify, leading to possible misclassification. In the meantime, early non-motor symptoms of PD may be mild and can be caused by many other conditions. Therefore, these symptoms are often overlooked, making diagnosis of PD at an early stage challenging. To address these difficulties and to refine the diagnosis and assessment procedures of PD, machine learning methods have been implemented for the classification of PD and healthy controls or patients with similar clinical presentations.
I. INTRODUCTION
Parkinson’s disease (PD) is a disorder of the nervous system that affects people of different age groups around the world. Medical research increasingly combines clinical work with computational intelligence techniques to predict Parkinson’s symptoms. PD presents in several forms, depending on the abnormalities observed in the patient.
It mostly disturbs neural activity and body movement. Research in recent years has used Machine Learning (ML) and Deep Learning (DL) approaches to detect the early stages of PD. These studies draw on different types of medical observations, such as voice levels, handwriting variations, body movements, brain signal variations and protein aggregation, which are measured using various medical instruments.
A. Objective
The main objective of this article is to understand what Parkinson’s disease is and to detect the early onset of the disorder. We use XGBoost, Random Forest, Support Vector Machines (SVMs), and other machine learning algorithms on the Parkinson's data set available from the UCI Machine Learning Repository.
https://archive.ics.uci.edu/ml/datasets/Parkinson%27s+Disease+Classification
B. Motivation Of The Project
Parkinson's disease is among the leading causes of death and disability worldwide. According to the Parkinson's Disease Foundation, about 1 million people in the United States of America were projected to be living with Parkinson's disease by 2020. Treatment of Parkinson's disease can be guided by neuropathologic and histopathologic findings. Clinical diagnostic detection depends to a large extent on the sensitivity and specificity of the characteristic features of the disease; consequently, Parkinson's disease needs to be explored through clinical, pathologic, and nosologic studies grounded in the frequency of occurrence, characteristics, and risk factors of the samples. Parkinson's disease typically affects patients over the age of 50 worldwide. To date there is no known cause of Parkinson's disease; however, it is very likely possible to ease symptoms significantly when patients are treated in the early stage.
One study claimed that around 90% of patients are affected by vocal impairment. Treatment of Parkinson's disease is also likely to be very expensive.
As a result, most patients cannot afford the cost of treatment. If the disease is detected at an initial stage, the cost decreases and it may also be possible to save the patient’s life. Parkinson's disease prediction is therefore one of the most important matters for medical practitioners who must make correct decisions about the disease. At present, machine learning based platforms can detect Parkinson's disease.
C. Scope Of The Project
In recent years, machine learning algorithms have had a great impact on the Parkinson's research community for the detection of Parkinson's disease. Furthermore, machine learning techniques give more precise results in disease prediction than other data classification techniques. Motivated by this, the authors have used three prominent machine learning algorithms for the detection and correct diagnosis of Parkinson's patients. The main aim of this project is to examine the performance of several prominent classification methods. For this task, multiple machine learning techniques were used, including Support Vector Machine, Logistic Regression and other machine learning algorithms. Moreover, the performance of the three classifiers was evaluated using different methods.
By using machine learning techniques, the problem can be solved with a minimal error rate. The voice dataset of Parkinson's disease from the UCI Machine Learning Repository is used as input. In addition, our proposed system provides accurate results by integrating spiral drawing inputs from healthy and Parkinson’s-affected patients. We propose a hybrid system that analyzes both the voice data and the spiral drawing data of a patient. By combining both results, the doctor can conclude normality or abnormality and prescribe medication based on the affected stage.
E. Overview of The Project
Parkinson's disease is a progressive nervous system disorder that affects movement. Symptoms begin gradually, sometimes starting with a barely noticeable tremor in just one hand. Tremors are common, but the disorder also commonly causes stiffness or slowing of movement. In the early stages of Parkinson's disease, your face may show little or no expression. Your arms may not swing when you walk. Your speech may become soft or slurred. Parkinson's disease symptoms worsen as your condition progresses over time. Although Parkinson's disease can't be cured, medications might significantly improve your symptoms. Occasionally, your doctor may suggest surgery to regulate certain regions of your brain and improve your symptoms.
F. Problem Definition
Parkinson's disease (PD) is a progressive disorder with a presymptomatic interval; that is, there is a period during which the pathologic process has started, but the motor signs required for clinical diagnosis are absent. Despite intriguing leads, direct testing of preclinical markers has been limited, mainly because there is no reliable way to identify preclinical disease. Idiopathic REM sleep behavior disorder (RBD) is characterized by loss of the normal atonia of REM sleep. Approximately 50% of affected individuals will develop PD or dementia within 10 years.
II. LITERATURE REVIEW
Dr. Anupam Bhatia and Raunak Sulekh (2019), “Predictive Model for Parkinson’s Disease through Naive Bayes Classification”. In this study, Naive Bayes was applied to the dataset for prediction. RapidMiner 7.6.001 was the tool used to explore, statistically analyze, and mine the data. The Naive Bayes model performs with 98.5% accuracy and 99.75% precision.
Carlo Ricciardi et al. (2019), “Using gait analysis’ parameters to classify Parkinsonism: A data mining approach”. In this system, Random Forest is used for classification and compared with Gradient Boosted Trees. The results are classified into three categories, namely PSP, De Novo Parkinson’s disease and Stable Parkinson’s disease. Random Forest reached an accuracy as high as 86.4%, compared with Gradient Boosted Trees, which were accurate to a meagre 70%. The precision of Random Forest was also higher, at most 90%, against at most around 85% for Gradient Boosted Trees.
Research has been carried out on PD detection using medical and computational biology techniques. Sadek et al. (2019) designed and implemented Artificial Neural Network (ANN) and Back Propagation (BP) techniques for predicting PD symptoms. The ANN and BP techniques were used to identify continuous patient movements through ANN-based pattern matching. The patterns of various patient movements were registered and trained for optimal disease analysis. The work delivered good PD analysis results on homogeneous medical features; at the same time, it had limitations in multi-variate feature analysis.
Gao et al. (2019) proposed a specific prediction and classification technique for Parkinson data analysis. This work was implemented with the help of data preprocessing techniques, cross fold validations, and ML approaches. The neuro data features and the tremor data features were analyzed to predict the symptoms. The methods used in this work provided valuable results but lacked in PD detection and sensitivity rate. Many research works were inherited with different perspectives for predicting PD.
In this manner, Mostafa et al. (2020) approached the problem using a multi-agent data analysis system. The multi-agent system was designed for evaluating vocal disorders. Patients' vocal data were identified as the primary features for disease detection. The vocal variations were analyzed with the help of Reinforcement Learning (RL), Decision Tree (DT), Naïve Bayes classification and Random Forest (RF) techniques. This work collected medical data from the Tel Aviv Sourasky Medical Center; however, it lagged on real-time problems. Biswajt et al. (2020) proposed a system to detect PD symptoms using speech attributes. In this work, SVM and Naïve Bayes techniques were discussed using voice data features. The work built ML-based voice analysis models for finding PD symptoms and attempted to deliver more accurate results; however, it was limited by the restricted set of dataset features.
Rastegari et al. (2021) developed an information gain analysis model for detecting PD features from the given dataset. This technique combined various ML and information gain approaches. The method worked well for PD findings; however, it produced insignificant results compared with DL-based PD analysis models. Seppi et al. (2021) analyzed Parkinson's treatment evaluation for various non-motor symptoms. This work suggested that treatment analysis helps to improve and update subsequent treatments. In this regard, the work collected evidence on numerous treatments and produced valuable recommendations. Many systems have produced clinical review reports on PD and its treatments, but those systems were not equipped with specific techniques of their own.
Espay et al. (2022) proposed new strategies for evaluating synuclein protein disorders and aggregation rate for the detection of Parkinson's and Alzheimer's symptoms, specifically protein aggregation techniques for sporadic Parkinson's and Alzheimer's disease. At that point, the system did not use DL or ML to learn the protein features. Finding the different types of PD and predicting the symptoms were complex tasks.
Bot et al. (2022) used mobile datasets and mobile-based data collection for PD analysis. In this work, the techniques were trained to produce fewer overhead data features in the detection of Parkinson's. The techniques operated correctly but were limited by scalability issues.
Bouwmans et al. (2022) examined Idiopathic PD (IPD) symptoms for assorted feature classification.
Ramzi M. Sadek et al. (2022), “Parkinson’s Disease Prediction Using Artificial Neural Network”. In this system, the 195 samples in the dataset were divided into 170 training samples and 25 validation samples. After importing the dataset into the Just Neural Network (JNN) environment, the authors trained and validated the Artificial Neural Network model. The most important attributes contributing to the ANN model were identified. The ANN model was 90% accurate.
Satish Srinivasan, Michael Martin & Abhishek Tripathi (2020), “ANN based Data Mining Analysis of Parkinson’s Disease”. This study aimed to understand how different types of preprocessing steps affect the prediction accuracy of the classifier. In classifying the Parkinson’s disease dataset using the ANN-based MLP classifier, a significantly high prediction accuracy was observed when the dataset was pre-processed using both the Discretization and Resample techniques, in both the 10-fold cross validation and the 80:20 split settings.
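The two evaluation protocols named in the last study, an 80:20 hold-out split and 10-fold cross validation, differ only in how the sample indices are partitioned. The sketch below illustrates that partitioning with the Python standard library. The function names `holdout_split` and `k_fold_indices`, the seed, and the sample count of 195 (borrowed from the Sadek et al. summary) are illustrative assumptions and are not taken from the cited study, which may well have used stratified folds.

```python
import random


def holdout_split(n_samples, test_fraction=0.2, seed=0):
    """Shuffle indices once and cut them into (train, test) lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists; every sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Striding by k spreads the remainder so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


train_idx, test_idx = holdout_split(195)  # hypothetical sample count
folds = list(k_fold_indices(195, k=10))
print(len(train_idx), len(test_idx))
print([len(test) for _, test in folds])
```

The hold-out split gives one accuracy estimate from a single test set, while the k-fold loop gives k estimates whose mean is usually less sensitive to how the data happened to be shuffled.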
III. SOFTWARE REQUIREMENTS SPECIFICATION
A. Python
B. Libraries
C. Operating System
D. Hardware Requirements Specification
E. Inputs
IV. SYSTEM ANALYSIS
A. System Review
This survey was carried out to understand the needs and requirements of people who have PD. To do so, we went through various websites and applications and looked for the essential information. Based on this information, we made an audit that helped us generate new ideas and make different plans for our project. We concluded that there is a need for such an application and that there is considerable scope for progress in this field as well.
B. Technology Used
V. REQUIREMENT ANALYSIS
A. Python
C. OS
Program is tested on Windows 10 build 1903 and PopOS 19.04
D. Classification Techniques
E. Laptop
Used to run our code.
VI. SYSTEM DESIGN
A. Architectural Diagram
VII. IMPLEMENTATION
A. Parkinson's Disease Prediction
C. See run_model.py
TO RUN
D. Generating Weights
Name | Description
run_model.py | Run Model from pre-trained weights
best_params.py | Get Models for best hyperparameters
generate_weights.py | Save Weights for best models
pre_processing.py | PCA, Feature Selection, train-test split
EDA.py | Exploratory Data Analysis on dataset
weights_original | Weights from the Original Project
VIII. PROGRAM AND SCREENSHOTS OF PROJECT
A. best_params.py
from sklearn.naive_bayes import BernoulliNB
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.tree import DecisionTreeClassifier
from xgboost import XGBClassifier
from sklearn.svm import SVC
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import GridSearchCV
from pre_processing import X_train, X_test, y_test, y_train
from sklearn.neural_network import MLPClassifier

# best_params file returns the best_models tuned to best performing parameters
best_models = {1: "naive_bayes", 2: "logistic_regression", 3: "decision_tree",
               4: "svm", 5: "random_forest", 6: "ada_boost", 7: "gradient_boost", 8: "xg_boost", 9: "neural_net"}

def best_model(model_num):
    # Naive Bayes
    if model_num == 1:
        clf = BernoulliNB()
        clf.fit(X_train, y_train)
        return clf

    # Logistic Regression
    if model_num == 2:
        param_grid = {'penalty': ['l1', 'l2'],
                      'C': [0.001, 0.01, 0.1, 1, 10, 100, 1000]}
        # liblinear supports both the l1 and l2 penalties
        grid = GridSearchCV(LogisticRegression(random_state=0, solver='liblinear'),
                            param_grid, refit=True)
        grid.fit(X_train, y_train)
        return grid

    # decision_tree
    if model_num == 3:
        param_grid = {'criterion': ['gini', "entropy"],
                      'splitter': ["best", "random"],
                      'max_depth': range(1, 20),
                      'min_samples_split': range(2, 10, 1)}
        grid = GridSearchCV(DecisionTreeClassifier(random_state=0),
                            param_grid, refit=True)
        grid.fit(X_train, y_train)
        return grid

    # SVM
    if model_num == 4:
        param_grid = {'C': [0.1, 1, 10, 100, 1000],
                      'gamma': [1, 0.1, 0.01, 0.001],
                      'kernel': ['rbf', 'poly', 'sigmoid']
                      }
        grid = GridSearchCV(SVC(random_state=0), param_grid, refit=True)
        grid.fit(X_train, y_train)
        return grid

    # Random Forest
    if model_num == 5:
        param_grid = {'n_estimators': range(10, 100, 5),
                      'criterion': ['gini', 'entropy'],
                      'max_depth': range(1, 20),
                      'min_samples_split': range(2, 10, 1),
                      'max_features': ['auto', 'sqrt', 'log2']
                      }
        grid = GridSearchCV(RandomForestClassifier(random_state=0, n_jobs=10),
                            param_grid, refit=True, n_jobs=10)
        grid.fit(X_train, y_train)
        return grid

    # ADA Boost
    if model_num == 6:
        param_grid = {'n_estimators': range(10, 100, 5),
                      'algorithm': ['SAMME', 'SAMME.R'],
                      'learning_rate': [0.001, 0.01, 0.1, 1.0, 10.0]
                      }
        grid = GridSearchCV(AdaBoostClassifier(random_state=0),
                            param_grid, refit=True)
        grid.fit(X_train, y_train)
        return grid

    # Gradient Boost
    if model_num == 7:
        param_grid = {'n_estimators': [100, 150, 200, 250, 300, 350, 400, 450, 500],
                      'criterion': ['friedman_mse', 'mse'],
                      'max_features': ['sqrt', 'log2'],
                      'learning_rate': [0.001, 0.01, 0.1, 1.0, 2, 4],
                      'loss': ['deviance', 'exponential'],
                      }
        grid = GridSearchCV(GradientBoostingClassifier(
            random_state=0), param_grid, refit=True)
        grid.fit(X_train, y_train)
        return grid

    # XG Boost
    if model_num == 8:
        param_grid = {'n_estimators': range(0, 100, 5),
                      'learning_rate': [0.01, 0.1],
                      'max_depth': range(2, 6)
                      }
        grid = GridSearchCV(XGBClassifier(random_state=0),
                            param_grid, refit=True)
        grid.fit(X_train, y_train)
        return grid

    # Neural Net
    if model_num == 9:
        param_grid = {'alpha': [x * 0.0001 for x in range(1, 1000, 2)],
                      'learning_rate_init': [0.01, 0.001, 0.0001],
                      'max_iter': [100, 300, 1000],
                      'hidden_layer_sizes': [(128, 64, 32), (128, 32, 8), (256, 128, 64, 32), (64, 32, 16), (64, 32), (64, 16)],
                      }
        grid = GridSearchCV(MLPClassifier(random_state=42),
                            param_grid, refit=True, n_jobs=10)
        grid.fit(X_train, y_train)
        return grid
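Exhaustive grid search fits one model per hyperparameter combination per cross-validation fold, so the grids in `best_params.py` vary widely in cost. As a rough sketch, the snippet below counts the candidates for two of the grids copied from that file. It assumes 5-fold cross validation (the default in recent scikit-learn releases when `cv` is not given), and the helper name `grid_size` is hypothetical.

```python
from math import prod


def grid_size(param_grid):
    """Number of hyperparameter combinations in an exhaustive grid."""
    return prod(len(list(values)) for values in param_grid.values())


# Grids copied from best_params.py (random forest and neural net).
random_forest_grid = {
    'n_estimators': range(10, 100, 5),
    'criterion': ['gini', 'entropy'],
    'max_depth': range(1, 20),
    'min_samples_split': range(2, 10, 1),
    'max_features': ['auto', 'sqrt', 'log2'],
}
neural_net_grid = {
    'alpha': [x * 0.0001 for x in range(1, 1000, 2)],
    'learning_rate_init': [0.01, 0.001, 0.0001],
    'max_iter': [100, 300, 1000],
    'hidden_layer_sizes': [(128, 64, 32), (128, 32, 8), (256, 128, 64, 32),
                           (64, 32, 16), (64, 32), (64, 16)],
}

CV_FOLDS = 5  # assumption: GridSearchCV default when cv is not specified

for name, grid in [('random_forest', random_forest_grid),
                   ('neural_net', neural_net_grid)]:
    candidates = grid_size(grid)
    print(f"{name}: {candidates} candidates, {candidates * CV_FOLDS} fits")
```

With tens of thousands of candidates for the neural network alone, a randomized search over the same ranges may be a cheaper alternative; whether it would reach comparable accuracy on this dataset is not tested here.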
B. generate_weights.py
import argparse
import pickle
import sys

from best_params import best_model

if __name__ == "__main__":
    # Adding Arguments to make user select one model to run results on
    parser = argparse.ArgumentParser(
        formatter_class=argparse.RawTextHelpFormatter)
    model_dict = {1: "naive_bayes", 2: "logistic_regression", 3: "decision_tree",
                  4: "svm", 5: "random_forest", 6: "ada_boost", 7: "gradient_boost", 8: "xg_boost", 9: "neural_net", 10: "all"}
    parser.add_argument(
        "-m", "--model", type=int, choices=list(range(1, len(model_dict)+1)),
        help=("\n1. Naive Bayes"
              "\n2. Logistic Regression"
              "\n3. Decision Tree"
              "\n4. SVM"
              "\n5. Random Forest"
              "\n6. AdaBoost"
              "\n7. Gradient Boost"
              "\n8. XGBoost"
              "\n9. Neural Network"
              "\n10. All Models"))
    args = parser.parse_args()
    if not args.model:
        print("Model not provided. See generate_weights.py --help")
        sys.exit()

    def save_weights(model_num):
        model = best_model(model_num)
        filename = f"./weights/{model_dict[model_num]}.sav"
        print(f"\nSaving {filename} ...")
        # save the model to disk
        with open(filename, 'wb') as f:
            pickle.dump(model, f)

    if model_dict[args.model] != "all":
        save_weights(args.model)
    else:
        # Run for all models
        for model_num in range(1, len(model_dict)):
            save_weights(model_num)
else:
    print("Run the script run_model.py to obtain results.")
C. pre_processing.py
import pandas as pd
import numpy as np
from sklearn.feature_selection import f_classif, SelectKBest
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

df = pd.read_csv('./dataset/pd_speech_features.csv')

# Map each patient id to its class label
id = df['id']
label = df['class']
n = len(id)
id_label = {}
for i in range(n):
    id_label[id[i]] = label[i]

train_healthy_id = []
test_healthy_id = []
train_pd_id = []
test_pd_id = []

# Number of patient ids per class in each split
train_n_healthy = 58
test_n_healthy = 6
train_n_pd = 170
test_n_pd = 18

train_count_healthy = 0
test_count_healthy = 0
train_count_pd = 0
test_count_pd = 0

# Fill the training quota for each class first; remaining ids go to the test set
for key, val in id_label.items():
    if val == 0:
        if train_count_healthy < train_n_healthy:
            train_healthy_id.append(key)
            train_count_healthy += 1
        else:
            test_healthy_id.append(key)
            test_count_healthy += 1
    else:
        if train_count_pd < train_n_pd:
            train_pd_id.append(key)
            train_count_pd += 1
        else:
            test_pd_id.append(key)
            test_count_pd += 1

train_healthy_id.extend(train_pd_id)
test_healthy_id.extend(test_pd_id)
train_id = train_healthy_id
test_id = test_healthy_id
train_id = np.array(train_id)
test_id = np.array(test_id)

# Collect all rows of each training id (features: every column except id and class)
trainX = None
trainY = None
for i in range(train_id.shape[0]):
    vals = df[df['id'] == train_id[i]].values
    if i == 0:
        trainX = vals[:, 1:-1]
        trainY = vals[:, -1].reshape((-1, 1))
    else:
        trainX = np.append(trainX, vals[:, 1:-1], axis=0)
        trainY = np.append(trainY, vals[:, -1].reshape((-1, 1)), axis=0)

# Same for the test ids
testX = None
testY = None
for i in range(test_id.shape[0]):
    vals = df[df['id'] == test_id[i]].values
    if i == 0:
        testX = vals[:, 1:-1]
        testY = vals[:, -1].reshape((-1, 1))
    else:
        testX = np.append(testX, vals[:, 1:-1], axis=0)
        testY = np.append(testY, vals[:, -1].reshape((-1, 1)), axis=0)

trainY = trainY.reshape((-1,))

trainX_copy = trainX
trainY_copy = trainY
testX_copy = testX
testY_copy = testY

# Feature selection, scaling and PCA are fitted on the training data only
sel = SelectKBest(f_classif, k=490)
trainX_copy = sel.fit_transform(trainX_copy, trainY_copy)
testX_copy = sel.transform(testX_copy)

sc = StandardScaler()
trainX_copy = sc.fit_transform(trainX_copy)
testX_copy = sc.transform(testX_copy)

pca = PCA(n_components=0.99)
trainX_copy = pca.fit_transform(trainX_copy)
testX_copy = pca.transform(testX_copy)

if __name__ == "__main__":
    print("Data Pre-processed Successfully. Run the script run_model.py")
else:
    X_train, X_test, y_test, y_train = trainX_copy, testX_copy, testY_copy, trainY_copy
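The split in `pre_processing.py` assigns whole patient ids, not individual recordings, to the training or test side, which appears intended to keep recordings of one patient from landing on both sides. A compact sketch of the same idea on made-up data is given below. The function name `split_ids_by_class`, the toy ids and the per-class quotas are hypothetical, and only the standard library is used.

```python
def split_ids_by_class(id_label, train_quota):
    """Assign each patient id to train or test, filling per-class quotas first.

    id_label:    dict mapping patient id -> class label
    train_quota: dict mapping class label -> number of ids to put in training
    """
    train_ids, test_ids = [], []
    taken = {label: 0 for label in train_quota}
    for patient_id, label in id_label.items():
        if taken[label] < train_quota[label]:
            train_ids.append(patient_id)
            taken[label] += 1
        else:
            test_ids.append(patient_id)
    return train_ids, test_ids


# Toy data: (patient id, class label) per recording, two recordings per patient.
recordings = [(0, 1), (0, 1), (1, 0), (1, 0), (2, 1), (2, 1),
              (3, 0), (3, 0), (4, 1), (4, 1), (5, 1), (5, 1)]
id_label = dict(recordings)  # one label per patient id

train_ids, test_ids = split_ids_by_class(id_label, {0: 1, 1: 3})
train_rows = [r for r in recordings if r[0] in set(train_ids)]
test_rows = [r for r in recordings if r[0] in set(test_ids)]
print(train_ids, test_ids)
```

Because the ids are taken in file order rather than shuffled, the resulting split is deterministic but depends on how the dataset happens to be sorted.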
D. run_model.py
import argparse
import pickle
import sys

# Loading the dataset
from pre_processing import X_train, X_test, y_test, y_train

# Adding Arguments to make user select one model to run results on
parser = argparse.ArgumentParser(formatter_class=argparse.RawTextHelpFormatter)
model_dict = {1: "naive_bayes", 2: "logistic_regression", 3: "decision_tree",
              4: "svm", 5: "random_forest", 6: "ada_boost", 7: "gradient_boost", 8: "xg_boost", 9: "neural_net", 10: "all"}
parser.add_argument(
    "-m", "--model", type=int, choices=list(range(1, len(model_dict)+1)),
    help=("\n1. Naive Bayes"
          "\n2. Logistic Regression"
          "\n3. Decision Tree"
          "\n4. SVM"
          "\n5. Random Forest"
          "\n6. AdaBoost"
          "\n7. Gradient Boost"
          "\n8. XGBoost"
          "\n9. Neural Network"
          "\n10. All Models"))
args = parser.parse_args()
if not args.model:
    print("Model not provided. See run_model.py --help")
    sys.exit()


# Loading and Running the Model
def load_and_predict(model_num):
    filename = f"./weights/{model_dict[model_num]}.sav"
    print(f"\nLoading {filename} ...")
    with open(filename, 'rb') as f:
        loaded_model = pickle.load(f)
    # score() returns accuracy as a fraction in [0, 1]; convert to percent
    result = loaded_model.score(X_test, y_test)
    print(f"({model_dict[model_num]}) Model Accuracy: {result * 100:.2f}%")


if model_dict[args.model] != "all":
    load_and_predict(args.model)
else:
    # Run for all models
    for model_num in range(1, len(model_dict)):
        load_and_predict(model_num)
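`score` reports plain accuracy, which can be hard to interpret here because the test split holds far fewer healthy subjects than PD subjects (6 versus 18 patient ids in `pre_processing.py`). Since the motivation section also refers to sensitivity and specificity, the sketch below shows how those two rates could be computed from predictions. The helper name `sensitivity_specificity` and the example labels are made up for illustration, with class 1 taken as PD as in the EDA script.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # Guard against classes that are absent from y_true.
    sensitivity = tp / (tp + fn) if tp + fn else float('nan')
    specificity = tn / (tn + fp) if tn + fp else float('nan')
    return sensitivity, specificity


# Made-up labels: 6 PD recordings (1) and 2 healthy recordings (0).
y_true = [1, 1, 1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 1, 1, 0, 1, 0]

sens, spec = sensitivity_specificity(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"accuracy={accuracy:.2f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

With a fitted model, `y_pred` would come from `loaded_model.predict(X_test)` and `y_true` from `y_test`.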
E. EDA.py
from pre_processing import X_train, X_test, y_test, y_train
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

# Perform EDA
df = pd.read_csv('./dataset/pd_speech_features.csv')

print('Dataset Shape:')
print(df.shape)
print()

print('Class Distribution:')
print(df['class'].value_counts())
print()

positives = df[df['class'] == 1]
negatives = df[df['class'] == 0]

print('Positive Class (Patients with PD) Gender Distribution')
print(positives['gender'].value_counts())
print()

print('Negative Class (Patients without PD) Gender Distribution')
print(negatives['gender'].value_counts())
print()


# Number of patients counts
def print_patient_dist(df):
    ids = np.unique(df['id'].values)
    fem = 0
    male = 0
    for i in ids:
        if df[df['id'] == i]['gender'].iloc[0] == 0:
            fem += 1
        else:
            male += 1
    print('Unique Males: ', male)
    print('Unique Females: ', fem)


print('Positive Class Unique Patients Distribution')
print_patient_dist(positives)
print()

print('Negative Class Unique Patients Distribution')
print_patient_dist(negatives)
print()

print('Database Info')
print(df.info())
print()

# Plot distribution of 3 important features
df_selected = df[['DFA', 'mean_MFCC_2nd_coef', 'tqwt_entropy_log_dec_12']]
print(df_selected.describe())

plt.gcf().set_size_inches(12, 5)
ax = sns.histplot(df['DFA'], color='#3EADA7', alpha=1, edgecolor='white')
ax.set_title('DFA')
plt.show()

plt.gcf().set_size_inches(12, 5)
ax = sns.histplot(df['mean_MFCC_2nd_coef'],
                  color='#3EADA7', alpha=1, edgecolor='white')
ax.set_title('mean_MFCC_2nd_coef')
plt.show()

plt.gcf().set_size_inches(12, 5)
ax = sns.histplot(df['tqwt_entropy_log_dec_12'],
                  color='#3EADA7', alpha=1, edgecolor='white')
ax.set_title('tqwt_entropy_log_dec_12')
plt.show()

# Perform t-SNE visualization on the preprocessed data
tsne_results = TSNE(n_components=2, random_state=1).fit_transform(X_train)
DF = pd.DataFrame({'y': y_train.reshape((-1,))})
DF['tsne1'] = tsne_results[:, 0]
DF['tsne2'] = tsne_results[:, 1]

plt.figure(figsize=(5, 5))
ax = sns.scatterplot(
    x="tsne1", y="tsne2",
    hue="y",
    palette=sns.color_palette("hls", 2),
    data=DF,
    legend=False,
    alpha=1
)
ax.set_title('TSNE Visualization')
plt.show()
IX. CONCLUSION
Parkinson’s disease is a very serious disease and has no cure to date. Since it affects the movement of parts of the body, speech is also affected. The system presented here tries to offer a way of detecting Parkinson’s disease so that quick action can be taken to reduce it or even prevent it from affecting the whole body. It aims to help both patients and medical experts recognize a case of Parkinson’s at the earliest. Hence, the goal is to apply various machine learning techniques, such as SVM and Decision Tree, to obtain the most accurate result. Here, building a classifier with a Decision Tree results in an accuracy of 94.7774555%.
REFERENCES
[1] Adrien Payan, Giovanni Montana, Predicting Alzheimer's disease: a neuroimaging study with 3D convolutional neural networks.
[2] Alemami, Y. and Almazaydeh, L. (2020) Detecting of Parkinson Disease through Voice Signal Features. Journal of American Science.
[3] Fayao Liu, Chunhua Shen, Learning Deep Convolutional Features for MRI Based Alzheimer’s Disease Classification.
[4] Hadjahamadi, A.H. and Askari, T.J. (2020) A Detection Support System for Parkinson’s Disease Diagnosis Using Classification and Regression Tree. Journal of Mathematics and Computer Science, 4, 257-263.
[5] Little, M.A., McSharry, P.E., Hunter, E.J. and Ramig, L.O. (2018) Suitability of Dysphonia Measurements for Telemonitoring of Parkinson’s Disease. IEEE Transactions on Biomedical Engineering, 56, 1015-1022.
[6] Muhlenbach, F. and Rakotomalala, R. (2019) Discretization of Continuous Attributes. In: Wang, J., Ed., Encyclopedia of Data Warehousing and Mining, Idea Group Reference, 397-402.
[7] A. J. Espay, ‘‘Revisiting protein aggregation as pathogenic in sporadic parkinson and alzheimer diseases,’’ Neurology, vol. 92, no. 7, pp. 329–337, Feb. 2019.
[8] B. M. Bot, C. Suver, E. C. Neto, M. Kellen, A. Klein, C. Bare, M. Doerr, A. Pratap, J. Wilbanks, E. R. Dorsey, S. H. Friend, and A. D. Trister, ‘‘The mPower study, parkinson disease mobile data collected using ResearchKit,’’ Sci. Data, vol. 3, no. 1, Dec. 2019, Art. no. 160011.
[9] A. E. P. Bouwmans, A. M. M. Vlaar, W. H. Mess, A. Kessels, and W. E. J. Weber, ‘‘Specificity and sensitivity of transcranial sonography of the substantia nigra in the diagnosis of Parkinson’s disease: Prospective cohort study in 196 patients,’’ BMJ Open, vol. 3, no. 4, 2020, Art. no. e002613.
[10] Pickle, N. T., Shearin, S. M., and Fey, N. P. Dynamic neural network approach to targeted balance assessment of individuals with and without neurological disease during non-steady-state locomotion. Journal of Neuroengineering and Rehabilitation 16 (JUL 12 2020), 88.
Copyright © 2022 Mohesh T, Gowtham K, Vijeesh P, Arun Kumar S. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Paper Id : IJRASET44075
Publish Date : 2022-06-10
ISSN : 2321-9653
Publisher Name : IJRASET