Ijraset Journal For Research in Applied Science and Engineering Technology
Authors: Sanjeet Pandey, Er. Paritosh Tripathi , Er. Vineet Kumar Singh, Vishal Sharma
DOI Link: https://doi.org/10.22214/ijraset.2021.39194
The brain is one of the most complex organs of the human body. Abnormal formation of cells may affect its normal functioning. These abnormal cells may be benign, resulting in low grade glioma, or malignant, resulting in high grade glioma. Treatment plans vary according to the grade of glioma detected, which creates the need for precise glioma grading. As per the World Health Organization, biopsy is considered the gold standard in glioma grading. However, biopsy is an invasive procedure that may involve sampling errors and subjectivity errors. This has motivated clinicians to look for other methods that may overcome the limitations of biopsy reports. Machine learning and deep learning approaches using MRI are considered the most promising alternatives reported in the literature. The presented work is based on AdaBoost, an ensemble learning approach. The developed model was optimized with respect to two hyperparameters, i.e. the number of estimators and the learning rate, keeping the base model fixed. A decision tree was used as the base model. The proposed model was trained and validated on the BraTS 2018 dataset. The optimized model achieves reasonable accuracy in the classification task, i.e. high grade glioma vs. low grade glioma.
I. INTRODUCTION
The brain is considered to be one of the most complex organs of the body. When uncontrolled cell division takes place within the brain, the resulting abnormal group of cells forms a brain tumor. A tumor is considered a life-threatening disease. This abnormal cell growth may affect normal brain functioning (figure-1).
Brain tumors are broadly classified into low grade and high grade tumors. Grade I and Grade II tumors are considered low grade tumors, and Grade III and Grade IV tumors are considered high grade tumors [4]. Low grade tumors are considered non-cancerous or, in other words, less aggressive in comparison to high grade tumors. The exact causes of brain tumors are unknown to date, and researchers are still investigating them [5-7, 16, 17]. Some of the symptoms of brain tumor include headache, difficulty in speaking, loss of movement etc. Notably, a brain tumor sometimes does not show the above-mentioned symptoms and may be discovered accidentally.
In order to detect the tumor, the doctor may conduct investigations which may include imaging scans, biopsy or a combination of both. Once tumor presence is confirmed, the doctor may plan treatment and the follow-up required in the process.
Magnetic resonance imaging is considered to be one of the preferred choices of investigation [1-15]. The figure below shows some of the conventional MRI sequences, namely T2, FLAIR and T1 CE respectively, with tumor.
Once tumor presence is confirmed in the MRI, the clinician may plan a biopsy to determine the type and grade of the tumor. Repeated biopsies are sometimes performed when the tumor tissue obtained is not sufficient to define the type or grade of the tumor, or when there is any ambiguity. Biopsy is an invasive procedure and may involve subjective and sampling errors. Errors in the investigation procedure may affect clinical treatment planning and follow-ups [1-9].
Once tumor presence is observed by the clinician, he or she may plan further treatment. In most of the reported work, effort has been put into defining the tumor boundaries and then classifying the tumor with the help of different MRI sequences. Common sequences used in clinical practice for diagnosing brain tumors and defining tumor volume include T1, T1 contrast, FLAIR, T2, PD, DWI etc. The general procedure involves preprocessing, followed by segmentation, followed by tumor classification. Preprocessing involves skull removal, noise removal, contrast enhancement etc. with the help of well-established methods such as the fast non-local means (FNLM) filter and the Wiener filter. Segmentation involves extraction of the different tumor components, such as the enhancing, non-enhancing, necrotic and edema portions. Common methods used by researchers for segmentation include thresholding, K-means clustering, support vector machine (SVM), Random Forest, U-net etc. Once the different components of the tumor have been segmented, multiple features such as perfusion features, texture features etc. are extracted. Perfusion features include tracer kinetic and hemodynamic parameters. Texture features include the histogram of oriented gradients (HOG), local binary patterns (LBP) and the Gabor Wavelet Transform (GWT). These features act as input to tumor classifier models. Feature selection is performed before the features are fed to the classifier; it removes redundant features and hence improves the classifier's decisions.
Researchers have worked, and are still working, on questions such as: can invasive biopsies be replaced, and can sampling errors be reduced? Clinicians, scientists and engineers from cross-disciplinary areas are working in this direction. MRI investigation is considered to be a non-invasive procedure [7-10].
Quantitative features extracted from MRI have been investigated either directly or with the help of machine learning, deep learning, transfer learning or other procedures to identify the type and grade of glioma. The positive results obtained with machine learning and deep learning motivate researchers to investigate further and improve the results in this direction. Some of the challenges mentioned by researchers in their findings are: limited data sets, class imbalance, subjectivity involved in tumor segmentation, cost and time etc.
In the proposed work, a hypothesis is presented which tries to differentiate low grade gliomas from high grade gliomas using texture features computed from conventional MRI sequences. The AdaBoost algorithm was used to perform this classification task. The Pearson correlation coefficient was used to select the features that contribute to the classification task. Finally, 10-fold cross validation was used to validate the trained model. The developed model was tested for out-of-sample error to record the accuracy.
II. LITERATURE SURVEY
This section describes the available literature in the area of glioma classification, i.e. HGG vs. LGG. Fusun et al. in their work used advanced sequences, i.e. diffusion tensor, perfusion etc., along with conventional imaging to differentiate between LGG and HGG [7]. As per the reported findings, ADC values are higher in low grade gliomas than in high grade gliomas. A significant difference in lipid peaks between low grade and high grade gliomas has also been reported using MR spectroscopy.
The authors of [8] used conventional MRI sequences along with advanced MRI sequences such as diffusion weighted imaging to classify gliomas into LGG vs. HGG. A total of 25 different models were developed, validated and tested, with the leave-one-out cross validation (LOOCV) approach applied for validation. Among all the developed models, the support vector machine performed best, with a reported accuracy of 94.5%. Shoaib et al. carried out a similar task and reported an accuracy of 80.65% [9].
A. Vamvakas et al. reported a classification accuracy of 95.5% [10]. They used conventional MRI sequences, advanced sequences and spectroscopy findings in their study, which was based on a total of 40 patients. A support vector machine was used to develop the classification model. The LOOCV (Leave One Out Cross Validation) approach was used to validate the proposed model.
Y. Yang et al. used transfer learning to carry out the classification task, i.e. LGG vs. HGG. They used conventional MRI sequences, and their study was based on a total of 113 glioma patients. They developed two different models based on AlexNet and GoogLeNet. Five-fold cross validation was used to validate the developed models. As per their findings, GoogLeNet performs better than AlexNet. Their reported accuracy was 86.7% [11].
W. Chen et al. investigated the role of Radiomics in the classification task, i.e. LGG vs. HGG [12]. They conducted their study using the BraTS 2015 data set. For feature selection, the authors used the SVM-RFE approach. The classification model was based on the extreme gradient boosting (XGBoost) algorithm.
Zurfi et al. [13] used 3D texture analysis with machine learning for glioma grading. Texture features were used in the classification task in order to differentiate low grade gliomas from high grade gliomas. ANOVA was used to remove redundant features. The authors trained and validated their model using the BraTS 2013 database. An ensemble approach with a decision tree as the base classifier was used to develop the final classification model. With the developed model, the authors achieved an accuracy of 0.96.
The authors of [14, 15] used Radiomics features as input to machine learning algorithms to carry out the glioma grading task.
The authors of [15] proposed a model named ‘Intensity-Volume-LBP-PCA-KNN’ to differentiate low grade gliomas from high grade gliomas. They used the BraTS 2015 dataset for their study. Principal component analysis was used to reduce the data dimension, and KNN was used to develop the classification model. Their model achieved a classification accuracy of 87.59%.
The authors of [23] proposed a classification model based on a CNN, i.e. a Convolutional Neural Network. The BraTS 2019 data set was used to develop the model, and the Cancer Imaging Archive data set was used for validation. The reported area under the curve was 0.93.
The authors of [24] developed a classifier to differentiate between LGG and HGG. The model was based on a transfer learning approach, using a pretrained V3 CNN model. The proposed model was trained and tested using the BraTS 2017 and 2018 datasets and achieved a classification accuracy of 0.92.
The authors of [25] developed a classifier to differentiate between LGG and HGG. The model was based on a deep convolutional neural network. Feature selection was performed with the help of power LDP and statistical features. The proposed classifier achieved a classification accuracy of 0.96.
The authors of [26] proposed a model based on transfer learning to differentiate low grade gliomas from high grade gliomas. They used the Figshare dataset for their study. A pretrained model named GoogLeNet was trained from scratch. 5-fold cross validation was used to validate the developed model, which achieved a classification accuracy of 98%.
The authors of [27] proposed a model based on a CNN architecture to differentiate low grade gliomas from high grade gliomas. They used the BraTS 2018 dataset for their study. A 3D multiscale convolutional network architecture was used to develop the classification model, and feature selection was done before developing it. Their model achieved a classification accuracy of 89.47%.
The authors of [28] proposed a model based on support vector machine-recursive feature elimination (SVM-RFE) to differentiate low grade gliomas from high grade gliomas. Their data set includes 43 glioma patients. A support vector machine was used to develop the classification model, with feature selection done beforehand using the SVM-RFE approach. Their model achieved a classification accuracy of 93%.
The authors of [29] proposed a model based on extreme gradient boosting to differentiate low grade gliomas from high grade gliomas. Their data set includes 662 glioma patients, of which 410 cases belong to low grade gliomas and 252 cases belong to high grade gliomas. The XGBoost approach was used to develop the classification model, with feature selection done beforehand using the Pearson correlation approach. Their model achieved a classification accuracy of 83%.
In study [30], the authors discussed recent advances in the field of glioma grade classification.
In a similar study [31], the authors conducted a survey covering segmentation approaches followed by classification approaches.
In study [32], the authors conducted a survey of selected methods developed to segment the tumor components and then classify them.
The authors of [33] proposed a model based on a CNN architecture to differentiate low grade gliomas from high grade gliomas. Their data set includes 110 glioma patients, all of which are low grade glioma cases, i.e. Grade II and Grade III. The VGG 16 approach was used to develop the classification model. The authors used T1, T1-post contrast and FLAIR images to develop their model, which achieved a classification accuracy of 83%.
The authors of [34] proposed a model based on transfer learning to differentiate low grade gliomas from high grade gliomas. They used the BraTS 2018 dataset for their study. A ConvNets approach was used to develop the classification model, using T1, T1-post contrast and FLAIR images. Feature selection was done before developing the classification model with the help of the LOPO (Leave One Patient Out) approach. Their model achieved a classification accuracy of 95%.
The authors of [35] proposed a model based on SVM to differentiate low grade gliomas from high grade gliomas. They used a dataset which includes 112 glioma cases, containing 52 cases which belong to low grade gliomas, i.e. Grade II and Grade III, and 22 cases which belong to high grade gliomas. The SVM approach was used to develop the classification model, using T1, T1-post contrast and FLAIR images. Feature selection was done before developing the classification model with the help of minimum Redundancy Maximum Relevance (mRMR). Their model achieved a classification accuracy of 82.5%.
Several authors have worked in this area, and research is still ongoing. The major challenges mentioned by these authors in their manuscripts were:
III. MRI PRINCIPLE
The human body is made up of billions of protons, or atomic nuclei, which are constituents of water and other organic molecules. These protons possess spin (angular momentum), due to which they behave like small magnets. Similar to a compass needle, in the presence of an externally applied magnetic field these protons tend to align along the field direction (parallel or antiparallel). In the absence of an external field, the protons have random orientations. Once the human body is placed in a strong external magnetic field, the spinning protons align themselves in parallel or antiparallel directions. This alignment in the presence of the applied magnetic field creates a net magnetic moment inside the human body.
Let us denote the applied external magnetic field by B0 and the net equilibrium magnetic moment by M. The frequency of precession can be calculated with the Larmor equation, which is given by:
$$\omega_0 = \gamma B_0 \qquad (1)$$
where γ is the gyromagnetic ratio of the proton and B0 is the strength of the applied external magnetic field. For example, the gyromagnetic ratio of 1H (expressed as γ/2π) is 42.575 MHz/T, so if B0 is equal to 1.5 T, the Larmor equation gives a frequency of precession of 42.575 x 1.5 = 63.8625 MHz. This frequency lies within the radio frequency range of the electromagnetic spectrum. If the applied external magnetic field B0 is in the Z direction, then the net M is given by
$$\mathbf{M} = \sum_{n=1}^{N} \boldsymbol{\mu}_n \qquad (2)$$
where μn is the magnetic moment vector of the nth spin and N denotes the total number of spins.
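The Larmor calculation above can be reproduced with a few lines of Python. This is only a minimal sketch: the function name is hypothetical, the 42.575 MHz/T constant is the value quoted in the text, and the 3.0 T field strength is an extra illustrative value that does not come from the study.

```python
# Gyromagnetic ratio of 1H divided by 2*pi, in MHz per tesla (value quoted in the text).
GAMMA_BAR_1H_MHZ_PER_T = 42.575


def larmor_frequency_mhz(b0_tesla, gamma_bar_mhz_per_t=GAMMA_BAR_1H_MHZ_PER_T):
    """Return the precession frequency f0 = (gamma / 2*pi) * B0 in MHz."""
    return gamma_bar_mhz_per_t * b0_tesla


# 1.5 T is the field strength used in the text; 3.0 T is an illustrative extra.
for b0 in (1.5, 3.0):
    print(f"B0 = {b0} T -> f0 = {larmor_frequency_mhz(b0):.4f} MHz")
```

Because the relationship is linear, doubling the field strength doubles the precession frequency.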
For detecting the MR signal, transverse magnetization is created with the help of a radio frequency (RF) pulse. To understand this concept, let us assume that the human body is in an MRI scanner with a field strength of 1.5 T. From the Larmor equation above, the frequency of precession is 63.8625 MHz. If an RF pulse at this Larmor frequency is applied, protons throughout the whole body will respond. The idea is to apply the RF pulse only at the slice of interest. For the rest of the body, a gradient is added to the main magnetic field, which results in a slight increase or reduction in field strength. Now, if an RF pulse of 63.8625 MHz is applied, it excites only those protons whose frequency of precession is 63.8625 MHz. This is where the R of MRI comes from, i.e. Resonance. The RF pulse is applied to create transverse magnetization. When the RF pulse is turned off, a signal is detected which decays very fast (free induction decay, FID). The two factors responsible for this decay are known as spin-spin relaxation and spin-lattice relaxation. Spin-spin relaxation is also called T2 decay and spin-lattice relaxation is called T1 recovery. T2 decay arises from the dephasing of millions of protons. The T1 time of a tissue determines the time the spinning protons take to realign with the applied magnetic field B0. The receiver coils measure the variations in transverse magnetization (FID).
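The slice-selection idea described above can be sketched numerically: with a linear gradient G added along z, the precession frequency at position z is f(z) = (γ/2π)(B0 + G·z), so an RF pulse of limited bandwidth excites only a thin slab. The gradient strength (10 mT/m) and the RF bandwidth (1 kHz) below are illustrative assumptions, not values from the study, and the function names are hypothetical.

```python
# Gyromagnetic ratio of 1H divided by 2*pi, in Hz per tesla (value quoted in the text).
GAMMA_BAR_HZ_PER_T = 42.575e6


def precession_frequency_hz(z_m, b0_t=1.5, gradient_t_per_m=0.010):
    """Precession frequency at position z when a linear gradient is added to B0."""
    return GAMMA_BAR_HZ_PER_T * (b0_t + gradient_t_per_m * z_m)


def slice_thickness_m(rf_bandwidth_hz, gradient_t_per_m=0.010):
    """Thickness of the slab whose frequencies fall inside the RF bandwidth."""
    return rf_bandwidth_hz / (GAMMA_BAR_HZ_PER_T * gradient_t_per_m)


# Frequencies at the isocentre and 10 cm away from it (assumed positions).
for z in (0.0, 0.10):
    print(f"z = {z:.2f} m -> f = {precession_frequency_hz(z) / 1e6:.4f} MHz")

# Slab excited by an assumed 1 kHz RF bandwidth.
print(f"slice thickness = {slice_thickness_m(1000.0) * 1e3:.2f} mm")
```

Protons away from the selected position precess at a different frequency and are therefore not on resonance with the applied pulse.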
A. MRI Pulse Sequences
When an external magnetic field is applied, the spinning protons try to align with it. In order to measure the MR signal, the system needs to be perturbed. This perturbation is achieved with the help of an RF pulse, which creates transverse magnetization. Once the RF pulse is off, the spinning protons return to the direction of the applied magnetic field, i.e. B0, and the signal decays very quickly (FID) due to T2* effects. Measuring this fast decaying signal requires a very fast scanner. The signal is also susceptible [5] to magnetic field inhomogeneity because it depends on T2*. In order to overcome these challenges, another pulse is applied later to create an echo (commonly referred to as a spin echo).
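The fast FID decay mentioned above is commonly modelled as a mono-exponential with time constant T2*, where 1/T2* = 1/T2 + 1/T2' and T2' accounts for dephasing caused by field inhomogeneity. The short sketch below illustrates that relationship; the T2 and T2' values are illustrative assumptions rather than measurements from the study, and the function names are hypothetical.

```python
import math


def t2_star_ms(t2_ms, t2_prime_ms):
    """Combine tissue T2 and inhomogeneity T2' into the effective T2* (all in ms)."""
    return 1.0 / (1.0 / t2_ms + 1.0 / t2_prime_ms)


def fid_signal(t_ms, s0, t2s_ms):
    """Mono-exponential FID envelope S(t) = S0 * exp(-t / T2*)."""
    return s0 * math.exp(-t_ms / t2s_ms)


# Assumed values: T2 = 100 ms, T2' = 50 ms.
t2s = t2_star_ms(100.0, 50.0)
print(f"T2* = {t2s:.1f} ms")
for t in (0.0, 10.0, 30.0, 100.0):
    print(f"t = {t:5.1f} ms -> S/S0 = {fid_signal(t, 1.0, t2s):.3f}")
```

Since T2* is always shorter than T2, the FID fades faster than the intrinsic T2 decay; the 180-degree refocusing pulse of a spin echo reverses the T2' contribution.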
2. Fast Spin Echo: The spin echo pulse sequence generates images with a good signal-to-noise ratio, but it is slow. Each phase encoding step takes one TR. A typical MR image has at least 256 phase encoding steps. The TR for a T2-W sequence is roughly two to three seconds, which means the entire image will take around 10 minutes. To speed up scanning, the Fast Spin Echo (FSE) approach was developed. In this approach, multiple 180-degree pulses (the number of which is the Echo Train Length, ETL) are applied after an initial 90-degree pulse, each generating an echo. Increasing the ETL decreases the scanning time (but may affect the contrast), because more than one line of k-space is completed per TR.
3. Gradient Recalled Echo Sequence: In this type of pulse sequence, no 180-degree refocusing pulse is required (so T2* dephasing is retained). Spin rephasing is done with the help of gradients. In a GRE sequence, an initial α-degree pulse is applied along with the slice selection gradient and the phase and frequency encoding gradients. The frequency gradient dephases the spins. During the readout phase this frequency gradient is inverted, the spins rephase and an echo is created. The sequence is much faster because there is no waiting for spin rephasing after a refocusing pulse. Since no refocusing pulse is used, T2* effects are present, which results in a faster FID. For this reason short TEs can be used, which in turn allow a shorter TR. In simple words, the pulse sequence can be repeated more quickly to acquire more phase-encoding steps.
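The scan-time argument for conventional versus fast spin echo can be sketched with the usual approximation: acquisition time ≈ TR × (number of phase-encoding steps) / ETL. The TR of 2.5 s and the ETL of 8 below are illustrative assumptions, and the function name is hypothetical.

```python
def scan_time_s(tr_s, n_phase_encodes, etl=1):
    """Approximate acquisition time: one TR fills `etl` lines of k-space."""
    return tr_s * n_phase_encodes / etl


# Conventional spin echo: one k-space line per TR (assumed TR = 2.5 s, 256 lines).
conventional = scan_time_s(2.5, 256)

# Fast spin echo with an assumed echo train length of 8.
fast = scan_time_s(2.5, 256, etl=8)

print(f"conventional spin echo: {conventional / 60:.1f} min")
print(f"fast spin echo (ETL=8): {fast / 60:.1f} min")
```

Under these assumptions the conventional acquisition takes roughly ten minutes, while an echo train of eight brings it down to well under two minutes. The approximation ignores signal averaging and multi-slice interleaving.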
IV. ISSUES EMERGED (GAPS IN PREVIOUS STUDIES/RESEARCH-CONCEPTUAL, METHODOLOGICAL AND THEORETICAL)
The review of the existing literature shows that there is still ample scope for research on this topic. The major issues are as follows:
A. Objective of the Research
The objective of the proposed study is to harness the strength of popular machine learning approaches, i.e. support vector machine, random forest, decision tree etc., in the classification of gliomas into high grade and low grade.
B. Hypothesis
The presented hypothesis is based on the application of machine learning to the classification of gliomas into high grade and low grade.
C. Research Methodology
The BraTS 2018 data set was used to carry out the proposed study. It contains the T1, T1CE, FLAIR, T2 etc. sequences [1-3]. The BraTS data sets contain preprocessed data, i.e. the images are skull stripped, registered etc.
The following figure shows the research methodology which was followed in order to test the proposed hypothesis.
Texture features were extracted from the ROI of enhancing and non-enhancing tumor using a script written in MatLab 2019a. The Pearson correlation coefficient was used to select the contributing features. 10-fold cross validation was used for training and validating the model. Finally, the developed model was tested for out-of-sample error.
D. Importance of Study
Brain tumor cases are on the rise around the world nowadays. Classification of gliomas is important because different gliomas have different treatment strategies, and the grade therefore informs treatment planning. An accurate differentiation will result in better treatment planning and hence will improve the survival rate of patients.
V. PROPOSED WORK
This section explains the proposed work. The BraTS 2018 dataset was used for carrying out the classification task. The dataset contains 210 high grade glioma cases and 75 low grade glioma cases. For every case, the data set contains T1, post contrast T1, T2 and Fluid Attenuated Inversion Recovery (FLAIR) sequences. The data were collected from 19 different centers. The dataset is annotated with four labels:
Label-0: Otherwise
Label-1: Non-enhancing tumor and necrotic region
Label-2: Edema
Label-4: Enhancing tumor
Every sequence in the BraTS 2018 dataset is co-registered and interpolated. Texture features were extracted from the region of interest (ROI) with the help of pyradiomics using Python [16, 17]. Label-1 and Label-4 were combined to form the ROI. A total of 104 features were computed, belonging to 7 different classes, i.e. shape-based (2D), gray level matrices (Co-occurrence, Run Length, Dependence and Size Zone Matrix) and the Neighbouring Gray Tone Difference Matrix. Feature selection was made with the help of the Pearson correlation coefficient method. Features were normalized using the z-score. Finally, 49 features were selected out of the 104 computed features.
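The text does not state exactly how the Pearson coefficient was applied, so the following is only a sketch of one common reading: features are z-score normalized, and a feature is kept only if its absolute correlation with every already-kept feature stays below a threshold. The threshold of 0.9, the toy values and the feature names are all hypothetical and are not taken from the study.

```python
import math


def zscore(column):
    """Standardise one feature column to zero mean and unit variance."""
    n = len(column)
    mean = sum(column) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in column) / n)
    return [(v - mean) / std for v in column]


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length columns."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def select_features(columns, threshold=0.9):
    """Greedy filter: keep a feature only if |r| with each kept feature <= threshold."""
    kept = []
    for name, col in columns.items():
        if all(abs(pearson(col, columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Toy feature table (hypothetical names and values, five cases each).
features = {
    "glcm_contrast": [1.0, 2.0, 3.0, 4.0, 5.0],
    "glcm_contrast_scaled": [2.1, 4.0, 6.2, 8.1, 9.9],  # nearly a multiple of the first
    "glrlm_run_entropy": [5.0, 1.0, 4.0, 2.0, 3.0],
}
normalised = {name: zscore(col) for name, col in features.items()}
selected = select_features(normalised)
print(selected)
```

In this toy table the second feature is almost a rescaled copy of the first and is dropped as redundant, while the weakly correlated third feature is retained. An alternative reading, ranking features by their correlation with the class label, would use the same `pearson` helper.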
The AdaBoost algorithm was used for carrying out the classification task. AdaBoost is an ensemble method that combines several weak learners to form a strong learner [18]. 10-fold cross validation was performed to validate the model. The mean accuracy across the ten folds was calculated for different models obtained by varying the number of estimators and the learning rate (0.001 to 1). A decision tree was used as the base estimator.
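The tuning procedure described above can be sketched with scikit-learn. This is not the authors' code: the data are a synthetic stand-in generated with `make_classification` (285 cases and 49 features to mimic the dataset size), the grid is a small illustrative subset of the ranges mentioned in the text, and the default base estimator of `AdaBoostClassifier` (a depth-1 decision tree) is assumed to be an acceptable proxy for the decision tree used in the study.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic stand-in for the real feature table: 285 cases x 49 features,
# with roughly 74% of cases in class 1 (assumed to represent HGG).
X, y = make_classification(
    n_samples=285,
    n_features=49,
    n_informative=10,
    weights=[0.26, 0.74],
    random_state=0,
)

# 10-fold stratified cross validation keeps the class ratio in every fold.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

# Small illustrative grid over the two tuned hyperparameters.
results = {}
for n_estimators in (10, 50, 150):
    for learning_rate in (0.01, 0.1, 1.0):
        clf = AdaBoostClassifier(
            n_estimators=n_estimators,
            learning_rate=learning_rate,
            random_state=0,
        )
        scores = cross_val_score(clf, X, y, cv=cv, scoring="accuracy")
        results[(n_estimators, learning_rate)] = scores.mean()

best = max(results, key=results.get)
print("best (n_estimators, learning_rate):", best)
print("mean 10-fold accuracy:", round(results[best], 3))
```

With imbalanced classes such as 210 vs. 75 cases, accuracy alone can be misleading, so a real evaluation would typically also report sensitivity and specificity on held-out data.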
VI. RESULT AND DISCUSSION
The learning rate was varied from 0.001 to 1 and the number of estimators was varied from 10 to 400. Increasing the number of estimators further showed no improvement in accuracy and is hence not shown in figure-4. From figure-4, it was observed that the model performs best when the number of estimators is equal to 150. From figure-5, it was observed that the model performs well when the learning rate is equal to 0.1. The final model was developed with these hyperparameters, i.e. a learning rate of 0.1 and 150 estimators. The developed model achieves an accuracy of 86.3% in classifying tumors into high grade vs. low grade.
The whole thesis was organized in four major sections: introduction, related work, proposed work and simulation results. The introduction briefly explains the need for brain tumor diagnosis. It also explains the limitations of the biopsy procedure and hence establishes the need for precise glioma classification. Section II, i.e. related work, discusses some of the recent work carried out by clinicians and scientists in the area of glioma classification. In section III the proposed work has been discussed. AdaBoost was used as the underlying concept to develop the model for the designated task. The hyperparameters were optimized and cross validated. Finally, the model was developed with these optimized hyperparameters, keeping the base estimator the same. In the result section, only the accuracy of the final optimized model was reported, along with sensitivity and specificity. The results show that the model achieved reasonable accuracy in classifying high grade glioma from low grade glioma. This concludes the presented thesis.
[1] B. H. Menze, A. Jakab, S. Bauer, J. Kalpathy-Cramer, K. Farahani, J. Kirby, et al., "The Multimodal Brain Tumor Image Segmentation Benchmark (BRATS)", IEEE Transactions on Medical Imaging 34(10), 1993-2024 (2015) DOI: 10.1109/TMI.2014.2377694.
[2] S. Bakas, H. Akbari, A. Sotiras, M. Bilello, M. Rozycki, J.S. Kirby, et al., "Advancing The Cancer Genome Atlas glioma MRI collections with expert segmentation labels and radiomic features", Nature Scientific Data, 4:170117 (2017) DOI: 10.1038/sdata.2017.117.
[3] S. Bakas, M. Reyes, A. Jakab, S. Bauer, M. Rempfler, A. Crimi, et al., "Identifying the Best Machine Learning Algorithms for Brain Tumor Segmentation, Progression Assessment, and Overall Survival Prediction in the BRATS Challenge", arXiv preprint arXiv:1811.02629 (2018).
[4] "Gliomas | Department of Neurology." [Online]. Available: https://www.columbianeurology.org/neurology/staywell/document.php?id=42006. [Accessed: 25-Jul-2020].
[5] "All About Adult Gliomas | OncoLink." [Online]. Available: https://www.oncolink.org/cancers/brain-tumors/all-about-adult-gliomas. [Accessed: 25-Jul-2020].
Copyright © 2022 Sanjeet Pandey, Er. Paritosh Tripathi , Er. Vineet Kumar Singh, Vishal Sharma. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Paper Id : IJRASET39194
Publish Date : 2021-12-01
ISSN : 2321-9653
Publisher Name : IJRASET