Comprehensive Survey on Disease Detection in Retinal Imaging through Deep Learning Techniques

Authors: Nikhil Deore, Vedant Inamdar, Anurag Nimkar, Gaurav Jagdale, Prof. Nikita Kolambe

DOI Link: https://doi.org/10.22214/ijraset.2024.58482

Abstract

Millions of people around the world suffer from visual impairment and blindness that could have been prevented or diagnosed earlier. To address this problem, we propose an approach for detecting multiple eye diseases from retinal images that combines several deep learning models through ensemble learning. We present a detailed examination of deep learning models for multi-disease classification using the Retinal Fundus Multi-Disease Image Dataset (RFMiD). The objective is to assess the performance of these models and identify the most effective one for accurate and reliable disease diagnosis. We begin by preprocessing the RFMiD dataset to enhance image quality and extract relevant features, ensuring a robust input for our models. Each model then undergoes training and fine-tuning to optimize its parameters for disease classification. The evaluation metrics are accuracy, precision, recall, and F1 score, which together give a comprehensive picture of model performance. We discuss the strengths and weaknesses of each model and the factors that influence their performance. The insights gained from this comparative analysis can guide researchers and practitioners in selecting an appropriate model for retinal disease diagnosis based on specific requirements and constraints. In addition, we explore potential avenues for future research, including ensemble methods and hybrid architectures, to further improve classification accuracy. The paper also discusses the practical implications of our findings and their significance for medical image analysis, particularly retinal disease diagnosis.

I. INTRODUCTION

An estimated 2.2 billion people worldwide live with impaired vision, predominantly as a result of ophthalmic diseases. Early diagnosis is widely acknowledged as essential to alleviating patient suffering and preventing further vision deterioration. In current practice, trained ophthalmologists screen patients manually by examining fundus images for retinal abnormalities. This approach depends on specialized human resources, which can be in short supply. Automatic detection of retinal abnormalities has therefore become a significant research topic. Previous studies have used image processing techniques to extract features from fundus images, but manually defined features may overlook crucial information. Recent deep learning models have proved better at extracting hidden features from high-dimensional fundus images. Despite notable achievements, these studies often focus on a single type of data for specific retinal diseases, which limits their applicability in clinical settings. There is therefore a clear need for an effective multitask disease diagnosis network. By leveraging various types of data and employing an ensemble of deep learning techniques, our framework aims to identify and predict multiple ophthalmic diseases simultaneously, providing a more comprehensive solution for early disease detection.

II. METHODS

A. Data Preprocessing

The first stage of the code is data preprocessing, a prerequisite for model training. TensorFlow's image_dataset_from_directory utility is used to load the images and resize them uniformly to 224 x 224 pixels.

The dataset is then inspected to understand its composition. After this inspection, pixel values are normalized by scaling them to the interval [0, 1] so that all inputs share a uniform range. The dataset is partitioned into training, validation, and test subsets, which allows the model to be tuned on the validation data and evaluated on test data it has not seen during training. This partitioning underpins the training regimen used in the subsequent stages of the image classification model.
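
The normalization and partitioning steps can be illustrated with a minimal, library-free sketch. The helper names and the 70/15/15 split fractions are assumptions made for illustration; the paper does not state the proportions it used. In a Keras pipeline the same scaling is typically done with a Rescaling(1/255) layer applied to the dataset returned by image_dataset_from_directory.

```python
import random


def normalize_pixels(pixels):
    """Scale 8-bit intensities (0..255) into the interval [0, 1]."""
    return [p / 255.0 for p in pixels]


def split_indices(n_samples, train_frac=0.7, val_frac=0.15, seed=42):
    """Shuffle sample indices and cut them into train/val/test lists.

    The fractions are hypothetical defaults, not values from the paper.
    Whatever is left after the train and validation cuts becomes the test set.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    n_train = round(n_samples * train_frac)
    n_val = round(n_samples * val_frac)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


if __name__ == "__main__":
    train, val, test = split_indices(100)
    print(len(train), len(val), len(test))
    print(normalize_pixels([0, 128, 255]))
```

Keeping the three index lists disjoint is the property that matters: any image that appears in both the training and test subsets would inflate the reported test metrics.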

B. Feature Extraction

Once the data are prepared, the focus shifts to feature extraction, which allows the model to capture the patterns present in the images. Convolutional layers are the core of the model: they learn hierarchical features from the input images. Max-pooling layers complement them by downsampling the feature maps spatially. The Flatten layer then converts the extracted 2D feature maps into a 1D vector that is passed to the dense layers. This progression enables the model to discern fine structures and patterns, which is essential for accurate image classification. Together, data preprocessing and feature extraction form the foundation of the CNN.
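
To make the effect of pooling and flattening concrete, the sketch below traces the feature-map shape through the convolutional stack described in Section C (three 64-filter sets followed by one 128-filter set, each ending in max pooling). It assumes 'same' padding, as stated in the paper, and a 2 x 2 pooling window with stride 2, which is the Keras default but is not stated in the paper. The function name is hypothetical.

```python
def trace_feature_map(height, width, blocks):
    """Return (height, width, channels) after a stack of conv/pool blocks.

    blocks is a list of (n_convs, filters) tuples. With 'same' padding a
    convolution keeps height and width and only changes the channel count;
    each block is assumed to end with a 2 x 2 max-pool that halves both
    spatial dimensions.
    """
    channels = None
    for _n_convs, filters in blocks:
        channels = filters   # output channels of the last conv in the block
        height //= 2         # 2 x 2 max pooling, stride 2
        width //= 2
    return height, width, channels


if __name__ == "__main__":
    # Three single-conv sets with 64 filters, then one set of three 128-filter convs.
    blocks = [(1, 64), (1, 64), (1, 64), (3, 128)]
    h, w, c = trace_feature_map(224, 224, blocks)
    print((h, w, c))      # shape entering the Flatten layer
    print(h * w * c)      # length of the flattened 1D vector
```

Under these assumptions the 224 x 224 input is reduced to a 14 x 14 x 128 volume, so the Flatten layer hands a vector of 25,088 values to the first dense layer.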

C. Deep Learning Models

CNN: A Convolutional Neural Network (CNN) is built with the Keras API, using TensorFlow as the backend, to classify retinal images associated with eye diseases. The input layer has shape (224, 224, 3), corresponding to the image dimensions and the RGB color channels. It is followed by three sets of convolutional layers, each using ReLU activation and 'same' padding [9]. Each of these three sets consists of a Conv2D layer with 64 filters, and a MaxPooling layer follows each set to reduce the spatial dimensions. A fourth set consists of three Conv2D layers with 128 filters each and is followed by one more MaxPooling layer. A Flatten layer then transforms the 2D feature maps into a 1D vector. The architecture ends with two Dense layers: the first has 256 neurons with ReLU activation, and the second has 4 neurons with softmax activation for multiclass classification (Cataract, Diabetic retinopathy, Glaucoma, Normal). The model is compiled with sparse categorical cross-entropy loss, the Adam optimizer, and accuracy as the evaluation metric. A summary of the architecture is printed, and the model is visualized and saved to 'simple-cnn.png'. The model is then trained for 10 epochs on the training dataset, with validation on a separate validation set. This CNN is designed to learn to distinguish between different eye diseases based on the characteristics of the input images [10].
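
The architecture described above could be written in Keras roughly as follows. This is a sketch under stated assumptions rather than the authors' code: the 3 x 3 kernel size and the default 2 x 2 pooling window are not given in the paper, and the function name build_simple_cnn is hypothetical.

```python
from tensorflow.keras import layers, models


def build_simple_cnn(input_shape=(224, 224, 3), num_classes=4, kernel_size=3):
    """Build the CNN described in the text (kernel_size is an assumption)."""
    model = models.Sequential(name="simple_cnn")
    model.add(layers.Input(shape=input_shape))

    # Three sets: one 64-filter convolution followed by max pooling.
    for _ in range(3):
        model.add(layers.Conv2D(64, kernel_size, padding="same", activation="relu"))
        model.add(layers.MaxPooling2D())

    # Fourth set: three 128-filter convolutions, then one max pooling.
    for _ in range(3):
        model.add(layers.Conv2D(128, kernel_size, padding="same", activation="relu"))
    model.add(layers.MaxPooling2D())

    # Classifier head: flatten, 256-unit hidden layer, 4-way softmax.
    model.add(layers.Flatten())
    model.add(layers.Dense(256, activation="relu"))
    model.add(layers.Dense(num_classes, activation="softmax"))

    model.compile(
        loss="sparse_categorical_crossentropy",
        optimizer="adam",
        metrics=["accuracy"],
    )
    return model


if __name__ == "__main__":
    model = build_simple_cnn()
    model.summary()
    # Training as described in the text; train_ds and val_ds are hypothetical
    # tf.data datasets yielding (image, integer_label) batches.
    # model.fit(train_ds, validation_data=val_ds, epochs=10)
```

Sparse categorical cross-entropy expects integer class labels (0 to 3) rather than one-hot vectors, which matches the integer labels that image_dataset_from_directory produces by default.
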
EfficientNetB7: EfficientNet-B7 belongs to the EfficientNet family, whose models are obtained by jointly scaling network depth, width, and input resolution. EfficientNet is a convolutional neural network architecture and scaling method that uniformly scales the depth, width, and resolution dimensions using a compound coefficient [1]. Unlike conventional practice, which scales these dimensions arbitrarily, the EfficientNet method scales network width, depth, and resolution together with a set of fixed scaling coefficients. Images were classified with the EfficientNet-B7 algorithm under different hyperparameter settings. When designing a model, the first choice of hyperparameters often does not give the best result [2]. The performance of the model was therefore measured while varying the hyperparameters, and the most appropriate set for the dataset was selected. Tuning the learning rate of the optimization algorithm plays an important role in training the model. A modified Adam optimizer was used as the learning algorithm. EfficientNet-B7 generally consists of one stem, seven blocks, and one final layer [3]. The seven blocks usually include modules 1, 2, and 3. All modules consist of convolutional layers, pooling layers, and smoothing layers [4].
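
The compound scaling rule can be worked through numerically. In [1], depth, width, and resolution are scaled as alpha^phi, beta^phi, and gamma^phi, with alpha = 1.2, beta = 1.1, and gamma = 1.15 found by a grid search on the baseline network under the constraint alpha * beta^2 * gamma^2 ≈ 2. The sketch below only evaluates these formulas; the function names are hypothetical, and the multipliers it prints for a given phi are illustrative rather than the exact settings of EfficientNet-B7.

```python
# Base coefficients reported by Tan and Le [1] for the EfficientNet baseline.
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15


def compound_scale(phi):
    """Return depth, width and resolution multipliers for coefficient phi."""
    return {
        "depth": ALPHA ** phi,
        "width": BETA ** phi,
        "resolution": GAMMA ** phi,
    }


def flops_factor(phi):
    """Approximate growth in FLOPs: (alpha * beta^2 * gamma^2) ** phi."""
    return (ALPHA * BETA ** 2 * GAMMA ** 2) ** phi


if __name__ == "__main__":
    for phi in range(4):
        scale = compound_scale(phi)
        print(phi, {k: round(v, 3) for k, v in scale.items()},
              round(flops_factor(phi), 2))
```

Because alpha * beta^2 * gamma^2 is close to 2, each unit increase in phi roughly doubles the computational cost, which is what lets a single coefficient trade accuracy against compute.
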
DenseNet201: DenseNet201 uses dense connectivity: within a dense block, each layer receives the feature maps of all preceding layers as input. This connectivity encourages feature reuse and improves the flow of information through the network. DenseNet201 also uses bottleneck layers that combine 1x1 and 3x3 convolutions. The 1x1 convolution reduces the number of input feature maps before the 3x3 convolution, which makes computation more efficient while preserving the model's ability to represent complex features [5]. The model is organized into dense blocks. Within each dense block, the outputs of all preceding layers are concatenated, producing a rich combination of features that helps the model pick up nuanced patterns in the data [5][6]. The final layers of DenseNet201 apply global average pooling, which takes the average of each feature map across its spatial dimensions. This condensed representation is the input to the last fully connected layer, which simplifies the classification stage. In our experiments, DenseNet201 outperformed the other models. Its dense connectivity and ability to learn intricate hierarchical representations proved instrumental in accurately classifying various diseases within the RFMiD dataset [6].
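
The effect of concatenation and of global average pooling can be shown with simple bookkeeping. The sketch below assumes the standard DenseNet-201 configuration (dense blocks of 6, 12, 48, and 32 layers, growth rate 32, 64 initial feature maps, and transition layers that halve the channel count); these values come from the DenseNet design, not from this paper, and the function names are hypothetical.

```python
def densenet_channels(block_config=(6, 12, 48, 32), growth_rate=32,
                      init_channels=64, compression=0.5):
    """Return the channel count at the end of each dense block.

    Every layer in a block adds growth_rate feature maps to the concatenated
    input; a transition layer between blocks compresses the channel count.
    """
    channels = init_channels
    trace = []
    for i, n_layers in enumerate(block_config):
        channels += n_layers * growth_rate      # concatenation inside the block
        trace.append(channels)
        if i < len(block_config) - 1:           # no transition after last block
            channels = int(channels * compression)
    return trace


def global_average_pool(feature_maps):
    """Average each 2D feature map (list of rows) down to a single value."""
    return [
        sum(sum(row) for row in fmap) / (len(fmap) * len(fmap[0]))
        for fmap in feature_maps
    ]


if __name__ == "__main__":
    print(densenet_channels())
    print(global_average_pool([[[1, 2], [3, 4]], [[0, 0], [0, 8]]]))
```

Under these assumptions the last dense block ends with 1,920 feature maps, so global average pooling yields a 1,920-element vector for the final fully connected layer regardless of the spatial size of the maps.
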
VGG19: VGG19 is a well-established model for image classification and is used here for eye disease classification and prediction. With weights pretrained on the ImageNet dataset, the VGG architecture captures intricate visual features, which makes it suitable for the analysis of medical images. Eye disease classification requires sensitivity to subtle visual cues that indicate different conditions [7], and the ability of VGG to recognize diverse and complex image patterns serves this purpose. The top layers are excluded and the learned features of the lower layers are preserved, so that the model's existing knowledge is retained while the network is adapted to the new task. The VGG19 layers are frozen so that the model continues to rely on the general visual features learned during ImageNet training. This transfer learning approach helps the model generalize to eye disease images and contributes to higher classification accuracy. In the constructed architecture, the frozen VGG base is extended with layers tailored to the eye disease classification task: a Flatten layer, a Dropout layer for regularization, and Dense layers for feature aggregation and classification. The model is then compiled with the Adam optimizer and sparse categorical cross-entropy loss. Using VGG in this way gives researchers and practitioners a robust framework that combines pretrained visual features with task-specific adaptation, supporting advances in ophthalmic diagnostics and patient care [8].
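
A transfer learning setup of the kind described for VGG19 might look like the following Keras sketch. The dropout rate, the size of the hidden Dense layer, and the function name are assumptions for illustration; the paper does not specify them. Passing weights="imagenet" downloads the pretrained weights, while weights=None builds the same structure with random initialization.

```python
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG19


def build_vgg19_classifier(num_classes=4, weights="imagenet",
                           dropout_rate=0.5, hidden_units=256,
                           input_shape=(224, 224, 3)):
    """Frozen VGG19 base with a small trainable classification head.

    dropout_rate and hidden_units are hypothetical values.
    """
    # include_top=False drops the original ImageNet classifier layers.
    base = VGG19(include_top=False, weights=weights, input_shape=input_shape)
    base.trainable = False  # freeze all convolutional layers

    inputs = layers.Input(shape=input_shape)
    x = base(inputs, training=False)
    x = layers.Flatten()(x)
    x = layers.Dropout(dropout_rate)(x)
    x = layers.Dense(hidden_units, activation="relu")(x)
    outputs = layers.Dense(num_classes, activation="softmax")(x)

    model = models.Model(inputs, outputs, name="vgg19_transfer")
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model


if __name__ == "__main__":
    model = build_vgg19_classifier(weights=None)  # skip the weight download
    model.summary()
```

With the base frozen, only the two Dense layers are updated during training, which keeps the number of trainable parameters small relative to fine-tuning the whole network.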

Conclusion

The deep learning methods surveyed here have been shown to be effective. In medical practice, such models provide insights that doctors can use to diagnose specific diseases, and they contribute to the diagnosis and prognosis of ophthalmic patients. As the stability and efficiency of deep learning algorithms continue to improve, their impact deserves wider recognition, which opens the door to applications in other important therapeutic areas. Our approach, which combines a set of deep learning models, shows good performance in detecting various diseases in retinal fundus images.

References

[1] M. Tan and Q. V. Le, “EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks,” 36th Int. Conf. Mach. Learn. ICML 2019, vol. 2019-June, pp. 10691–10700, May 2019, Accessed: Feb. 27, 2021.
[2] B. Bulut, V. Kalın, B. B. Güneş, and R. Khazhin, “Deep Learning Approach For Detection Of Retinal Abnormalities Based On Color Fundus Images,” in 2020 Innovations in Intelligent Systems and Applications Conference (ASYU), Oct. 2020, pp. 1–6, doi: 10.1109/ASYU50717.2020.9259870.
[3] M. Tan and Q. Le, “EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks,” in International Conference on Machine Learning, Los Angeles, CA: PMLR, 2019, pp. 6105–14.
[4] S. Zhu, B. Lu, C. Wang, M. Wu, B. Zheng, Q. Jiang, R. Wei, Q. Cao, and W. Yang, “Screening of Common Retinal Diseases Using Six-Category Models Based on EfficientNet,” Front Med (Lausanne), vol. 9, 808402, Feb. 23, 2022, doi: 10.3389/fmed.2022.808402. PMID: 35280876; PMCID: PMC8904395.
[5] N. Deepa and D. T, “E-TLCNN Classification using DenseNet on Various Features of Hypertensive Retinopathy (HR) for Predicting the Accuracy,” 2021 5th International Conference on Intelligent Computing and Control Systems (ICICCS), Madurai, India, 2021, pp. 1648-1652, doi: 10.1109/ICICCS51141.2021.9432255.
[6] B. Goutam, M. F. Hashmi, Z. W. Geem, and N. D. Bokde, “A Comprehensive Review of Deep Learning Strategies in Retinal Disease Diagnosis Using Fundus Images,” in IEEE Access, vol. 10, pp. 57796-57823, 2022, doi: 10.1109/ACCESS.2022.3178372.
[7] Amit Choudhary, Savita Ahlawat, Shabana Urooj, Nitish Pathak, Aimé Lay-Ekuakille, and Neelam Sharma, “A Deep Learning-Based Framework for Retinal Disease Classification,” Healthcare (Multidisciplinary Digital Publishing Institute), vol. 11, no. 2, doi: 10.3390/healthcare11020212. Accepted: 29 December 2022, Published: 10 January 2023.
[8] S. N. Shivappriya, H. Rajaguru, M. Ramya, U. Asiyabegum, and D. Prasanth, “Disease Prediction based on Retinal Images,” 2021 Smart Technologies, Communication and Robotics (STCR), Sathyamangalam, India, 2021, pp. 1-6, doi: 10.1109/STCR51658.2021.9588829.
[9] A. Nawaz, T. Ali, G. Mustafa, M. Babar, and B. Qureshi, “Multi-Class Retinal Diseases Detection Using Deep CNN With Minimal Memory Consumption,” in IEEE Access, vol. 11, pp. 56170-56180, 2023, doi: 10.1109/ACCESS.2023.3281859.
[10] Zhao et al., “Multi-task Learning Based on Multi-type Dataset for Retinal Abnormality Detection,” 2021 IEEE International Conference on Digital Health (ICDH), Chicago, IL, USA, 2021, pp. 160-165, doi: 10.1109/ICDH52753.2021.00029.

Copyright

Copyright © 2024 Nikhil Deore, Vedant Inamdar, Anurag Nimkar, Gaurav Jagdale, Prof. Nikita Kolambe. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Paper Id : IJRASET58482

Publish Date : 2024-02-18

ISSN : 2321-9653

Publisher Name : IJRASET