A Review: Deep Learning-Based Pneumonia Detection and Classification

Authors: Ms. Puja Thakur, Kuldeep, Pranav Singh Soam, Pratibha Rawat, Ridam Agarwal

DOI Link: https://doi.org/10.22214/ijraset.2025.66940

Abstract

Pneumonia is aggravated by environmental pollution and particularly affects children, the elderly, and patients with weakened immune systems. Accurate and timely diagnosis is very important because it reduces the risk of the complications that pneumonia can cause. This paper addresses these issues using CNN architectures. Pneumonia can be classified into three types, namely bacterial, viral, and fungal pneumonia; the task of this work is to detect these types automatically from chest X-ray images. Over the last decade, Convolutional Neural Networks (CNNs), a deep learning technique, have brought significant changes to the classification of pneumonia in chest X-ray images. The model, which was trained on a dataset of 20,000 images of size 224 x 224 pixels with a batch size of 32, achieved an accuracy of 95%. Two main limitations of existing models are high computational cost and logistical constraints. Incorporating advanced techniques such as transfer learning, pre-trained models, and deep convolutional neural networks helps to overcome these limitations. Such AI-supported medical systems are effective not only in improving early and accurate diagnosis but also in improving the availability of health care in developing countries.

I. INTRODUCTION

Pneumonia has caused the deaths of up to eight hundred and fifty thousand children under five in a single year and has claimed the lives of a large number of children, women, and elderly people with poor immunity around the world.

Pneumonia is an infectious disease that can be acquired in a number of settings. For example, ventilated patients can develop ventilator-associated pneumonia (VAP), and hospitalized patients can develop hospital-acquired pneumonia (HAP); however, most pneumonia patients suffer from community-acquired pneumonia (CAP). The antibiotic-resistant microorganisms present in HAP and VAP aggravate the complications related to this disease. Pneumonia is an inflammation of the lung parenchyma caused by bacteria, viruses, fungi, or chemical and physical factors, and it can be severe.

X-rays, CT scans, and MRI are common methods for diagnosing pulmonary diseases, but X-rays are the cheapest and the easiest to use. However, chest X-ray images must be evaluated manually, usually by radiologists. Even in developed countries, pneumonia and other lung diseases such as cancer are frequently misdiagnosed because their radiographic characteristics overlap. Moreover, an X-ray is only one part of the overall evaluation. These limitations make diagnosis difficult, especially in underdeveloped nations where advanced medical care is hard to obtain.

To deal with these difficulties, this study proposes a new AI-based framework for diagnosing lung disease that uses Convolutional Neural Networks (CNNs) to detect and classify pneumonia automatically. The framework performs strongly, with an accuracy of 96.07% and an area under the curve (AUC) of 0.9911, indicating that it is capable of interpreting chest X-ray images. The model also uses transfer learning as a straightforward way to improve its flexibility while sustaining accuracy, even when trained on limited datasets, which makes it a robust and scalable solution. In addition, telemedicine facilities help to extend diagnostic services to remote and low-resource areas where health workers are scarce.

Although the scope for AI in diagnosis is huge, issues such as data inconsistency, confidentiality concerns, and resistance from some medical workers have to be addressed. This research proposes ways to maintain the quality of the collected data, comply with regulations, and keep the model transparent in order to foster trust and hence adoption. This work recommends an effective methodology for reducing the burden on health care systems by enhancing diagnostic techniques and thereby improving patient outcomes across the world. The results reaffirm the importance of AI technologies in transforming the way healthcare services are provided, especially in meeting the need to detect and treat pneumonia rapidly and accurately.

II. RELATED WORKS

[1] The development of a sustainable approach to X-ray diagnosis of COVID-19 faces several significant challenges. One of the primary issues is the limited availability of data. Since the pandemic is relatively new, researchers have access to only small datasets in the public domain. Additionally, very few technologies enable rapid diagnosis and a deeper understanding of the ongoing pandemic, particularly through chest X-rays, which can illustrate the extent of the disease's impact on the human body. After careful consideration of existing technologies, the authors argue that transfer learning presents a viable and practical solution to this research concern. In this approach, the pneumonia images used to train deep neural networks may show infection in one or both lungs caused by bacteria, viruses, or fungi, leading to inflammation of the alveoli, the tiny air sacs within the lungs. This inflammation can cause fluid or pus to accumulate in the respiratory system, making it harder to breathe. Transfer learning focuses on learning to categorize structured viral patterns, validating the findings across different datasets, and ultimately applying these insights to new datasets emerging from similar contexts, specifically infectious diseases with rapid mutation rates. COVID-19, caused by a coronavirus, is one such infection. This solution addresses an important difficulty in deep learning: constructing a credible model from limited and under-analysed data whose characteristics are often unknown.

[2] Deep Convolutional Neural Networks (CNNs) perform considerably better on large datasets than on small ones. Larger datasets provide a wider variety of examples, which allows the model to generalize better while learning complex patterns and features. Despite the considerable global volume of COVID-19 cases, publicly available chest X-ray images remain very limited and scattered across various sources. This lack of a common dataset makes it difficult to develop robust machine learning models capable of accurately detecting COVID-19 from X-ray images. To solve this issue, the authors of the study assembled a large, well-curated database of chest X-ray images from COVID-19 patients. This is necessary because each of the publicly available datasets contains only a few COVID-19 samples, which hinders progress towards reliable detection algorithms. The curated database is combined with available datasets of normal and pneumonia X-rays to develop a more balanced and accurate deep learning classifier for COVID-19 cases. The authors' view is that this approach not only addresses class imbalance but also ensures that the model learns from a variety of cases from around the globe, which in turn improves its effectiveness in the field.

[3] The study examines several advanced deep learning algorithms for diagnosing pneumonia from chest X-ray images. The Convolutional Neural Network (CNN) models applied in this analysis were VGG-16, LeNet, ResNet-50, GoogLeNet, StridedNet, and AlexNet, trained on a total of 28,000 images. Each image in the dataset was rescaled to 224x224 pixels and processed in batches of either 32 or 64. Training used the Adam optimizer with a learning rate of 1e-4 for 500 epochs.
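To make the optimizer setting concrete, the following is a minimal sketch of a single Adam update for one scalar parameter, using the learning rate of 1e-4 mentioned above. The decay rates `beta1`, `beta2` and the constant `eps` are the commonly used defaults and are assumptions here, as is the toy quadratic loss; the reviewed study does not report them.

```python
import math

def adam_step(param, grad, m, v, t, lr=1e-4, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adam update to a scalar parameter; t is the 1-based step count."""
    m = beta1 * m + (1 - beta1) * grad          # running mean of gradients
    v = beta2 * v + (1 - beta2) * grad * grad   # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)                # bias correction for early steps
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v

# Toy example (hypothetical loss): minimize f(w) = (w - 3)^2 for 100 steps.
w, m, v = 0.0, 0.0, 0.0
for t in range(1, 101):
    grad = 2 * (w - 3.0)
    w, m, v = adam_step(w, grad, m, v, t)
```

Because Adam normalizes the gradient by its running magnitude, each step here moves `w` by roughly the learning rate, which is why a small rate such as 1e-4 is usually paired with many epochs.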

The evaluation showed high accuracy, with 98% for GoogLeNet and LeNet and almost 97% for VGG-16. AlexNet and StridedNet each reached an accuracy of approximately 96%, while ResNet-50 performed poorly, with an overall accuracy of 80%. Nevertheless, all six CNN models were able to classify pneumonia and normal images and, apart from ResNet-50, they performed comparably well. Overall, GoogLeNet and LeNet showed good reproducibility and effectiveness, which gives confidence in their applicability to medical image analysis.

[4] The authors distinguish three classes of chest X-ray images: COVID-19, pneumonia, and non-infected. To improve the correctness of the predictions, they used an ensemble technique during the classification phase. Each image was converted into an intermediate representation, which was then processed and classified as COVID-19, pneumonia, or normal. Because this approach combines the strengths of several models into one final prediction, it is more accurate and dependable.
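One common way to combine several classifiers is soft voting, in which the class probabilities of the individual models are averaged. The sketch below illustrates that idea only; the reviewed paper does not state which ensemble rule it used, and the function name and probability values are hypothetical.

```python
def soft_vote(prob_lists, class_names):
    """Average per-class probabilities over models and return the winning class."""
    n_models = len(prob_lists)
    avg = [sum(p[i] for p in prob_lists) / n_models for i in range(len(class_names))]
    best = max(range(len(avg)), key=avg.__getitem__)
    return class_names[best], avg

classes = ["COVID-19", "pneumonia", "normal"]
# Hypothetical outputs of three models for one image.
model_probs = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.5, 0.4, 0.1],
]
label, avg_probs = soft_vote(model_probs, classes)
```

Averaging lets a confident model outweigh an uncertain one, which a simple majority vote over predicted labels would not do.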

[5] The first stage, preprocessing, consists of separating the X-ray dataset into a training subset and a testing subset. During this stage, the X-ray images are also normalized and formatted to ensure uniformity across the entire dataset. The input X-ray images are augmented to enhance the robustness of the model; augmentation yields new image variations that improve learning.

The second stage is training, in which the model learns to classify the X-rays into two classes: consolidation versus non-consolidation. Out-of-sample testing with k-fold cross-validation is used to validate the model's accuracy.
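The k-fold procedure can be sketched as follows: the samples are split into k folds, and each fold serves once as the held-out validation set while the remaining folds are used for training. The function name and the choice of contiguous, unshuffled folds are assumptions made for illustration; the reviewed paper does not specify how its folds were drawn.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) index lists for k contiguous folds."""
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, val_idx
        start += size

folds = list(k_fold_indices(10, 5))
```

In practice the sample order is usually shuffled (and often stratified by class) before the folds are formed.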

The last stage uses explainable AI (XAI) methods to explain the outputs of the machine learning models. This is achieved by generating a heatmap for each decision made by the model. Heatmap visualization follows two approaches: (1) one that uses the single best model to create the heatmap and (2) one that uses an ensemble of models with identical architecture but trained on different data folds. The second approach allows the uncertainty of each pixel to be assessed through the standard deviation, and hence the robustness of the heatmap to be evaluated, which enhances the reliability of the framework's predictions.
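The per-pixel uncertainty of the second approach can be sketched as the mean and standard deviation of the heatmaps produced by the fold models. The heatmap values below are made up, and the use of the population standard deviation is an assumption; the reviewed paper only states that the standard deviation is used.

```python
import statistics

def pixelwise_mean_std(heatmaps):
    """Return (mean_map, std_map) over a list of equally sized 2D heatmaps."""
    rows, cols = len(heatmaps[0]), len(heatmaps[0][0])
    mean_map = [[0.0] * cols for _ in range(rows)]
    std_map = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            values = [h[r][c] for h in heatmaps]
            mean_map[r][c] = statistics.fmean(values)
            std_map[r][c] = statistics.pstdev(values)
    return mean_map, std_map

# Hypothetical 2x2 heatmaps from three models trained on different folds.
fold_heatmaps = [
    [[0.9, 0.1], [0.2, 0.0]],
    [[0.8, 0.1], [0.6, 0.0]],
    [[1.0, 0.1], [0.1, 0.0]],
]
mean_map, std_map = pixelwise_mean_std(fold_heatmaps)
```

Pixels where the models agree have a standard deviation near zero, while pixels with a large deviation mark regions where the explanation is less reliable.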

[6] The authors propose a technique for pneumonia detection in chest X-ray images that combines Mask R-CNN with RetinaNet in an ensemble framework. A pneumonia region is typically small, with a height of approximately 303 pixels (29.5% of the image height) and a width of approximately 218 pixels (20.9% of the image width), which makes it difficult to detect. The authors use feature pyramid networks (FPN) as the backbone of both models, which provide better multi-scale feature maps than classical image pyramids. FPN combines low-resolution, semantically strong features with high-resolution, semantically weak features via a top-down pathway and lateral connections, as seen in Figure 1. In addition, the authors use residual networks (ResNet) as the base backbone instead of simple stacked convolutional layers, which reduces the degradation problem and allows deeper and more accurate models to be built. This combination improves the models' detection and localization performance for pneumonia regions in chest X-ray images.
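The top-down pathway with lateral connections can be illustrated with a single merge step: a coarse feature map is upsampled by a factor of two and added element-wise to the finer map from the backbone. This is a simplified single-channel sketch; the 1x1 and 3x3 convolutions that a full FPN applies around the merge are omitted, and the function names and values are illustrative.

```python
def upsample_nearest_2x(fmap):
    """Double the height and width of a 2D map by repeating each value."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out

def top_down_merge(coarse, lateral):
    """Add the upsampled coarse map to the finer lateral map element-wise."""
    up = upsample_nearest_2x(coarse)
    return [[u + l for u, l in zip(up_row, lat_row)]
            for up_row, lat_row in zip(up, lateral)]

coarse = [[1.0, 2.0], [3.0, 4.0]]          # low-resolution, deeper level
lateral = [[0.1] * 4 for _ in range(4)]    # higher-resolution, shallower level
merged = top_down_merge(coarse, lateral)
```

The merged map keeps the resolution of the finer level while carrying the semantic signal of the deeper level, which is what helps with small targets.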

III. BACKGROUND

In the recent decade, machine learning (ML) algorithms have attracted great interest from researchers, who exploited the computational power of contemporary systems together with pre-specified algorithmic stages for image processing tasks. However, traditional machine learning techniques required features to be designed manually or images to be segmented manually. LeCun et al. presented the Convolutional Neural Network (CNN) technique to address these difficulties. Features are extracted automatically through successive stacked layers, so the model learns the underlying patterns without hand-crafted descriptions of the input images. While shallow networks concentrate mostly on low-level image features, CNNs become progressively more expressive as the number of layers increases, because deeper layers can uncover higher-level features. CNNs use the backpropagation algorithm to refine and update the model's parameters according to how well the learned features differentiate between images.

The idea behind the CNN architecture is as follows: convolution kernels are applied to the input images or feature maps to construct new layers of feature maps, pooling reduces the spatial size of the feature maps, and non-linear activation functions are applied to produce the output. Several pooling strategies are in use, of which average and max pooling are the most common. In both, the pooling layer partitions the input into regions whose size is governed by the horizontal and vertical strides. The fundamental distinction between max and average pooling lies in the aggregation: max pooling keeps the largest value of each subregion, whereas average pooling averages each subregion. Sigmoid and ReLU (Rectified Linear Unit) are two common activation functions used to introduce non-linearity into the model.

Image features are extracted automatically through the convolution operations, pooling layers, and fully connected layers. With these extracted features, the model can successfully identify pneumonia in the analyzed images. The model's performance is further reinforced by using pixel-level image data, which captures fine details. Over the past decade or so, widely known deep learning models such as AlexNet and VGGNet have emerged and greatly contributed to the development of deep learning. However, because of the large number of layers in these networks, they tend to focus on peculiar regions of the training images, which makes them susceptible to overfitting and reduces their generalization capacity. To address the challenge of training very deep networks, residual connections were introduced, offering an alternative that prevents performance degradation. This study examines residual connections applied to a smaller CNN architecture with fewer layers, whose performance and efficiency are still improved by the same mechanism.
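The convolution, activation, and pooling steps described in this section can be sketched on a tiny single-channel image. The image, the kernel, and the function names are illustrative assumptions; the convolution is written as cross-correlation without padding, which is how deep learning libraries usually implement it.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image without padding and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]

def relu(fmap):
    """Replace negative activations with zero."""
    return [[max(0.0, v) for v in row] for row in fmap]

def pool2d(fmap, size=2, mode="max"):
    """Reduce each non-overlapping size x size window to its max or its average."""
    reduce_fn = max if mode == "max" else (lambda vals: sum(vals) / len(vals))
    out = []
    for r in range(0, len(fmap) - size + 1, size):
        row = []
        for c in range(0, len(fmap[0]) - size + 1, size):
            window = [fmap[r + i][c + j] for i in range(size) for j in range(size)]
            row.append(reduce_fn(window))
        out.append(row)
    return out

# 5x5 image with a dark left part and a bright right part (a vertical edge).
image = [[0.0, 0.0, 1.0, 1.0, 1.0] for _ in range(5)]
edge_kernel = [[-1.0, 1.0], [-1.0, 1.0]]   # responds to left-to-right brightening

features = relu(conv2d_valid(image, edge_kernel))   # 4x4 feature map
max_pooled = pool2d(features, mode="max")           # 2x2
avg_pooled = pool2d(features, mode="average")       # 2x2
```

On this input, max pooling preserves the full edge response in each window, while average pooling dilutes it with the surrounding zeros, which illustrates the difference in aggregation between the two strategies.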

IV. MATERIALS AND METHODS

A. Data

The third part of the database, on which the model's performance was evaluated, comprises 5,863 X-ray images collected from Kaggle. This dataset serves as a rich repository of images for assessing the model's classification and detection accuracy against standard performance measures. In 2017, Dr. Paul Mooney initiated a Kaggle competition directed at the classification of viral and bacterial pneumonia. This dataset is distinguished from others by the fact that its 5,863 X-ray images are of children, which makes it well suited to pediatric studies. The dataset referred to here is the updated version, which remains useful for improving and testing models intended for diagnosing pneumonia in children.[6]

The dataset is organized into three sub-folders: training, testing, and validation (val), each of which contains images in two categories, Pneumonia and Normal. The Normal and Pneumonia sample images shown in Fig. 1.1 and Fig. 1.2 have been set to a fixed size to ensure uniformity during further processing. Many of the X-ray images have poor brightness because of underexposure; it is common for films to come out sub-optimal in brightness when patients are X-rayed. The images consist mainly of black, white, and grey tones. Anatomically, the lungs, which lie on each side of the thoracic cavity, appear dark because of their low density: X-rays pass almost unimpeded through them. The heart, which is positioned between the lungs, appears almost white because X-rays cannot pass through it as easily. Bones, on the other hand, are dense and prevent X-rays from penetrating them, which results in white regions on the image with sharply defined boundaries. This is why bones are so prominent in X-ray images and help to make diagnosis easier.

Fig. 1.1. Images from dataset (Normal images)

Fig. 1.2. Images from dataset (Pneumonia cases)

B. Data Preprocessing

The preprocessing techniques employed in this research are summarized in Table 1.0. Rescaling normalizes the image data before further processing. Because RGB values range from 0 to 255, they are too large for the models to process efficiently, especially with a typical learning rate, so the pixel values are scaled by a factor of 1/255. The shear range enables random shearing transformations, which distort the image slightly to simulate variations in angle. The zoom range enables random zooming into the images, which makes the model more robust to variations in object size. Finally, about half of the images are randomly flipped horizontally (mirrored left to right), which introduces real-world variability, covering the chance that images are mirrored in practice, and further augments the dataset for better model generalization.

Techniques used in this study: Data pre-processing

Technique	Value
Rescale	1./255
Zoom range	0.2
Shear range	0.2
Horizontal flip	True

Table 1.0
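Two of the steps in Table 1.0, rescaling by 1/255 and the random horizontal flip, can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the pipeline used in the study: the function names and the tiny 2x3 image are hypothetical, and the random shear and zoom steps (range 0.2) are omitted because they require image interpolation.

```python
import random

def rescale(image, factor=1.0 / 255):
    """Scale 0-255 pixel values into the 0-1 range."""
    return [[px * factor for px in row] for row in image]

def random_horizontal_flip(image, rng, p=0.5):
    """Mirror the image left to right with probability p."""
    if rng.random() < p:
        return [row[::-1] for row in image]
    return image

def preprocess(image, rng):
    """Rescale first, then apply the random flip."""
    return random_horizontal_flip(rescale(image), rng)

rng = random.Random(0)                  # seeded for repeatability
raw = [[0, 128, 255], [255, 128, 0]]    # hypothetical 2x3 grayscale image
augmented = preprocess(raw, rng)
```

Augmentation of this kind is applied to the training images only; validation and test images are rescaled but not flipped, sheared, or zoomed.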

C. Proposed Network

In this study, we built a VGG-based convolutional neural network model that extracts features from chest X-ray images in order to detect pneumonia in patients. The architecture starts with a relatively small number of filters, 32, and the number of filters increases in the deeper layers.

The first block comprises a Conv2D layer followed by a max pooling layer that reduces the spatial dimensions. A 3x3 kernel is used, which is the standard choice in VGG-style networks. Other activation functions such as Tanh can be used, but ReLU is generally preferred because it trains efficiently. The input shape is given by the image's width and height, with the last dimension being the number of color channels. After the convolutional blocks, the feature maps are flattened and fully connected (ANN) layers are added to complete the structure.

ReLU: f(x) = \max(0, x)

Sigmoid: S(x) = \frac{1}{1 + e^{-x}}

For multi-class classification, the final layer employs a Softmax activation function with a number of units equal to the number of classes. For binary classification, a sigmoid activation function is applied with a single unit. This setup ensures that the model outputs either a probability for each class or a single probability for the binary outcome.
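The two output activations can be written in a few lines. This is a generic sketch of the standard definitions, with numerically stable forms; the example logits are made up.

```python
import math

def sigmoid(x):
    """Map a single logit to a probability; the branch avoids overflow in exp."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def softmax(logits):
    """Map a list of logits to probabilities that sum to one."""
    shift = max(logits)                      # subtracting the max keeps exp stable
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

binary_prob = sigmoid(0.0)                   # single-unit binary output
class_probs = softmax([2.0, 1.0, 0.1])       # three-unit multi-class output
```

For two classes, a softmax over the logits (z, 0) gives the same probability as sigmoid(z), which is why a single sigmoid unit suffices in the binary case.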

Fig. 2. Details of proposed DL model.
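To give a feel for how such a VGG-style stack shrinks the image and grows the channel count, the sketch below traces the output shapes and convolution parameter counts block by block. Only the starting filter count of 32 and the 3x3 kernel come from the text above; the three-block depth, the doubling of filters (32, 64, 128), "same" padding, 2x2 pooling, and the 224x224x3 input are assumptions made for illustration.

```python
def conv_params(kernel, in_channels, out_channels):
    """Weights plus one bias per output filter for a square kernel."""
    return (kernel * kernel * in_channels + 1) * out_channels

def trace_vgg_style(height, width, channels, filters=(32, 64, 128)):
    """Return per-block output shapes, conv parameter counts, and flattened size."""
    shapes, params = [], []
    for f in filters:
        params.append(conv_params(3, channels, f))
        # A 3x3 conv with "same" padding keeps height and width;
        # the 2x2 max pooling that follows halves them.
        height, width, channels = height // 2, width // 2, f
        shapes.append((height, width, channels))
    return shapes, params, height * width * channels

shapes, params, flat_units = trace_vgg_style(224, 224, 3)
```

Under these assumptions the flattened vector that feeds the fully connected layers has 28 x 28 x 128 = 100,352 values, so most of the trainable weights sit in the first dense layer and not in the convolutions.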

References

[1] Sukhendra Singh, Manoj Kumar, Shitharth Selvarajan et al., "Efficient pneumonia detection using Vision Transformers on Chest X-rays", Computer Methods and Programs in Biomedicine, Elsevier, January 2024.
[2] Rabiul Hasan, Shah Muhammad, Sheikh Md. et al., "Recent Advancement of Deep learning techniques for pneumonia prediction from chest X-ray image", ScienceDirect, October 2024.
[3] Hongen Lu et al., "A Deep Learning based model for the Pneumonia from Chest X-Ray Images using VGG-16 and Neural Networks", ScienceDirect, 2023.
[4] Sammy V. Militante et al., "Pneumonia and COVID-19 Detection using Convolutional Neural Networks", 2020 Third International Conference on Vocational Education and Electrical Engineering, IEEE, 2021.
[5] Nanette V. Dionisio et al., "Pneumonia Detection through Adaptive Deep Learning Models of Convolutional Neural Networks", 2020 11th IEEE Control and System Graduate Research Colloquium (ICSGRC 2020), 8 August 2020.
[6] Md. Jahid Hasan et al., "Deep Learning based Detection and Segmentation of COVID-19 & Pneumonia on Chest X-ray Image", 2021 International Conference on Information and Communication Technology for Sustainable Development (ICICT4SD), 27-28 February 2021.
[7] Hongen Lu et al., "Transfer Learning from Pneumonia to COVID-19", IEEE Asia-Pacific Conference on Computer Science and Data Engineering (CSDE), 2020.
[8] Dagur, A., Singh, K., Mehra, P. S., & Shukla, D. K. (Eds.). (2023). Artificial Intelligence, Blockchain, Computing and Security Volume 1: Proceedings of the International Conference on Artificial Intelligence, Blockchain, Computing and Security (ICABCS 2023), Gr. Noida, UP, India, 24-25 February 2023. CRC Press.
[9] Md. Jahid Hasan et al., "Deep Learning-based Detection and Segmentation of COVID-19 & Pneumonia on Chest X-ray Image", 2021 International Conference on Information and Communication Technology for Sustainable Development (ICICT4SD), 27-28 February 2021.
[10] https://www.kaggle.com/paultimothymooney/chest-xray-pneumonia
[11] L. Wang and A. Wong, "COVID-Net: A tailored deep convolutional neural network design for detection of COVID-19 cases from chest radiography images," arXiv:2003.09871, 2020.

Copyright

Copyright © 2025 Ms. Puja Thakur, Kuldeep, Pranav Singh Soam, Pratibha Rawat, Ridam Agarwal. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Paper Id : IJRASET66940

Publish Date : 2025-02-12

ISSN : 2321-9653

Publisher Name : IJRASET
