Respiratory diseases are on the rise around the world, and the impact of COVID-19 has highlighted the need for earlier and better diagnosis. Growing air pollution from vehicles, wildfires, coal-burning power plants and other natural and man-made sources, particularly in the developing world, is leading to more deaths from respiratory problems and leaving many people in need of diagnosis and treatment. The use of machine learning for preliminary analysis and diagnosis of diseases is evolving rapidly, particularly in medical imaging, where models help sort through hundreds of X-ray, CT or MRI images, highlight the area affected by a potential medical condition and suggest a possible diagnosis. This has enabled timely detection of potentially life-threatening diseases, shorter diagnosis times, improved efficiency and better coverage. These aids are also being integrated into medical scanning equipment and related software to further improve the diagnostic process.
This paper extends the use of deep learning to audio recordings as an aid in the early diagnosis of respiratory diseases. It describes an approach for analyzing lung sounds captured with an electronic stethoscope from different parts of a patient’s chest wall, and covers the processes used to extract features from the sound signals, dataset preparation, the neural network architectures evaluated and the prediction results.
Introduction
Chronic respiratory diseases are among the leading causes of morbidity and mortality worldwide. In many cases, such deaths are preventable with early diagnosis. Diseases like chronic obstructive pulmonary disease (COPD), asthma, bronchitis and pleural effusion can lead to a reduced quality of life, disability or even death. Lung diseases in children under the age of 5, including infectious processes and chronic conditions such as asthma, are among the most common causes of mortality, affecting about 14% of children and rising. Among the most common causes of the rise in chronic respiratory conditions are degrading air quality due to pollution, viral infections, exposure to toxic work or living environments and primary or secondary tobacco smoke inhalation. Data from the WHO and the Global Burden of Chronic Respiratory Diseases[1] (GBD) study show that nine out of 10 people are exposed to high levels of air pollutants, with up to 3.2 million deaths due to COPD and 495,000 deaths due to asthma in a year.
Conclusion
With this paper, I have attempted to demonstrate that analog signals, in particular sound samples, can be analyzed using modern deep learning architectures. Audio feature extraction techniques such as the Short-Time Fourier Transform, Mel-Spectrogram and Mel-Frequency Cepstral Coefficients (MFCC) can be applied to represent and transform key features from audio samples. The features can then be analyzed and classified using deep convolutional neural networks to provide accurate predictions that can help medical professionals in their diagnostic process. The ability of such models to sift through thousands of samples in a short time can improve the availability and coverage of the medical help needed by patients with respiratory diseases.
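As a rough illustration of the feature extraction steps named above, the following Python sketch computes a power spectrogram, applies a triangular mel filterbank and takes a discrete cosine transform of the log mel energies to obtain MFCC-style features. It is not the pipeline used in this work: the function names, the frame length, hop size, number of mel bands and coefficients, and the synthetic test tone are all assumptions chosen for illustration, and it uses only the Python standard library with a naive DFT rather than an optimized audio library.

```python
import cmath
import math


def stft_power(signal, frame_len=256, hop=128):
    """Power spectrum of each Hann-windowed frame (naive DFT, demo only)."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / frame_len)
              for i in range(frame_len)]
    # Precomputed complex exponentials, indexed by (k * n) mod frame_len.
    twiddle = [cmath.exp(-2j * math.pi * m / frame_len)
               for m in range(frame_len)]
    n_bins = frame_len // 2 + 1
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + i] * window[i] for i in range(frame_len)]
        spectrum = []
        for k in range(n_bins):
            acc = sum(frame[n] * twiddle[(k * n) % frame_len]
                      for n in range(frame_len))
            spectrum.append(abs(acc) ** 2)
        frames.append(spectrum)
    return frames


def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10 ** (m / 2595.0) - 1.0)


def mel_filterbank(n_mels, frame_len, sample_rate):
    """Triangular filters whose centres are equally spaced on the mel scale."""
    n_bins = frame_len // 2 + 1
    mel_max = hz_to_mel(sample_rate / 2)
    edges = [mel_to_hz(mel_max * i / (n_mels + 1)) for i in range(n_mels + 2)]
    bin_hz = [k * sample_rate / frame_len for k in range(n_bins)]
    bank = []
    for m in range(1, n_mels + 1):
        lo, mid, hi = edges[m - 1], edges[m], edges[m + 1]
        bank.append([max(0.0, min((f - lo) / (mid - lo), (hi - f) / (hi - mid)))
                     for f in bin_hz])
    return bank


def dct2(values, n_out):
    """Unnormalized DCT-II, keeping the first n_out coefficients."""
    n = len(values)
    return [sum(v * math.cos(math.pi * k * (i + 0.5) / n)
                for i, v in enumerate(values))
            for k in range(n_out)]


def mfcc(signal, sample_rate, frame_len=256, hop=128, n_mels=20, n_mfcc=13):
    """MFCC-style features: log mel energies followed by a DCT, per frame."""
    bank = mel_filterbank(n_mels, frame_len, sample_rate)
    features = []
    for power in stft_power(signal, frame_len, hop):
        mel_energy = [sum(w * p for w, p in zip(row, power)) for row in bank]
        # Small constant avoids log(0) for empty or silent bands.
        log_mel = [math.log(e + 1e-10) for e in mel_energy]
        features.append(dct2(log_mel, n_mfcc))
    return features


if __name__ == "__main__":
    # Hypothetical input: a 200 Hz tone sampled at 4 kHz, not real lung audio.
    sample_rate = 4000
    tone = [math.sin(2 * math.pi * 200 * t / sample_rate) for t in range(2000)]
    feats = mfcc(tone, sample_rate)
    print(len(feats), "frames x", len(feats[0]), "coefficients")
```

Each frame yields one row of coefficients, so a recording becomes a two-dimensional array (frames by coefficients) that can be treated like an image and passed to a convolutional network. In practice an optimized audio library would normally replace the naive DFT used here.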
References
[1] Stephanie M. Levine, Darcy D. Marciniuk: Global Impact of Respiratory Disease (chestnet.org)
[2] Mohammad Fraiwan, Luay Fraiwan, Basheer Khassawneh, Ali Ibnian: A dataset of lung sounds recorded from the chest wall using an electronic stethoscope (Mendeley Data)
[3] Prevalence and attributable health burden of chronic respiratory diseases, 1990–2017: a systematic analysis for the Global Burden of Disease Study 2017 (The Lancet Respiratory Medicine)
[4] Haytham Fayek: Speech Processing for Machine Learning: Filter banks, Mel-Frequency Cepstral Coefficients (MFCCs) and What’s In-Between (Haytham Fayek)
[5] Gao Huang, Zhuang Liu, Laurens van der Maaten, Kilian Q. Weinberger: Densely Connected Convolutional Networks (arXiv:1608.06993)
[6] Harrison Jansma: Don’t Use Dropout in Convolutional Networks (KDnuggets)
[7] Tao Zhou, XinYu Ye, HuiLing Lu, Xiaomin Zheng, Shi Qiu, YunCan Liu: Dense Convolutional Network and Its Application in Medical Image Analysis (hindawi.com)
[8] Aaliyah Ahmed: Architecture of DenseNet-121 (opengenus.org)
[9] Yugesh Verma: Addressing The Vanishing Gradient Problem (analyticsindiamag.com)
[10] Noha Radwan: Leveraging Sparse and Dense Features for Reliable State Estimation in Urban Environments (researchgate.net)
About the author
Kuljeet Singh, in a career spanning 20+ years, has worked extensively in the fields of Artificial Intelligence, Internet of Things, Multimedia and Embedded Systems. Presently serving as a Solutions Architect - IoT & AI with Wipro’s Engineering Edge (WEE) division, he has designed and deployed several large AI solutions for worker safety, logistics, utility and manufacturing. He is also working with the technology teams of leading customers and industry experts on the next generation of IoT and AI solutions. For more information, contact Kuljeet at kuljeet.singh@wipro.com