IJRASET Journal for Research in Applied Science and Engineering Technology
Authors: Arjun Nair, Abhishek Kumar Sharma, Kuhan Kumar, Ruchitha S., Purva, Prof. Banupriya G.
DOI Link: https://doi.org/10.22214/ijraset.2025.66865
The healthcare sector is experiencing a surge in diverse health data, encompassing medical imaging, electronic health records, and live sensor readings from wearable technology. Integrating these multi-modal datasets holds immense potential for improving medical care by facilitating better diagnostic accuracy, customized therapeutic approaches, and a more comprehensive understanding of how diseases evolve. However, centralizing this sensitive patient data across various institutions raises significant privacy concerns as well as complex issues around data stewardship and administrative oversight. Federated learning (FL) has emerged as a potential approach to harness the wealth of available data while safeguarding patient privacy. FL lets healthcare institutions train a model collaboratively without exchanging their raw data. This research presents an FL framework specifically designed to integrate multi-modal health data. Our method tackles the issues of data variability and model integration in federated environments, with the goal of improving diagnostic precision and personalizing treatment suggestions, all while maintaining the confidentiality of patient data.
I. INTRODUCTION
A. Background
The field of healthcare is going through a fundamental transformation fueled by the exponential growth of digital health data. Electronic health records (EHRs) capture detailed patient information, medical imaging modalities like CT scans and X-rays provide visual understanding of disease states, and wearable sensor data offers real-time physiological measurements. Integrating these diverse data types, known as multi-modal data, offers a holistic perspective of patient health which enables:
However, realizing the full capabilities of multi-modal health data comes with significant challenges:
B. Federated Learning Overview
Federated Learning provides a groundbreaking method for training machine learning models collaboratively across several institutions without the need for them to share their raw data.
Here's how FL works:
C. Challenges and Research Focus
Integrating multi-modal data in a federated learning environment presents unique challenges:
II. LITERATURE REVIEW
A. Previous Research
Federated learning has gained considerable attention in healthcare due to its capacity to facilitate cooperative research while preserving data privacy. This approach allows multiple partners, such as hospitals and research institutions, to train machine learning models on larger, previously inaccessible datasets without centralizing or sharing sensitive patient information. This collaborative learning paradigm has the potential to enhance the predictive power of AI algorithms and accelerate advancements in healthcare.
Multi-modal data integration has become a key area of focus in healthcare, driven by the recognition that combining diverse data sources can provide a more holistic and informative perspective of patients’ health. Combining data from various modalities, such as genomics, imaging, and clinical records, can result in more precise diagnoses, personalized treatment plans, and a greater understanding of disease mechanisms.
Privacy-protecting techniques are essential for ensuring the responsible use of sensitive health data in federated learning. Differential privacy, secure multi-party computation (MPC), and homomorphic encryption are key techniques that can be integrated into federated learning frameworks to enhance privacy protection. Differential privacy adds noise to the data or model updates to limit the effect of any individual data point, safeguarding against the identification of specific individuals from the aggregated results.
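The noise-adding step described above can be sketched as follows. This is a minimal illustration, assuming the common clip-then-add-Gaussian-noise recipe applied to a model update; the function name `privatize_update` and the parameters `clip_norm` and `noise_multiplier` are hypothetical and are not taken from the framework in this paper. The sketch does not track a privacy budget.

```python
import numpy as np

def privatize_update(update, clip_norm, noise_multiplier, rng):
    """Clip an update to L2 norm <= clip_norm, then add Gaussian noise.

    Illustrative only: names and parameters are assumptions, and no
    (epsilon, delta) accounting is performed here.
    """
    norm = np.linalg.norm(update)
    # Shrink the update only if its norm exceeds the clipping threshold.
    scale = min(1.0, clip_norm / (norm + 1e-12))
    clipped = update * scale
    # Noise standard deviation is proportional to the clipping threshold.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=update.shape)
    return clipped + noise

rng = np.random.default_rng(0)
raw_update = np.array([3.0, 4.0])  # L2 norm is 5
noisy = privatize_update(raw_update, clip_norm=1.0, noise_multiplier=0.5, rng=rng)
clipped_only = privatize_update(raw_update, clip_norm=1.0, noise_multiplier=0.0, rng=rng)
```

Clipping bounds how much any single contribution can move the aggregate, which is what makes the added noise meaningful; a larger `noise_multiplier` gives stronger protection at the cost of accuracy.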
Secure MPC enables collaborative computation without revealing individual inputs, further enhancing privacy in federated learning scenarios. Homomorphic encryption allows encrypted data to undergo computation without requiring decryption, safeguarding data confidentiality during model training and aggregation.
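To make the idea of computing on hidden inputs concrete, here is a toy sketch of pairwise additive masking, one simple building block used in secure aggregation. It is an assumption-laden illustration, not a full MPC protocol and not homomorphic encryption: a single seeded generator stands in for pairwise shared secrets, updates are assumed to be already quantized to integers, and client dropout is not handled. The names `mask_updates` and `aggregate` are hypothetical.

```python
import random

MODULUS = 2**32  # all arithmetic is done modulo this value

def mask_updates(updates, seed=0):
    """Add pairwise masks that cancel when all masked updates are summed."""
    rng = random.Random(seed)  # stand-in for secrets shared by each client pair
    n, dim = len(updates), len(updates[0])
    masked = [list(u) for u in updates]
    for i in range(n):
        for j in range(i + 1, n):
            for k in range(dim):
                m = rng.randrange(MODULUS)
                # Client i adds the mask, client j subtracts the same mask.
                masked[i][k] = (masked[i][k] + m) % MODULUS
                masked[j][k] = (masked[j][k] - m) % MODULUS
    return masked

def aggregate(masked):
    """Sum masked updates coordinate-wise; the pairwise masks cancel out."""
    return [sum(col) % MODULUS for col in zip(*masked)]

updates = [[5, 10, 15], [1, 2, 3], [7, 7, 7]]  # integer-quantized toy updates
masked = mask_updates(updates)
total = aggregate(masked)
```

The server only ever sees the rows of `masked`, each of which looks random on its own, yet `total` equals the coordinate-wise sum of the original updates.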
B. Theoretical Framework
This section outlines the theoretical principles behind federated learning and multi-modal data integration, focusing on key algorithms and fusion strategies.
1) Federated Learning Algorithms
Algorithm | Key Features | Advantages | Limitations
FedAvg | Local SGD with model averaging | Simple, communication-efficient | Convergence challenges with non-IID data
FedProx | Proximal term for local updates | Improved stability and convergence with heterogeneous data | Requires tuning of the proximal term parameter
FedOpt | Adaptive optimization parameters | Addresses tuning difficulties and convergence behavior | Increased complexity compared to FedAvg
Table 1
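The first two rows of Table 1 can be illustrated with a short sketch. It assumes model parameters are flat NumPy vectors and that FedAvg weights each client by its local sample count; `fedavg`, `fedprox_gradient`, and the proximal coefficient `mu` are illustrative names, not identifiers from this paper's code.

```python
import numpy as np

def fedavg(client_weights, client_sizes):
    """Average client parameter vectors, weighted by local sample counts."""
    sizes = np.asarray(client_sizes, dtype=float)
    stacked = np.stack(client_weights)  # shape: (num_clients, num_params)
    return (sizes[:, None] * stacked).sum(axis=0) / sizes.sum()

def fedprox_gradient(local_grad, local_w, global_w, mu):
    """Gradient of the FedProx local objective.

    The local loss is augmented with (mu / 2) * ||w - w_global||^2, so its
    gradient gains the extra term mu * (w - w_global).
    """
    return local_grad + mu * (local_w - global_w)

w_a = np.array([1.0, 2.0])  # parameters from a client with 100 samples
w_b = np.array([3.0, 6.0])  # parameters from a client with 300 samples
global_w = fedavg([w_a, w_b], client_sizes=[100, 300])
prox_grad = fedprox_gradient(np.zeros(2), w_a, global_w, mu=0.1)
```

The proximal term pulls each local model back toward the global one, which is why FedProx tends to be more stable on heterogeneous data and why `mu` needs tuning: too small and it behaves like FedAvg, too large and local training barely moves.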
2) Multi-Modal Data Fusion Strategies
Multi-modal data integration combines data from various modalities, such as images, text, and sensor data. Several fusion strategies can be employed:
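As an illustration, two widely used strategies are early fusion (concatenating features before modeling) and late fusion (combining per-modality predictions). The sketch below assumes each modality has already been reduced to a numeric feature matrix or a vector of predicted probabilities; the arrays, shapes, and function names are hypothetical and do not describe the dataset used in this study.

```python
import numpy as np

def early_fusion(features_by_modality):
    """Concatenate per-modality feature matrices along the feature axis."""
    return np.concatenate(features_by_modality, axis=1)

def late_fusion(probs_by_modality, weights=None):
    """Combine per-modality predicted probabilities by a weighted average."""
    probs = np.stack(probs_by_modality)  # shape: (num_modalities, num_patients)
    return np.average(probs, axis=0, weights=weights)

# Toy features for 2 patients (values are made up for illustration).
ehr = np.array([[0.2, 0.7], [0.9, 0.1]])
imaging = np.array([[0.5, 0.4, 0.3], [0.1, 0.8, 0.6]])
sensors = np.array([[0.6], [0.3]])
fused = early_fusion([ehr, imaging, sensors])  # one row per patient, 6 columns

# Toy per-modality probabilities of the positive class.
p_ehr = np.array([0.8, 0.2])
p_img = np.array([0.6, 0.4])
p_late = late_fusion([p_ehr, p_img], weights=[0.5, 0.5])
```

Early fusion lets a single model learn cross-modal interactions but requires every modality to be present for every patient; late fusion tolerates missing modalities more easily because each modality has its own model.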
III. METHODOLOGY
A. Data Collection
This study leverages a multi-modal dataset comprising three distinct modalities:
B. Data Preprocessing and Feature Engineering
To prepare the multi-modal data for federated learning, a comprehensive preprocessing and feature engineering pipeline is employed.
1) Data Cleaning
2) Feature Engineering
C. FL Framework
A federated learning framework is employed to develop machine learning models collaboratively across multiple clients while ensuring data privacy:
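A minimal simulation of such a framework might look like the sketch below. Everything in it is an assumption made for illustration: a logistic regression model, two clients with synthetic Gaussian features, full-batch gradient descent locally, and sample-weighted averaging on the server. It is not the model, data, or code used in this study.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def local_train(w, X, y, lr=0.5, epochs=5):
    """Run a few full-batch gradient steps of logistic regression locally."""
    w = w.copy()
    for _ in range(epochs):
        grad = X.T @ (sigmoid(X @ w) - y) / len(y)
        w -= lr * grad
    return w

def accuracy(w, X, y):
    return float(np.mean((sigmoid(X @ w) >= 0.5) == y))

# Synthetic, linearly separable data held by two simulated clients.
rng = np.random.default_rng(42)
true_w = np.array([2.0, -1.0, 0.5])
clients = []
for n in (200, 300):
    X = rng.normal(size=(n, 3))
    y = (X @ true_w > 0).astype(float)
    clients.append((X, y))

global_w = np.zeros(3)
history = []
for round_idx in range(5):
    # Each client starts from the current global model and trains locally.
    local_ws = [local_train(global_w, X, y) for X, y in clients]
    # The server averages the local models, weighted by client sample counts.
    sizes = np.array([len(y) for _, y in clients], dtype=float)
    global_w = np.average(np.stack(local_ws), axis=0, weights=sizes)
    history.append(np.mean([accuracy(global_w, X, y) for X, y in clients]))
```

Only the parameter vectors in `local_ws` leave the clients; the arrays `X` and `y` are never sent to the server, which is the property the framework relies on for privacy.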
D. Evaluation Metrics
A thorough set of evaluation metrics is used to assess the effectiveness of the federated learning framework:
IV. RESULTS
A. Experimental Setup
To assess the effectiveness of our proposed federated multi-modal learning framework, we conducted experiments using the synthetic patient data and the code provided. The setup involved the following:
B. Federated Learning Performance
The federated learning process showed promising results:
Round | Avg. Loss | Avg. Accuracy
1 | 14.9753 | 86.21%
2 | 12.9345 | 90.05%
3 | 12.2036 | 90.98%
4 | 11.8824 | 91.00%
5 | 11.7664 | 91.00%
Table 2
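The round-to-round changes in Table 2 can be computed directly from the reported values. The short script below uses only the numbers in the table; the variable names are illustrative.

```python
# Values copied from Table 2.
rounds = [1, 2, 3, 4, 5]
avg_loss = [14.9753, 12.9345, 12.2036, 11.8824, 11.7664]
avg_acc = [86.21, 90.05, 90.98, 91.00, 91.00]  # percent

# Per-round changes (round r+1 compared with round r).
acc_gain = [round(b - a, 2) for a, b in zip(avg_acc, avg_acc[1:])]
loss_drop = [round(a - b, 4) for a, b in zip(avg_loss, avg_loss[1:])]

# Overall change from round 1 to round 5.
total_acc_gain = round(avg_acc[-1] - avg_acc[0], 2)
relative_loss_drop = (avg_loss[0] - avg_loss[-1]) / avg_loss[0]

# Fraction of the total accuracy gain achieved between rounds 1 and 2.
first_round_share = acc_gain[0] / total_acc_gain
```

Accuracy rises by 4.79 percentage points over the five rounds, with about 80% of that gain (3.84 points) occurring between rounds 1 and 2, while the average loss falls by roughly 21%. Both curves flatten after round 3, which suggests the model is close to convergence on this dataset.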
C. Local Model Performance
The local models also demonstrated good performance, with their accuracies generally increasing over the rounds. The local accuracies for each round are shown in the Federated Learning Summary Report below.
D. Confusion Matrix
Below is the confusion matrix for the global model on the training data:
Figure 1
The confusion matrix offers a comprehensive overview of the model's predictions. This data helps evaluate the model's capability across various classes and highlights areas where improvements may be needed.
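For readers who want to reproduce this kind of analysis, the sketch below shows how a binary confusion matrix and the usual derived metrics can be computed. The labels and predictions are made-up toy values, not the values behind Figure 1, and the function names are hypothetical.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels encoded as 0 and 1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn

def derived_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, and F1 from confusion counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Toy example: 4 positive and 6 negative cases (hypothetical values).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
counts = confusion_counts(y_true, y_pred)
scores = derived_metrics(*counts)
```

Precision and recall are worth reporting alongside accuracy in clinical settings, because a model can reach high accuracy on imbalanced data while still missing many positive cases.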
E. Federated Learning Summary Report
The following report summarizes the key findings of the federated learning experiment:
Federated Learning Summary Report:
Global Accuracies: [0.8620920278223648, 0.900526128054218, 0.9097824148385948, 0.91004102015338, 0.9100142678794366]
Figure 2
Global Losses: [14.975343512743711, 12.934522633006177, 12.203626012222633, 11.882384086317487, 11.766422974566618]
Figure 3
Local Accuracies:, [0.857597645799893, 0.8997592295345105, 0.9098715890850724, 0.9100856072766188, 0.9100588550026755], [0.8678170144462278, 0.898501872659176, 0.909630818619583, 0.9100053504547888, 0.9099785981808456]]
Final Global Accuracy: 0.9100142678794366
Final Global Loss: 11.766422974566618
V. DISCUSSION
The outcomes of our federated learning experiment showcase the viability and efficiency of training a multi-modal model on distributed healthcare data while maintaining privacy. The observed convergence and performance improvement indicate that the federated averaging process successfully combines the knowledge learned by the clients from their local datasets.
The final global accuracy of 91.00% is comparable to what might be achieved with centralized training, where all data is aggregated in one location. However, the federated approach has the crucial advantage of preserving data privacy, as no raw patient data is shared between institutions.
The use of a synthetic dataset with controlled accuracy increase allows for a clear demonstration of the federated learning process and its ability to improve model performance over rounds. In a real-world scenario with more complex and heterogeneous data, the performance gains might be even more significant, as suggested by research on multimodal federated systems.
The confusion matrix offers key insights into the model's performance. It shows that the model performs well in identifying both positive and negative cases, featuring a relatively small number of false positives and false negatives.
The local models also exhibit good performance, indicating that each client benefits from participating in the federated learning process. The local accuracies generally increase over the rounds, suggesting that the global model effectively captures and disseminates knowledge across the clients.
VI. FUTURE WORK
This research can be extended in several directions:
VII. CONCLUSION
This study introduces a federated learning framework for integrating multi-modal health data, tackling challenges like data heterogeneity and privacy protection. By facilitating collaborative model training without the need to share raw data, our approach aims to improve diagnostic accuracy, tailor treatment recommendations, and advance healthcare while maintaining patient data confidentiality. Future research will concentrate on refining the framework, testing it with real-world datasets, and enhancing its privacy-preserving features. The framework shows potential for improving patient care, speeding up medical research, and fully leveraging multi-modal health data, all while keeping high standards of data privacy and security.
Copyright © 2025 Arjun Nair, Abhishek Kumar Sharma, Kuhan Kumar, Ruchitha S., Purva, Prof. Banupriya G. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Paper Id : IJRASET66865
Publish Date : 2025-02-07
ISSN : 2321-9653
Publisher Name : IJRASET