MERGE: A model for multi-input biomedical federated learning

Summary
Driven by the deep learning (DL) revolution, artificial intelligence (AI) has become a fundamental tool for many biomedical tasks, including analyzing and classifying diagnostic images. Imaging, however, is not the only source of information. Tabular data, such as personal and genomic data and blood test results, are routinely collected but rarely considered in DL pipelines. Nevertheless, DL requires large datasets that often must be pooled from different institutions, raising non-trivial privacy concerns. Federated learning (FL) is a cooperative learning paradigm that aims to address these issues by moving models instead of data across different institutions. Here, we present a federated multi-input architecture using images and tabular data as a methodology to enhance model performance while preserving data privacy. We evaluated it on two showcases: the prognosis of COVID-19 and patients’ stratification in Alzheimer’s disease, providing evidence of enhanced accuracy and F1 scores against single-input models and improved generalizability against non-federated models.

Highlights
• This study demonstrates the advantages of a multi-input federated architecture
• The proposed architecture leverages images and tabular data to improve classification
• Improvements in classification include accuracy and F1 score
• The approach has been tested on 2D (CoViD-CXR) and 3D (ADNI) datasets

The bigger picture
Deep learning models must be trained with large datasets, which often requires pooling data from different sites and sources. In research fields dealing with sensitive information subject to data regulations, such as biomedical research, data pooling can generate concerns about data access and sharing across institutions, which can affect performance, energy consumption, privacy, and security. Federated learning is a cooperative learning paradigm that addresses such concerns by sharing models instead of data across different institutions.
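
To make the idea of "moving models instead of data" concrete, the following is a minimal sketch of one aggregation round. It assumes federated averaging (FedAvg), a common aggregation rule in which a coordinator averages the weights returned by each institution in proportion to its local sample count. The text above does not state which aggregation rule the authors use, so the function `federated_average`, the parameter name `"layer"`, and the two hypothetical sites are illustrative assumptions only.

```python
from typing import Dict, List, Tuple

# A model is represented here as a mapping from parameter name to a flat
# list of floats. This representation is an assumption made for illustration.
Weights = Dict[str, List[float]]


def federated_average(updates: List[Tuple[Weights, int]]) -> Weights:
    """Average client weights, weighting each client by its sample count.

    Only the weights and the sample counts reach the coordinator;
    the patient records themselves never leave the institution.
    """
    total = sum(n for _, n in updates)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    first_weights, _ = updates[0]
    averaged: Weights = {}
    for name, values in first_weights.items():
        acc = [0.0] * len(values)
        for weights, n in updates:
            for i, v in enumerate(weights[name]):
                acc[i] += v * n / total
        averaged[name] = acc
    return averaged


# Hypothetical round with two institutions of different sizes.
site_a = ({"layer": [1.0, 2.0]}, 100)  # weights trained on 100 local samples
site_b = ({"layer": [3.0, 4.0]}, 300)  # weights trained on 300 local samples
global_weights = federated_average([site_a, site_b])
```

In this sketch the larger site contributes three times as much to the global model as the smaller one. In a full system the averaged weights would be sent back to every site for the next round of local training.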
This study proposes a multi-input federated learning (FL) approach to improve the performance of deep learning (DL) classification tasks while maintaining the inherent advantages of FL architectures. The authors first demonstrate how a single DL model can simultaneously handle heterogeneous input data, such as images (2D and 3D) and tabular records. They then show how the federated multi-input model improves on the performance and generalizability of non-federated models while preserving the security and data protection properties inherent to FL.
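
One common way for a single model to consume an image and a tabular record together is feature concatenation: each input passes through its own branch, the resulting feature vectors are joined, and a shared classifier operates on the joined vector. The sketch below illustrates that pattern only. The architecture actually used in the study is not described in this summary, so every function, feature, and number here is a hypothetical stand-in (in particular, `image_branch` replaces a convolutional encoder with two summary statistics).

```python
import math
from typing import List


def image_branch(image: List[List[float]]) -> List[float]:
    """Stand-in for a CNN encoder: returns mean and max pixel intensity."""
    pixels = [p for row in image for p in row]
    return [sum(pixels) / len(pixels), max(pixels)]


def tabular_branch(record: List[float], means: List[float],
                   stds: List[float]) -> List[float]:
    """Stand-in for a tabular encoder: z-score each feature."""
    return [(x - m) / s for x, m, s in zip(record, means, stds)]


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(image: List[List[float]], record: List[float],
             means: List[float], stds: List[float],
             weights: List[List[float]], biases: List[float]) -> List[float]:
    """Concatenate both feature vectors and apply one linear layer."""
    fused = image_branch(image) + tabular_branch(record, means, stds)
    logits = [sum(w * f for w, f in zip(row, fused)) + b
              for row, b in zip(weights, biases)]
    return softmax(logits)


# Hypothetical inputs: a 2x2 "image" and a two-feature tabular record.
image = [[0.0, 0.5], [0.5, 1.0]]
record = [70.0, 5.0]                  # e.g. age and one blood marker
means, stds = [60.0, 4.0], [10.0, 2.0]
weights = [[0.2, -0.1, 0.4, 0.3],     # one row of weights per class
           [-0.2, 0.1, -0.4, -0.3]]
biases = [0.0, 0.0]
probs = classify(image, record, means, stds, weights, biases)
```

The same pattern extends to 3D volumes, since only the image branch needs to change. In a federated setting, the parameters of both branches and of the shared classifier would be the weights exchanged between institutions.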

All Keywords
Author keywords: federated learning, biomedical imaging, federated classification, mixed-data deep learning, multi-input classification