Bringing the Algorithms to the Data - Secure Distributed Medical Analytics using the Personal Health Train (PHT-meDIC)
Marius Herr, Michael Graf, Peter Placzek, Florian König, Felix Bötte, Tyra Stickel, D. Hieber, Lukas Zimmermann, Michael Slupina, C. Mohr, Stephanie Biergans, Mete Akgün, Nícolas Pfeifer, O. Kohlbacher
Published 2022 in arXiv.org
ABSTRACT
The need for data privacy and security, enforced through increasingly strict data protection regulations, makes it difficult to use healthcare data for machine learning. In particular, transferring data between hospitals is often not permissible, so cross-site pooling of data is not an option. The Personal Health Train (PHT) paradigm proposed within the GO-FAIR initiative implements an "algorithm to the data" approach, which ensures that distributed data can be accessed for analysis without transferring any sensitive data. We present PHT-meDIC, an open-source implementation of the PHT concept that is deployed in production. Containerization allows us to easily deploy even complex data analysis pipelines (e.g., genomics, image analysis) across multiple sites in a secure and scalable manner. We discuss the underlying technological concepts, security models, and governance processes. The implementation has been successfully applied to distributed analyses of large-scale data, including applications of deep neural networks to medical image data.
PUBLICATION RECORD
- Publication year: 2022
- Venue: arXiv.org
- Publication date: 2022-12-07
- Fields of study: Medicine, Computer Science
- Source metadata: Semantic Scholar