Malte Londschien, MSc

Alumni

E-Mail
malte.londschien@get-your-addresses-elsewhere.ai.ethz.ch
Twitter
@mlondschien

My research interests are in Causality and Digital Health. I aim to use ideas from causality to improve the robustness of machine learning models.

I am a Doctoral Fellow at the ETH AI Center, jointly supervised by Peter Bühlmann and Gunnar Rätsch. Prior to my PhD, I worked at QuantCo as a Data Scientist and Engineer and, during my studies, for Novartis as a Statistical Scientist. I studied Mathematics at ETH, where I received the ETH Master and Opportunity Award from the ETH Foundation, awarded based on talent and performance, and the Willi Studer Prize, for graduating top of my class. Please see my personal website for further information.

Abstract Intensive care departments generate vast multivariate time series data capturing the dynamic physiological states of critically ill patients. Despite advances in AI-driven clinical decision support, existing models remain limited. They are tailored to specific conditions or single institutions and require extensive adaptation for new settings. To make such generalization feasible, we introduce ICareFM, a novel foundation model for intensive care, trained on a harmonized dataset of unprecedented scale. The dataset contains 650,000 patient stays, accumulating more than 4,000 patient years of data, and over one billion measurements from hospitals in the US, several European countries, and China. ICareFM employs a novel self-supervised time-to-event objective that extracts robust patient representations from noisy, irregular, multivariate time series. As a result, ICareFM can generalize to new tasks and beyond its training distribution, a property we demonstrate through evaluations in a range of out-of-distribution scenarios, including transfer to unseen hospitals and zero-shot inference on previously unobserved tasks. ICareFM consistently outperforms conventional machine learning models and recent foundation model baselines, demonstrating strong generalization, improved data efficiency, and the ability to generate interpretable forecasts. These results establish ICareFM as a scalable and adaptable foundation model for critical care time series, enabling zero-shot clinical prediction and working towards the development of digital patient twins for precision medicine.

Authors Manuel Burger, Daphné Chopard, Malte Londschien, Fedor Sergeev, Hugo Yèche, Rita Kuznetsova, Martin Faltys, Eike Gerdes, Polina Leshetkina, Peter Bühlmann, Gunnar Rätsch

Submitted medRxiv

Link DOI

Abstract Notable progress has been made in generalist medical large language models across various healthcare areas. However, large-scale modeling of in-hospital time series data, such as vital signs, lab results, and treatments in critical care, remains underexplored. Existing datasets are relatively small, but combining them can enhance patient diversity and improve model robustness. To effectively utilize these combined datasets for large-scale modeling, it is essential to address the distribution shifts caused by varying treatment policies, necessitating the harmonization of treatment variables across the different datasets. This work aims to establish a foundation for training large-scale multivariate time series models on critical care data and to provide a benchmark for machine learning models in transfer learning across hospitals to study and address distribution shift challenges. We introduce a harmonized dataset for sequence modeling and transfer learning research, representing the first large-scale collection to include core treatment variables. Future plans involve expanding this dataset to support further advancements in transfer learning and the development of scalable, generalizable models for critical healthcare applications.

Authors Manuel Burger, Fedor Sergeev, Malte Londschien, Daphné Chopard, Hugo Yèche, Eike Gerdes, Polina Leshetkina, Alexander Morgenroth, Zeynep Babür, Jasmina Bogojeska, Martin Faltys, Rita Kuznetsova, Gunnar Rätsch

Submitted Best Paper @ NeurIPS AIM-FM Workshop 2024

Link DOI

Abstract We propose a novel multivariate nonparametric multiple change point detection method using classifiers. We construct a classifier log-likelihood ratio that uses class probability predictions to compare different change point configurations. We propose a computationally feasible search method that is particularly well suited for random forests, denoted by changeforest. However, the method can be paired with any classifier that yields class probability predictions, which we illustrate by also using a k-nearest neighbor classifier. We provide theoretical results motivating our choices. In a large simulation study, our proposed changeforest method achieves improved empirical performance compared to existing multivariate nonparametric change point detection methods. An efficient implementation of our method is made available for R, Python, and Rust users in the changeforest software package.

Authors Malte Londschien, Peter Bühlmann, and Solt Kovács

Submitted arXiv preprints

Link

Abstract We propose estimation methods for change points in high-dimensional covariance structures with an emphasis on challenging scenarios with missing values. We advocate three imputation-like methods and investigate their implications for common losses used for change-point detection. We also discuss how model selection methods have to be adapted to the setting of incomplete data. The methods are compared in a simulation study and applied to a time series from an environmental monitoring system.

Authors Malte Londschien, Solt Kovács and Peter Bühlmann

Submitted Journal of Computational and Graphical Statistics

Link