Malte Londschien, MSc

PhD Student

E-Mail
malte.londschien@get-your-addresses-elsewhere.ai.ethz.ch
Twitter
@mlondschien

My research interests are in Causality and Digital Health. I aim to use ideas from causality to improve the robustness of machine learning models.

I am a Doctoral Fellow at the ETH AI Center, jointly supervised by Peter Bühlmann and Gunnar Rätsch. Prior to my PhD, I worked at QuantCo as a Data Scientist and Engineer and, during my studies, at Novartis as a Statistical Scientist. I studied Mathematics at ETH, where I received the ETH Master and Opportunity Award from the ETH Foundation, awarded based on talent and performance, and the Willi Studer Prize for graduating top of my class. Please see my personal website for further information.

Abstract Notable progress has been made in generalist medical large language models across various healthcare areas. However, large-scale modeling of in-hospital time series data, such as vital signs, lab results, and treatments in critical care, remains underexplored. Existing datasets are relatively small, but combining them can enhance patient diversity and improve model robustness. To effectively utilize these combined datasets for large-scale modeling, it is essential to address the distribution shifts caused by varying treatment policies, which requires harmonizing treatment variables across the different datasets. This work aims to establish a foundation for training large-scale multivariate time series models on critical care data and to provide a benchmark for machine learning models in transfer learning across hospitals to study and address distribution shift challenges. We introduce a harmonized dataset for sequence modeling and transfer learning research, representing the first large-scale collection to include core treatment variables. Future plans involve expanding this dataset to support further advancements in transfer learning and the development of scalable, generalizable models for critical healthcare applications.

Authors Manuel Burger, Fedor Sergeev, Malte Londschien, Daphné Chopard, Hugo Yèche, Eike Gerdes, Polina Leshetkina, Alexander Morgenroth, Zeynep Babür, Jasmina Bogojeska, Martin Faltys, Rita Kuznetsova, Gunnar Rätsch

Submitted AIM-FM Workshop at NeurIPS 2024

Link DOI

Abstract We propose a novel multivariate nonparametric multiple change point detection method using classifiers. We construct a classifier log-likelihood ratio that uses class probability predictions to compare different change point configurations. We propose a computationally feasible search method that is particularly well suited for random forests, denoted by changeforest. However, the method can be paired with any classifier that yields class probability predictions, which we illustrate by also using a k-nearest neighbor classifier. We provide theoretical results motivating our choices. In a large simulation study, our proposed changeforest method achieves improved empirical performance compared to existing multivariate nonparametric change point detection methods. An efficient implementation of our method is made available for R, Python, and Rust users in the changeforest software package.

Authors Malte Londschien, Peter Bühlmann, and Solt Kovács

Submitted arXiv preprints

Link
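
As a rough illustration of the classifier log-likelihood ratio described in the changeforest abstract above, the sketch below scores every candidate split of a toy sequence with a leave-one-out k-nearest neighbor classifier and picks the split with the largest score. It is a simplified, brute-force single change point search under stated assumptions, not the changeforest package's implementation: the function names (knn_neighbors, classifier_gain, best_split), the toy data, and the probability clipping constant eps are all hypothetical choices made for this example.

```python
import math


def knn_neighbors(X, k):
    """For each point, return the indices of its k nearest other points."""
    neighbors = []
    for i, xi in enumerate(X):
        dists = sorted(
            (math.dist(xi, xj), j) for j, xj in enumerate(X) if j != i
        )
        neighbors.append([j for _, j in dists[:k]])
    return neighbors


def classifier_gain(neighbors, n, split, eps=1e-3):
    """Classifier log-likelihood ratio for one candidate split.

    Observations with index < split are labeled class 0, the rest class 1.
    The score sums log(predicted probability of own class / class prior).
    """
    prior = (split / n, (n - split) / n)
    gain = 0.0
    for i in range(n):
        label = 0 if i < split else 1
        # Leave-one-out kNN estimate of P(own class | x_i).
        same = sum(1 for j in neighbors[i] if (j < split) == (i < split))
        # Clip to avoid log(0); eps is an assumption of this sketch.
        prob = min(max(same / len(neighbors[i]), eps), 1 - eps)
        gain += math.log(prob / prior[label])
    return gain


def best_split(X, k=5, min_segment=5):
    """Brute-force search over all splits that leave min_segment points per side."""
    n = len(X)
    neighbors = knn_neighbors(X, k)  # neighbors do not depend on the split
    gains = {
        s: classifier_gain(neighbors, n, s)
        for s in range(min_segment, n - min_segment + 1)
    }
    return max(gains, key=gains.get), gains


# Deterministic toy data: 30 points near x = 0, then 30 points near x = 10.
X = [[0.1 * (i % 5), 0.0] for i in range(30)]
X += [[10.0 + 0.1 * (i % 5), 0.0] for i in range(30)]

split, gains = best_split(X)
print(split)  # → 30
```

The neighbor lists are computed once because they do not depend on the candidate split; only the class labels change. Extending this to multiple change points would require a segmentation scheme such as binary segmentation on top of the single-split search.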

Abstract We propose estimation methods for change points in high-dimensional covariance structures with an emphasis on challenging scenarios with missing values. We advocate three imputation-like methods and investigate their implications on common losses used for change point detection. We also discuss how model selection methods have to be adapted to the setting of incomplete data. The methods are compared in a simulation study and applied to a time series from an environmental monitoring system.

Authors Malte Londschien, Solt Kovács and Peter Bühlmann

Submitted Journal of Computational and Graphical Statistics

Link