Fedor Sergeev, MSc
"With four parameters I can fit an elephant, and with five I can make him wiggle his trunk." - John von Neumann
PhD Student
- fedor.sergeev@inf.ethz.ch
- Address
  Department of Computer Science
  Biomedical Informatics Group
  Universitätstrasse 6
  8092 Zürich
- Room
  CAB F53
Informing machine learning models with human insights for better performance, reliability, and interpretability
I did my BSc in Applied Mathematics and Physics at MIPT. I developed computational methods for simulations in physics under the supervision of Igor Petrov and Nikolay Khoklov. In parallel, I worked on applications of deep learning in high-energy physics at GSI and LAMBDA.
For my MSc, I studied Computational Sciences and Engineering at EPFL, combining my interests in numerical and data-driven modeling. My thesis, with Pascal Fua and Jonathan Donier, was on physics-informed neural networks for modeling fluid flow. During my studies, I interned at the startups Spiden and Neural Concept, working on synthetic data generation for medical spectroscopic data and on 3D computer vision for advanced engineering, respectively.
I joined the BMI lab in July 2023 to work on multimodal, representation, and Bayesian deep learning for intensive care unit (ICU) data. I am also an ELLIS PhD student, co-supervised by Vincent Fortuin.
Latest Publications
Abstract: Notable progress has been made in generalist medical large language models across various healthcare areas. However, large-scale modeling of in-hospital time series data, such as vital signs, lab results, and treatments in critical care, remains underexplored. Existing datasets are relatively small, but combining them can enhance patient diversity and improve model robustness. To effectively utilize these combined datasets for large-scale modeling, it is essential to address the distribution shifts caused by varying treatment policies, necessitating the harmonization of treatment variables across the different datasets. This work aims to establish a foundation for training large-scale multivariate time series models on critical care data and to provide a benchmark for machine learning models in transfer learning across hospitals to study and address distribution shift challenges. We introduce a harmonized dataset for sequence modeling and transfer learning research, representing the first large-scale collection to include core treatment variables. Future plans involve expanding this dataset to support further advancements in transfer learning and the development of scalable, generalizable models for critical healthcare applications.
Authors: Manuel Burger, Fedor Sergeev, Malte Londschien, Daphné Chopard, Hugo Yèche, Eike Gerdes, Polina Leshetkina, Alexander Morgenroth, Zeynep Babür, Jasmina Bogojeska, Martin Faltys, Rita Kuznetsova, Gunnar Rätsch
Submitted: Best Paper @ NeurIPS AIM-FM Workshop 2024

Abstract: Knowing which features of a multivariate time series to measure and when is a key task in medicine, wearables, and robotics. Better acquisition policies can reduce costs while maintaining or even improving the performance of downstream predictors. Inspired by the maximization of conditional mutual information, we propose an approach to train acquirers end-to-end using only the downstream loss. We show that our method outperforms a random acquisition policy and matches a model with an unrestrained budget, but does not yet overtake a static acquisition strategy. We highlight the assumptions and outline avenues for future work.
Authors: Fedor Sergeev, Paola Malsot, Gunnar Rätsch, Vincent Fortuin
Submitted: SPIGM ICML Workshop