Welcome to the Biomedical Informatics Lab of Prof. Dr. Gunnar Rätsch

The research in our group lies at the interface between methods research in Machine Learning, Genomics and Medical Informatics and relevant applications in biology and medicine.

We develop new analysis techniques that are capable of dealing with large amounts of medical and genomic data. These techniques aim to provide accurate predictions on the phenomenon at hand and to comprehensibly provide reasons for their prognoses, and thereby assist in gaining new biomedical insights.

Current research includes a) Machine Learning related to time-series analysis and iterative optimization algorithms, b) methods for transcriptome analyses to study transcriptome alterations in cancer, c) developing clinical decision support systems, in particular, for time series data from intensive care units, d) new graph genome algorithms to store and analyze very large sets of genomic sequences, and e) developing methods and resources for international sharing of genomic and clinical data, for instance, about variants in BRCA1/2.

Abstract Intensive care units (ICU) are increasingly looking towards machine learning for methods to provide online monitoring of critically ill patients. In machine learning, online monitoring is often formulated as a supervised learning problem. Recently, contrastive learning approaches have demonstrated promising improvements over competitive supervised benchmarks. These methods rely on well-understood data augmentation techniques developed for image data which do not apply to online monitoring. In this work, we overcome this limitation by supplementing time-series data augmentation techniques with a novel contrastive learning objective which we call neighborhood contrastive learning (NCL). Our objective explicitly groups together contiguous time segments from each patient while maintaining state-specific information. Our experiments demonstrate a marked improvement over existing work applying contrastive methods to medical time-series.

Authors Hugo Yèche, Gideon Dresdner, Francesco Locatello, Matthias Hüser, Gunnar Rätsch

Submitted ICML 2021


Abstract Generating interpretable visualizations of multivariate time series in the intensive care unit is of great practical importance. Clinicians seek to condense complex clinical observations into intuitively understandable critical illness patterns, like failures of different organ systems. They would greatly benefit from a low-dimensional representation in which the trajectories of the patients' pathology become apparent and relevant health features are highlighted. To this end, we propose to use the latent topological structure of Self-Organizing Maps (SOMs) to achieve an interpretable latent representation of ICU time series and combine it with recent advances in deep clustering. Specifically, we (a) present a novel way to fit SOMs with probabilistic cluster assignments (PSOM), (b) propose a new deep architecture for probabilistic clustering (DPSOM) using a VAE, and (c) extend our architecture to cluster and forecast clinical states in time series (T-DPSOM). We show that our model achieves superior clustering performance compared to state-of-the-art SOM-based clustering methods while maintaining the favorable visualization properties of SOMs. On the eICU data-set, we demonstrate that T-DPSOM provides interpretable visualizations of patient state trajectories and uncertainty estimation. We show that our method rediscovers well-known clinical patient characteristics, such as a dynamic variant of the Acute Physiology And Chronic Health Evaluation (APACHE) score. Moreover, we illustrate how it can disentangle individual organ dysfunctions on disjoint regions of the two-dimensional SOM map.

Authors Laura Manduchi, Matthias Hüser, Martin Faltys, Julia Vogt, Gunnar Rätsch, Vincent Fortuin

Submitted ACM-CHIL 2021


Abstract Dynamic assessment of mortality risk in the intensive care unit (ICU) can be used to stratify patients, inform about treatment effectiveness or serve as part of an early-warning system. Static risk scoring systems, such as APACHE or SAPS, have recently been supplemented with data-driven approaches that track the dynamic mortality risk over time. Recent works have focused on enhancing the information delivered to clinicians even further by producing full survival distributions instead of point predictions or fixed horizon risks. In this work, we propose a non-parametric ensemble model, Weighted Resolution Survival Ensemble (WRSE), tailored to estimate such dynamic individual survival distributions. Inspired by the simplicity and robustness of ensemble methods, the proposed approach combines a set of binary classifiers spaced according to a decay function reflecting the relevance of short-term mortality predictions. Models and baselines are evaluated under weighted calibration and discrimination metrics for individual survival distributions which closely reflect the utility of a model in ICU practice. We show competitive results with state-of-the-art probabilistic models, while greatly reducing training time by factors of 2-9x.

Authors Jonathan Heitz, Joanna Ficek, Martin Faltys, Tobias M. Merz, Gunnar Rätsch, Matthias Hüser

Submitted Proceedings of the AAAI-2021 - Spring Symposium on Survival Prediction


Abstract Learning object-centric representations of complex scenes is a promising step towards enabling efficient abstract reasoning from low-level perceptual features. Yet, most deep learning approaches learn distributed representations that do not capture the compositional properties of natural scenes. In this paper, we present the Slot Attention module, an architectural component that interfaces with perceptual representations such as the output of a convolutional neural network and produces a set of task-dependent abstract representations which we call slots. These slots are exchangeable and can bind to any object in the input by specializing through a competitive procedure over multiple rounds of attention. We empirically demonstrate that Slot Attention can extract object-centric representations that enable generalization to unseen compositions when trained on unsupervised object discovery and supervised property prediction tasks.

Authors Francesco Locatello, Dirk Weissenborn, Thomas Unterthiner, Aravindh Mahendran, Georg Heigold, Jakob Uszkoreit, Alexey Dosovitskiy, Thomas Kipf

Submitted NeurIPS 2020 (spotlight)


Abstract We call upon the research community to standardize efforts to use daily self-reported data about COVID-19 symptoms in the response to the pandemic and to form a collaborative consortium to maximize global gain while protecting participant privacy. The rapid and global spread of COVID-19 led the World Health Organization to declare it a pandemic on 11 March 2020. One factor contributing to the spread of the pandemic is the lack of information about who is infected, in large part because of the lack of testing. This facilitated the silent spread of the causative coronavirus (SARS-CoV-2), which led to delays in public-health and government responses and an explosion in cases. In countries that have tested more aggressively and that had the capacity to transparently share this data, such as South Korea and Singapore, the spread of disease has been greatly slowed1. Although efforts are underway around the world to substantially ramp up testing capacity, technology-driven approaches to collecting self-reported information can fill an immediate need and complement official diagnostic results. This type of approach has been used for tracking other diseases, notably influenza2. The information collected may include health status that is self-reported through surveys, including those from mobile apps; results of diagnostic laboratory tests; and other static and real-time geospatial data. The collection of privacy-protected information from volunteers about health status over time may enable researchers to leverage these data to predict, respond to and learn about the spread of COVID-19. Given the global nature of the disease, we aim to form an international consortium, tentatively named the ‘Coronavirus Census Collective’, to serve as a hub for amassing this type of data and to create a unified platform for global epidemiological data collection and analysis.

Authors Segal E, Zhang F, Lin X, King G, Shalem O, Shilo S, Allen WE, Alquaddoomi F, Altae-Tran H, Anders S, Balicer R, Bauman T, Bonilla X, Booman G, Chan AT, Cohen O, Coletti S, Natalie R Davidson, Dor Y, Drew DA, Elemento O, Evans G, Ewels P, Gale J, Gavrieli A, Geiger B, Grad YH, Greene CS, Hajirasouliha I, Jerala R, Kahles A, Kallioniemi O, Keshet A, Kocarev L, Landua G, Meir T, Muller A, Nguyen LH, Oresic M, Ovchinnikova S, Peterson H, Prodanova J, Rajagopal J, Rätsch G, Rossman H, Rung J, Sboner A, Sigaras A, Spector T, Steinherz R, Stevens I, Vilo J, Wilmes P.

Submitted Nature Medicine

Link DOI