Welcome to the Biomedical Informatics Lab of Prof. Dr. Gunnar Rätsch

The research in our group lies at the interface between methods research in Machine Learning, Genomics and Medical Informatics and relevant applications in biology and medicine.

We develop new analysis techniques that are capable of dealing with large amounts of medical and genomic data. These techniques aim to provide accurate predictions on the phenomenon at hand and to comprehensibly provide reasons for their prognoses, and thereby assist in gaining new biomedical insights.

Current research includes a) Machine Learning related to time-series analysis and iterative optimization algorithms, b) methods for transcriptome analyses to study transcriptome alterations in cancer, c) developing clinical decision support systems, in particular, for time series data from intensive care units, d) new graph genome algorithms to store and analyze very large sets of genomic sequences, and e) developing methods and resources for international sharing of genomic and clinical data, for instance, about variants in BRCA1/2.

Jonas Rothfuss, Vincent Fortuin, Andreas Krause PACOH: Bayes-Optimal Meta-Learning with PAC-Guarantees arXiv Preprints

Abstract Meta-learning can successfully acquire useful inductive biases from data, especially when a large number of meta-tasks are available. Yet, its generalization properties to unseen tasks are poorly understood. Particularly if the number of meta-tasks is small, this raises concerns for potential overfitting. We provide a theoretical analysis using the PAC-Bayesian framework and derive novel generalization bounds for meta-learning with unbounded loss functions and Bayesian base learners. Using these bounds, we develop a class of PAC-optimal meta-learning algorithms with performance guarantees and a principled meta-regularization. When instantiating our PAC-optimal hyper-posterior (PACOH) with Gaussian processes as base learners, the resulting approach consistently outperforms several popular meta-learning methods, both in terms of predictive accuracy and the quality of its uncertainty estimates.

Authors Jonas Rothfuss, Vincent Fortuin, Andreas Krause

Submitted arXiv Preprints

PCAWG Transcriptome Core Group, Claudia Calabrese, Natalie R. Davidson, Deniz Demircioğlu, Nuno A. Fonseca, Yao He, André Kahles, Kjong-Van Lehmann, Fenglin Liu, Yuichi Shiraishi, Cameron M. Soulette, Lara Urban, Liliana Greger, Siliang Li, Dongbing Liu, Marc D. Perry, Qian Xiang, Fan Zhang, Junjun Zhang, Peter Bailey, Serap Erkek, Katherine A. Hoadley, Yong Hou, Matthew R. Huska, Helena Kilpinen, Jan O. Korbel, Maximillian G. Marin, Julia Markowski, Tannistha Nandi, Qiang Pan-Hammarström, Chandra Sekhar Pedamallu, Reiner Siebert, Stefan G. Stark, Hong Su, Patrick Tan, Sebastian M. Waszak, Christina Yung, Shida Zhu, Philip Awadalla, Chad J. Creighton, Matthew Meyerson, B. F. Francis Ouellette, Kui Wu, Huanming Yang, PCAWG Transcriptome Working Group, Alvis Brazma, Angela N. Brooks, Jonathan Göke, Gunnar Rätsch, Roland F. Schwarz, Oliver Stegle, Zemin Zhang & PCAWG Consortium- Show fewer authors Nature volume 578, pages129–136(2020)Cite this article Genomic basis for RNA alterations in cancer Nature

Abstract Transcript alterations often result from somatic changes in cancer genomes. Various forms of RNA alterations have been described in cancer, including overexpression, altered splicing and gene fusions; however, it is difficult to attribute these to underlying genomic changes owing to heterogeneity among patients and tumour types, and the relatively small cohorts of patients for whom samples have been analysed by both transcriptome and whole-genome sequencing. Here we present, to our knowledge, the most comprehensive catalogue of cancer-associated gene alterations to date, obtained by characterizing tumour transcriptomes from 1,188 donors of the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA). Using matched whole-genome sequencing data, we associated several categories of RNA alterations with germline and somatic DNA alterations, and identified probable genetic mechanisms. Somatic copy-number alterations were the major drivers of variations in total gene and allele-specific expression. We identified 649 associations of somatic single-nucleotide variants with gene expression in cis, of which 68.4% involved associations with flanking non-coding regions of the gene. We found 1,900 splicing alterations associated with somatic mutations, including the formation of exons within introns in proximity to Alu elements. In addition, 82% of gene fusions were associated with structural variants, including 75 of a new class, termed ‘bridged’ fusions, in which a third genomic location bridges two genes. We observed transcriptomic alteration signatures that differ between cancer types and have associations with variations in DNA mutational signatures. This compendium of RNA alterations in the genomic context provides a rich resource for identifying genes and mechanisms that are functionally implicated in cancer.

Authors PCAWG Transcriptome Core Group, Claudia Calabrese, Natalie R. Davidson, Deniz Demircioğlu, Nuno A. Fonseca, Yao He, André Kahles, Kjong-Van Lehmann, Fenglin Liu, Yuichi Shiraishi, Cameron M. Soulette, Lara Urban, Liliana Greger, Siliang Li, Dongbing Liu, Marc D. Perry, Qian Xiang, Fan Zhang, Junjun Zhang, Peter Bailey, Serap Erkek, Katherine A. Hoadley, Yong Hou, Matthew R. Huska, Helena Kilpinen, Jan O. Korbel, Maximillian G. Marin, Julia Markowski, Tannistha Nandi, Qiang Pan-Hammarström, Chandra Sekhar Pedamallu, Reiner Siebert, Stefan G. Stark, Hong Su, Patrick Tan, Sebastian M. Waszak, Christina Yung, Shida Zhu, Philip Awadalla, Chad J. Creighton, Matthew Meyerson, B. F. Francis Ouellette, Kui Wu, Huanming Yang, PCAWG Transcriptome Working Group, Alvis Brazma, Angela N. Brooks, Jonathan Göke, Gunnar Rätsch, Roland F. Schwarz, Oliver Stegle, Zemin Zhang & PCAWG Consortium- Show fewer authors Nature volume 578, pages129–136(2020)Cite this article

Submitted Nature

Matthias Hüser, Adrian Kündig, Walter Karlen, Valeria De Luca, Martin Jaggi Forecasting intracranial hypertension using multi-scale waveform metrics Physiological Measurement

Abstract Objective: Acute intracranial hypertension is an important risk factor of secondary brain damage after traumatic brain injury. Hypertensive episodes are often diagnosed reactively, leading to late detection and lost time for intervention planning. A pro-active approach that predicts critical events several hours ahead of time could assist in directing attention to patients at risk. Approach: We developed a prediction framework that forecasts onsets of acute intracranial hypertension in the next 8 hours. It jointly uses cerebral auto-regulation indices, spectral energies and morphological pulse metrics to describe the neurological state of the patient. One-minute base windows were compressed by computing signal metrics, and then stored in a multi-scale history, from which physiological features were derived. Main results: Our model predicted events up to 8 hours in advance with alarm recall rates of 90% at a precision of 30% in the MIMIC- III waveform database, improving upon two baselines from the literature. We found that features derived from high-frequency waveforms substantially improved the prediction performance over simple statistical summaries of low-frequency time series, and each of the three feature classes contributed to the performance gain. The inclusion of long-term history up to 8 hours was especially important. Significance: Our results highlight the importance of information contained in high-frequency waveforms in the neurological intensive care unit. They could motivate future studies on pre-hypertensive patterns and the design of new alarm algorithms for critical events in the injured brain.

Authors Matthias Hüser, Adrian Kündig, Walter Karlen, Valeria De Luca, Martin Jaggi

Submitted Physiological Measurement

Francesco Locatello, Alp Yurtsever, Olivier Fercoq, Volkan Cevher Stochastic Conditional Gradient Method for Composite Convex Minimization NeurIPS 2019

Abstract In this paper, we propose the first practical algorithm to minimize stochastic composite optimization problems over compact convex sets. This template allows for affine constraints and therefore covers stochastic semidefinite programs (SDPs), which are vastly applicable in both machine learning and statistics. In this setup, stochastic algorithms with convergence guarantees are either not known or not tractable. We tackle this general problem and propose a convergent, easy to implement and tractable algorithm. We prove $\mathcal{O}(k^{-1/3})$ convergence rate in expectation on the objective residual and $\mathcal{O}(k^{-5/12})$ in expectation on the feasibility gap. These rates are achieved without increasing the batchsize, which can contain a single sample. We present extensive empirical evidence demonstrating the superiority of our algorithm on a broad range of applications including optimization of stochastic SDPs.

Authors Francesco Locatello, Alp Yurtsever, Olivier Fercoq, Volkan Cevher

Submitted NeurIPS 2019

Laura Manduchi, Matthias Hüser, Gunnar Rätsch, Vincent Fortuin DPSOM: Deep Probabilistic Clustering with Self-Organizing Maps arXiv Preprints

Abstract Generating visualizations and interpretations from high-dimensional data is a common problem in many applications. Two key approaches for tackling this problem are clustering and representation learning. On the one hand, there are very performant deep clustering models,such as DEC and IDEC. On the other hand, there are interpretable representation learning techniques, often relying on latent topological structures such as self-organizing maps. However, current methods do not yet successfully combine these two approaches. We present a novel way to fit self-organizing maps with probabilistic cluster assignments, PSOM, a new deep architecture for probabilistic clustering, DPSOM, and its extension to time series data, T-DPSOM. We show that they achieve superior clustering performance compared to current deep clustering methods on static MNIST/Fashion-MNIST data as well as medical time series, while also inducing an interpretable representation. Moreover, on medical time series, T-DPSOM successfully predicts future trajectories in the original data space.

Authors Laura Manduchi, Matthias Hüser, Gunnar Rätsch, Vincent Fortuin

Submitted arXiv Preprints

Date 06 Feb 2020

Date 12 Jun 2019

Date 10 Jan 2019