Probabilistic Modelling and Bayesian Deep Learning

Deep learning has revolutionised machine learning and led to new possibilities in many areas of science. However, neural networks often produce unreliable uncertainty estimates and require human expertise to apply them to new data sets. Our research aims to mitigate these limitations through fundamental research in probabilistic methods and Bayesian inference for deep learning.

Uncertainty estimation is necessary to make informed decisions in dynamic environments and to decide when a machine learning system should defer to an expert because of the risk of a wrong prediction. Our group has contributed to improving the uncertainty calibration of Bayesian neural networks [1, 2], investigated the cold-posterior effect [3], and released software for research on and application of the proposed methods [4, 5].
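
As a minimal, hypothetical illustration of this deferral setting (not the specific methods of [1-3]), the PyTorch sketch below averages softmax predictions over a deep ensemble and flags inputs whose predictive entropy is high; `models`, `x`, and `threshold` are placeholders.

```python
import torch
import torch.nn.functional as F

def ensemble_predict(models, x):
    """Average softmax predictions over ensemble members (placeholder models)."""
    with torch.no_grad():
        probs = torch.stack([F.softmax(m(x), dim=-1) for m in models])
    return probs.mean(dim=0)  # shape: (batch, classes)

def predictive_entropy(probs, eps=1e-12):
    """Entropy of the mean predictive distribution; high values flag
    inputs that may need review by an expert."""
    return -(probs * probs.clamp_min(eps).log()).sum(dim=-1)

# Usage sketch: `models` is a list of independently trained networks,
# `x` a batch of inputs, `threshold` an application-specific cutoff.
# probs = ensemble_predict(models, x)
# defer_to_expert = predictive_entropy(probs) > threshold
```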

Model selection for deep learning requires in-depth knowledge of the data at hand to construct architectures with the right inductive biases, avoid overfitting, and achieve good generalisation. These choices introduce many hyperparameters that are costly to tune or require human expertise. Our research on modern Laplace approximations enables selecting inductive biases without validation data, for example choosing suitable architectures [6] and representations [7], and optimising invariances using gradients [8].
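
As a concrete starting point, the sketch below uses the laplace-torch library released with [4] to fit a post-hoc Laplace approximation and estimate the marginal likelihood of a trained network, which can then be compared across candidate architectures or hyperparameters without a validation set. `model` and `train_loader` are placeholders, and exact call signatures may differ between library versions.

```python
from laplace import Laplace  # library from [4]: pip install laplace-torch

# `model` is a trained torch.nn.Module, `train_loader` its training data.
la = Laplace(model, 'classification',
             subset_of_weights='all', hessian_structure='kron')
la.fit(train_loader)

# Tune the prior precision by maximising the Laplace marginal likelihood;
# no validation data needed. Higher values across candidate models
# indicate better-suited inductive biases.
la.optimize_prior_precision(method='marglik')
log_marglik = la.log_marginal_likelihood()
```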

Latent-variable models enable learning lower-dimensional and more interpretable representations of data. Especially in the biomedical context, the data are often multimodal and temporal, which poses problems for standard approaches. To this end, our group has proposed advanced probabilistic latent-variable models to improve mutational signature learning [9], imputation in time series with VAEs [10], and interpretable representations using probabilistic deep self-organising maps [11, 12].
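
For intuition, here is a minimal Gaussian VAE in PyTorch, a toy stand-in for the structured models above (it omits the Gaussian-process prior of [10] and the self-organising-map structure of [11, 12]); all dimensions are illustrative.

```python
import torch
import torch.nn as nn

class TinyVAE(nn.Module):
    """Minimal Gaussian VAE: linear encoder/decoder, standard-normal prior."""
    def __init__(self, d_in=32, d_latent=4):
        super().__init__()
        self.enc = nn.Linear(d_in, 2 * d_latent)  # outputs mean and log-variance
        self.dec = nn.Linear(d_latent, d_in)

    def forward(self, x):
        mu, logvar = self.enc(x).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterisation
        recon = self.dec(z)
        # Negative ELBO: squared reconstruction error + KL(q(z|x) || N(0, I))
        rec = ((recon - x) ** 2).sum(dim=-1)
        kl = 0.5 * (mu.pow(2) + logvar.exp() - 1 - logvar).sum(dim=-1)
        return (rec + kl).mean()

# Usage sketch: loss = TinyVAE()(batch); loss.backward()
```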

Involved group members: Alexander Immer, Gideon Dresdner, Vincent Fortuin (alumnus), Xinrui Lyu, Gunnar Rätsch

References
[1] Immer, Korzepa, Bauer. Improving predictions of Bayesian neural nets via local linearization. AISTATS 2021.
[2] D'Angelo, Fortuin. Repulsive Deep Ensembles are Bayesian. NeurIPS 2021.
[3] Fortuin*, Garriga-Alonso*, Ober, Wenzel, Rätsch, Turner, van der Wilk, Aitchison. Bayesian Neural Network Priors Revisited. ICLR 2022.
[4] Daxberger*, Kristiadi*, Immer*, Eschenhagen*, Bauer, Hennig. Laplace Redux: Effortless Bayesian Deep Learning. NeurIPS 2021.
[5] Fortuin, Garriga-Alonso, van der Wilk, Aitchison. BNNpriors: A library for Bayesian neural network inference with different prior distributions. Software Impacts 2021.
[6] Immer, Bauer, Fortuin, Rätsch, Khan. Scalable Marginal Likelihood Estimation for Model Selection in Deep Learning. ICML 2021.
[7] Immer, Hennigen, Fortuin, Cotterell. Probing as Quantifying Inductive Bias. ACL 2022.
[8] Immer*, van der Ouderaa*, Rätsch, Fortuin, van der Wilk. Invariance Learning in Deep Neural Networks with Differentiable Laplace Approximations. NeurIPS 2022.
[9] Lyu, Garret, Rätsch, Lehmann. Mutational signature learning with supervised negative binomial non-negative matrix factorization. Bioinformatics 2020.
[10] Fortuin, Baranchuk, Rätsch, Mandt. GP-VAE: Deep Probabilistic Time Series Imputation. AISTATS 2020.
[11] Fortuin, Hüser, Locatello, Strathmann, Rätsch. SOM-VAE: Interpretable Discrete Representation Learning on Time Series. ICLR 2019.
[12] Manduchi, Hüser, Faltys, Vogt, Rätsch, Fortuin. T-DPSOM: An Interpretable Clustering Method for Unsupervised Learning of Patient Health States. ACM CHIL 2021.