Vincent Fortuin, MSc CBB ETH UZH

"The scientist is not a person who gives the right answers, he's one who asks the right questions." - Claude Lévi-Strauss

PhD Student

+41 44 633 66 87
ETH Zürich
Department of Computer Science
Biomedical Informatics Group
Universitätsstrasse 6
CAB F 53.1
8006 Zürich
CAB F 51.2

I am interested in the interface between deep learning and probabilistic modeling. I am particularly keen to develop models that are more interpretable and data efficient, since these are two major requirements in the field of health care.

I did my undergraduate studies in Molecular Life Sciences at the University of Hamburg, where I worked on phylogeny inference for rapidly mutating virus strains with Andrew Torda. I then went to ETH Zürich to study Computational Biology and Bioinformatics, in a joint program with the University of Zürich, with a focus on systems biology and machine learning. My master's thesis was about the application of deep learning to gene regulatory network inference, under the supervision of Manfred Claassen. During my studies I also spent some time in Jacob Hanna's group at the Weizmann Institute of Science, working on multi-omics data analysis in stem cell research. Before joining the Biomedical Informatics group as a PhD student, I worked on deep learning applications in natural language understanding at Disney Research.

Abstract Human professionals are often required to make decisions based on complex multivariate time series measurements in an online setting, e.g., in health care. Since human cognition is not optimized to work well in high-dimensional spaces, these decisions benefit from interpretable low-dimensional representations. However, many representation learning algorithms for time series data are difficult to interpret. This is due to non-intuitive mappings from data features to salient properties of the representation and non-smoothness over time. To address this problem, we propose to couple a variational autoencoder to a discrete latent space and introduce a topological structure through the use of self-organizing maps. This allows us to learn discrete representations of time series, which give rise to smooth and interpretable embeddings with superior clustering performance. Furthermore, to allow for a probabilistic interpretation of our method, we integrate a Markov model in the latent space. This model uncovers the temporal transition structure, improves clustering performance even further, and provides additional explanatory insights as well as a natural representation of uncertainty. We evaluate our model on static (Fashion-)MNIST data, a time series of linearly interpolated (Fashion-)MNIST images, a chaotic Lorenz attractor system with two macro states, as well as on a challenging real-world medical time series application. In the latter experiment, our representation uncovers meaningful structure in the acute physiological state of a patient.
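
As a rough illustration of the ideas in the abstract, the following Python sketch shows how a continuous encoding could be assigned to the nearest node of a two-dimensional self-organizing map grid, how the grid neighbors of that node can be looked up, and how transition probabilities between nodes could be estimated from a sequence of assignments. This is not the published implementation: the function names, the 4-neighborhood on a rectangular grid, and the simple counting estimator for the transitions are assumptions made for this example, whereas the abstract describes a Markov model integrated into the latent space of the model itself.

```python
import numpy as np


def assign_to_som(z_e, embeddings):
    """Return (row, col) of the grid embedding closest to the encoding z_e.

    embeddings has shape (height, width, latent_dim); z_e has shape (latent_dim,).
    """
    h, w, _ = embeddings.shape
    dists = np.sum((embeddings - z_e) ** 2, axis=-1)  # squared distances, shape (h, w)
    return np.unravel_index(np.argmin(dists), (h, w))


def grid_neighbors(node, shape):
    """Up/down/left/right neighbors of a node that lie inside the grid (assumed 4-neighborhood)."""
    r, c = node
    h, w = shape
    candidates = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
    return [(i, j) for i, j in candidates if 0 <= i < h and 0 <= j < w]


def transition_matrix(nodes, shape):
    """Empirical transition probabilities between grid nodes, estimated by counting.

    nodes is a time-ordered list of (row, col) assignments. Rows of the result
    that correspond to nodes never left in the sequence stay all-zero.
    """
    h, w = shape
    k = h * w
    counts = np.zeros((k, k))
    flat = [r * w + c for r, c in nodes]  # flatten grid coordinates to node indices
    for a, b in zip(flat[:-1], flat[1:]):
        counts[a, b] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    return np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)
```

In a trained model the embeddings would be learned jointly with the encoder and decoder; here they are simply passed in as an array, so the sketch only covers the discrete assignment step and the bookkeeping around it.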

Authors Vincent Fortuin, Matthias Hüser, Francesco Locatello, Heiko Strathmann, Gunnar Rätsch

Submitted arXiv