Machine Learning Research

The group is interested in understanding and developing machine learning models applicable to the challenges posed by analysing biomedical data. With a long history in developing large-scale learning methods for sequence classification on genomes, and experience in clinical data analysis, the group seeks to advance the state of the art in deep learning, probabilistic modelling, and time-series analysis.

Computational Genomics and Transcriptomics

With the advent of high-throughput sequencing technologies, genomics and transcriptomics have experienced a challenging renewal. Cancer research in particular has benefited greatly from the newly available data, but has also raised challenging computational problems. RNA sequencing has enhanced transcriptome analysis and opened up great opportunities for gene discovery and the identification of alternative transcripts. We have made numerous contributions to the field, tackling problems such as isoform identification and quantification, differential expression analysis, and the identification of alternative splicing events. DNA sequencing has revolutionized the identification of genomic variants and makes it possible to link changes in the genome to changes in other molecular phenotypes. We actively work on methods to accurately determine these phenotypes and to associate them with (somatic) alterations in patient genomes.

Data Structures for Genome Representation

The availability of fast and affordable high-throughput DNA and RNA sequencing has transformed biology and medicine into data-driven research areas. We are working on a compressed, distributed storage system for reference genomes and DNA sequencing data that dynamically scales to the various needs of individual research projects.

Comprehensive Patient Representations

The group develops innovative methods for the analysis of electronic health records (EHR) with the objective of automatically summarising patient states over time and building predictive models. Ultimately, computational models of patient state can aid decision support systems that use EHR, integrating genomic information and other data sources to provide comprehensive, automated suggestions for treatments and prognoses, and to assist in clinical trial enrolment.