261-5112-00L Algorithms and Data Structures for Population Scale Genomics (Spring 2022)

Semester: Spring Semester 2022
Lecturers: A. Kahles
Periodicity: yearly course
Language of instruction: English

Research in biology and medicine has been transformed into a discipline of applied data science over the past years. Not only the size and inherent complexity of the data, but also requirements on data privacy and the complexity of search and access, pose a wealth of new research questions.

This interactive block course will explore the latest research on algorithms and data structures for population scale genomics applications and give insights into both the technical basis and the domain questions motivating it.

Over the duration of one week, the course will cover several key algorithmic topics. Each of the topics will consist of 50% lecture content and 50% interactive work.

1) Algorithms and data structures for text and graph compression. Motivated by applications in compressive genomics, the course will cover succinct indexing schemes for strings, trees and general graphs, compression schemes for binary matrices, and the efficient representation of haplotypes and genomic variants.

2) Stochastic data structures and algorithms for approximate representation of strings and graphs as well as sets in general. This includes winnowing schemes and minimizers, sketching techniques, (minimal perfect) hashing and approximate membership query data structures.
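
To give a flavour of the second topic, the following is a minimal sketch of a window minimizer scheme in Python. It is an illustration only and not taken from the course material: the function name `minimizers`, the parameters `k` (k-mer length) and `w` (number of consecutive k-mers per window), and the choice of plain lexicographic order with leftmost tie-breaking are all assumptions made for this example. Practical tools typically order k-mers by a hash value instead.

```python
def minimizers(seq, k, w):
    """Return a sorted list of (position, k-mer) minimizers of seq.

    Hypothetical helper for illustration: every window of w consecutive
    k-mers contributes its lexicographically smallest k-mer, with ties
    broken in favour of the leftmost occurrence.
    """
    if k <= 0 or w <= 0:
        raise ValueError("k and w must be positive")

    # All k-mers of the sequence, indexed by their start position.
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]

    selected = set()
    # Slide a window over w consecutive k-mers.
    for start in range(len(kmers) - w + 1):
        window = range(start, start + w)
        # Smallest k-mer in the window; the index breaks ties to the left.
        best = min(window, key=lambda i: (kmers[i], i))
        selected.add((best, kmers[best]))

    return sorted(selected)


if __name__ == "__main__":
    for pos, kmer in minimizers("ACGTACGT", k=3, w=3):
        print(pos, kmer)
```

Adjacent windows often select the same k-mer, so the selected set is usually much smaller than the set of all k-mers. This sampling property is what makes minimizers useful as a compact sketch of a sequence.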

Prerequisites / Notice
Data Structures & Algorithms


The lecture is scheduled to be held at ETH in LFW B 3.

Course Overview

All course contents will be made available to the participants via Moodle.