261-5112-00L Algorithms and Data Structures for Population Scale Genomics (Autumn 2022)

Semester: Autumn Semester 2022
Lecturers: A. Kahles
Periodicity: yearly course
Language of instruction: English

Abstract
Research in biology and medicine has been transformed into a discipline of applied data science over the past years. Not only the size and inherent complexity of the data, but also the requirements on data privacy and the complexity of search and access, pose a wealth of new research questions.

Objective
This interactive semester course will explore the latest research on algorithms and data structures for population scale genomics applications and give insights into both the technical basis and the domain questions motivating it.

Content
Over the duration of one week, the course will cover several key algorithmic topics. Each of the topics will consist of 50% lecture content and 50% interactive work.

1) Algorithms and data structures for text and graph compression. Motivated by applications in compressive genomics, the course will cover succinct indexing schemes for strings, trees and general graphs, compression schemes for binary matrices, and the efficient representation of haplotypes and genomic variants.

2) Stochastic data structures and algorithms for approximate representation of strings and graphs as well as sets in general. This includes winnowing schemes and minimizers, sketching techniques, (minimal perfect) hashing and approximate membership query data structures.
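
As a small illustration of the minimizer schemes named in topic 2, the following is a minimal sketch and not part of the official course material. It assumes lexicographic ordering of k-mers and leftmost tie-breaking; the function name `minimizers` and the parameters `k` (k-mer length) and `w` (number of consecutive k-mers per window) are illustrative choices. Practical tools typically order k-mers by a hash value instead.

```python
def minimizers(seq: str, k: int, w: int) -> list[tuple[int, str]]:
    """Return the distinct (position, k-mer) minimizers of seq.

    For every window of w consecutive k-mers, the lexicographically
    smallest k-mer is selected (leftmost on ties). Assumption: plain
    lexicographic order; real implementations often use a hash order.
    """
    # All k-mers of the sequence, indexed by their start position.
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    picked: list[tuple[int, str]] = []
    for start in range(len(kmers) - w + 1):
        window = kmers[start:start + w]
        # min() returns the first smallest index, i.e. the leftmost minimum.
        offset = min(range(w), key=lambda j: window[j])
        pos = start + offset
        # Selected positions never decrease from one window to the next,
        # so comparing with the last entry is enough to avoid duplicates.
        if not picked or picked[-1][0] != pos:
            picked.append((pos, window[offset]))
    return picked


if __name__ == "__main__":
    for pos, kmer in minimizers("ACGTACGT", k=3, w=2):
        print(pos, kmer)
```

Because adjacent windows often share their smallest k-mer, the selected set is usually much smaller than the set of all k-mers, which is what makes minimizers useful for compact, approximate sequence indexes.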

Prerequisites / Notice
Data Structures & Algorithms

Location

The lecture will be held at ETH in room CHN D 48, Wednesdays 14:15-16:00.

Course Overview

All course contents will be made available to the participants via Moodle.