261-5100-00L Computational Biomedicine (Autumn 2021)

Semester Autumn Semester 2021
Lecturers Gunnar Rätsch; Valentina Boeva
Periodicity yearly course
Language of instruction English

Abstract
The course critically reviews central problems in Biomedicine and discusses the technical foundations and solutions for these problems.

Objective
Over the past years, rapid technological advancements have transformed classical disciplines such as biology and medicine into fields of applied data science. While the sheer amount of collected data often makes computational approaches inevitable for analysis, it is the domain-specific structure and the close relation to research and the clinic that call for accurate, robust and efficient algorithms. In this course we will critically review central problems in Biomedicine and discuss the technical foundations and solutions for these problems.

Content
The course will consist of three topic clusters that will cover different aspects of data science problems in Biomedicine:

1) String algorithms for the efficient representation, search, comparison, composition and compression of large sets of strings, mostly originating from DNA or RNA sequencing. This includes genome assembly, efficient index data structures for strings and graphs, alignment techniques, as well as quantitative approaches.
2) Statistical models and algorithms for the assessment and functional analysis of individual genomic variations. This includes the identification of variants, prediction of functional effects, imputation and integration problems, as well as the association with clinical phenotypes.
3) Models for the organization and representation of large-scale biomedical data. This includes ontology concepts, biomedical databases, sequence annotation and data compression.
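
As a small illustration of the kind of string index covered in the first topic cluster, the following sketch builds a suffix array over a DNA string and uses binary search to locate all occurrences of a pattern. It is not taken from the course material: the function names `build_suffix_array` and `find_occurrences` and the example sequence are assumptions chosen for illustration, and the construction is deliberately naive (it sorts whole suffixes), whereas practical genome indexes rely on far more efficient algorithms.

```python
def build_suffix_array(text):
    """Return the start positions of all suffixes of text in lexicographic order.

    Naive construction for illustration only: sorting full suffixes is far
    too slow and memory-hungry for genome-scale input.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text, suffix_array, pattern):
    """Return the sorted start positions of pattern in text.

    All suffixes that start with pattern form one contiguous block in the
    suffix array, so two binary searches find the block boundaries.
    """
    m = len(pattern)

    # First binary search: leftmost suffix whose m-character prefix >= pattern.
    lo, hi = 0, len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        start = suffix_array[mid]
        if text[start:start + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    block_start = lo

    # Second binary search: leftmost suffix whose m-character prefix > pattern.
    hi = len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        start = suffix_array[mid]
        if text[start:start + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    block_end = lo

    return sorted(suffix_array[block_start:block_end])


# Hypothetical example sequence, not course data.
genome = "GATTACAGATTACA"
index = build_suffix_array(genome)
hits = find_occurrences(genome, index, "TTA")
```

Once the index is built, each query costs a number of string comparisons that is logarithmic in the text length, which is the basic reason for indexing a reference once and searching it many times.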

Prerequisites / Notice
Data Structures & Algorithms, Introduction to Machine Learning, Statistics/Probability, Programming in Python, Unix Command Line.

Teaching Material

Lectures and tutorials will be held fully online via Zoom on Tuesdays, 10-12 and 13-14 respectively. The Zoom password can be found on Moodle.
Questions and group formation for projects will take place on Moodle.
Project development and submission will be done on GitLab.

Course Overview

Date Topic Course Material
21.09.2021 Lecture: Introduction to the topic and patient genomics
Exercise: Organization and presentation of projects Tutorial Slides 01
28.09.2021 Lecture: String algorithms, indexing and search
Exercise: Tutorial String algorithms, indexing and search
05.10.2021 Lecture: Indexes of linear sequences and alignment
Exercise: Tutorial Indexes of linear sequences and alignment / Hand out of project 1
12.10.2021 Lecture: Variation-aware alignment, Indexes on graphs, succinct data structures
Exercise: Tutorial Variation-aware alignment, Indexes on graphs, succinct data structures
19.10.2021 Lecture: Transcript identification and quantification
Exercise: Tutorial Transcript identification and quantification
26.10.2021 Lecture: Differential Gene expression
Exercise: Tutorial Differential Gene expression
02.11.2021 Lecture: Single Cell expression data
Exercise: Hand in project 1 (due 11:59pm) / Tutorial Single Cell expression data
09.11.2021 Lecture: Variant calling (germline)
Exercise: Hand out of project 2 / Tutorial Differential Gene expression case study
16.11.2021 Lecture: Linking genotypic information to clinical phenotypes
Exercise: Project 1 presentations
23.11.2021 Lecture: Variant interpretation and effect prediction
Exercise: Tutorial Variant effect
30.11.2021 Lecture: Ontologies and Variant Interpretation
Exercise: Tutorial Variant Calling (germline)
07.12.2021 Lecture: Sample Exam questions Q&A (10-11am; No Lecture 11am - 12pm)
Exercise: Hand in project 2 (due 11:59pm) / Tutorial Variant Calling (somatic)
14.12.2021 No Lecture
Exercise: Tutorial Ontologies
21.12.2021 Lecture: Somatic Variants, Research talk, Summary of course, Q & A for exam
Exercise: Project 2 presentations