261-5120-00L Machine Learning for Health Care (Spring 2020)

Semester: Spring Semester 2020
Lecturers: G. Rätsch, J. Vogt, V. Boeva
Periodicity: yearly course
Language of instruction: English

Abstract
The course will review the most relevant methods and applications of Machine Learning in Biomedicine, and discuss the main challenges they present and their current technical solutions.

Objective
In recent years, the field of Machine Learning (ML) has grown rapidly, mainly due to improvements in ML algorithms, increased data availability and reduced computing costs. This growth is having a profound impact on biomedical applications, where the great variety of tasks and data types allows ML algorithms to be applied in many different ways. In this course we will review the most relevant methods and applications of ML in biomedicine, and discuss the main challenges they present and their current technical solutions.

Content
The course will consist of four topic clusters that will cover the most relevant applications of ML in Biomedicine:

1) Structured time series: Time series of structured data often appear in biomedical datasets and present challenges such as variables with different periodicities and dependence on static data.
2) Medical notes: Vast amounts of medical observations are stored as free text. We will analyze strategies for extracting knowledge from them.
3) Medical images: Images are a fundamental piece of information in many medical disciplines. We will study how to train ML algorithms with them.
4) Genomics data: ML in genomics is still an emerging subfield, but given that genomics data are arguably the most extensive and complex datasets that can be found in biomedicine, it is expected that many relevant ML applications will arise in the near future. We will review and discuss current applications and challenges.

Prerequisites / Notice
Data Structures & Algorithms, Introduction to Machine Learning, Statistics/Probability, Programming in Python, Unix Command Line

Relation to Course 261-5100-00 Computational Biomedicine: This course is a continuation of the previous course with new topics related to medical data and machine learning. The format of Computational Biomedicine II will also be different. It is helpful but not essential to attend Computational Biomedicine before attending Computational Biomedicine II.

Location

The lecture will be held at ETH in ETF C 1.

Course Overview

Date Topic Course Materials
20.02.2020 Introduction Lecture Slides 01
Tutorial Slides 01
27.02.2020 Sequence Analysis and Time Series Lecture Slides 02
Tutorial Slides 02
05.03.2020 Survival Analysis Lecture Slides 03
Tutorial Slides 03
Paper Presentation 1. Deep convolutional neural network for the automated detection and diagnosis of seizure using EEG signals
2. Multitask Gaussian Processes for Multivariate Physiological Time-Series Analysis
12.03.2020 Natural Language Processing of Clinical Text Lecture Slides 04 [Video 04]
Tutorial Notebook 04
Paper Topics [Survival Analysis]
Paper Topics [NLP]
19.03.2020 Representation Learning Lecture Slides 05 [Video 05]
Paper Topics [NLP]
26.03.2020 TBD
Paper Topics [Representation Learning]
02.04.2020 Ethics and Big Data Lecture Slides 06 [Video 06 (password: DMII2020)]
09.04.2020 Privacy Preserving Computing Lecture Slides 07 [Video 07]
Paper Topics [Ethics]
16.04.2020 [No Class; Easter Break]
23.04.2020 Medical Imaging Analysis Lecture Slides 08 [Video 08]
Paper Topics [Privacy]
30.04.2020 Interpretability of Machine Learning Models Lecture Slides 09 [Video 09]
Paper Topics [Medical Imaging]
07.05.2020 Supervised Methods for Genetics and Transcriptomics Lecture Slides 10 [JupyterHub Exercise, Video 10]
Paper Topics [Interpretability]
14.05.2020 Unsupervised Methods for Genetics and Transcriptomics Lecture Slides 11 [JupyterHub Exercise, Video 11]
Paper Topics [Genetics]
21.05.2020 [No Class; Ascension Day]
28.05.2020 Exam examples and feedback Lecture Slides 12
Exam question examples
Paper Topics [Genetics]

Projects

Project 1: ECG time series
Deadline: 18.03.2020
Description: see slides of tutorial 2 (pp 31-44)
Data: download here
Codeshare: link to polybox (password sent by email)

Project 2: NLP tasks
Deadline: 09.04.2020
Project 2 description and data

Project 3: Image segmentation
Deadline: 29.04.2020
Project 3 description and data
Test Labels: download here

Project 4: Splice site prediction
Deadline: 20.05.2020
Project 4 description and data