Xinrui Lyu, MSc. EPFL in Electrical Engineering

"Problems worthy of attack prove their worth by fighting back." - Paul Erdős (1913-1996)

PhD Student

+41 44 632 23 74
ETH Zürich
Department of Computer Science
Biomedical Informatics Group
Universitätsstrasse 6
8092 Zürich

I joined Prof. Gunnar Rätsch's group in October 2016. My research interests lie in machine learning on medical time series and imaging.

Before joining the Biomedical Informatics Group at ETHZ, I received my M.Sc. in Electrical Engineering from École Polytechnique Fédérale de Lausanne (EPFL), Switzerland, and my B.Eng. in Electronic Engineering from Tsinghua University, China. I interned at the Technicolor R&I Center in France, where I worked on image search algorithms for six months in 2015.

Abstract Motivation: Understanding the underlying mutational processes of cancer patients has been a long-standing goal in the community and promises to provide new insights that could improve cancer diagnoses and treatments. Mutational signatures are summaries of the mutational processes, and improving the derivation of mutational signatures can yield new discoveries previously obscured by technical and biological confounders. Results from existing mutational signature extraction methods depend on the size of the available patient cohort and focus solely on the analysis of mutation count data, without exploiting metadata. Results: Here we present a supervised method that utilizes cancer type as metadata to extract more distinctive signatures. More specifically, we use a negative binomial non-negative matrix factorization and add a support vector machine loss. We show that mutational signatures extracted by our proposed method have a lower reconstruction error and are designed to be more predictive of cancer type than those generated by unsupervised methods. Unlike existing unsupervised signature extraction methods, this design reduces the need for elaborate post-processing strategies to recover most of the known signatures. Signatures extracted by a supervised model used in conjunction with cancer-type labels are also more robust, especially when using small and potentially cancer-type limited patient cohorts. Finally, we adapted our model such that molecular features can be utilized to derive a corresponding mutational signature. We used APOBEC expression and MUTYH mutation status to demonstrate the possibilities that arise from this ability. We conclude that our method, which exploits available metadata, improves the quality of mutational signatures and helps derive more interpretable representations.

Authors Xinrui Lyu, Jean Garret, Gunnar Rätsch, Kjong-Van Lehmann

Submitted Bioinformatics

Link DOI

Abstract Intensive-care clinicians are presented with large quantities of measurements from multiple monitoring systems. The limited ability of humans to process complex information hinders early recognition of patient deterioration, and high numbers of monitoring alarms lead to alarm fatigue. We used machine learning to develop an early-warning system that integrates measurements from multiple organ systems using a high-resolution database with 240 patient-years of data. It predicts 90% of circulatory-failure events in the test set, with 82% identified more than 2 h in advance, resulting in an area under the receiver operating characteristic curve of 0.94 and an area under the precision-recall curve of 0.63. On average, the system raises 0.05 alarms per patient and hour. The model was externally validated in an independent patient cohort. Our model provides early identification of patients at risk for circulatory failure with a much lower false-alarm rate than conventional threshold-based systems.

Authors Stephanie L. Hyland, Martin Faltys, Matthias Hüser, Xinrui Lyu, Thomas Gumbsch, Cristóbal Esteban, Christian Bock, Max Horn, Michael Moor, Bastian Rieck, Marc Zimmermann, Dean Bodenham, Karsten Borgwardt, Gunnar Rätsch & Tobias M. Merz

Submitted Nature Medicine


Abstract In this work, we investigate unsupervised representation learning on medical time series, which bears the promise of leveraging copious amounts of existing unlabeled data in order to eventually assist clinical decision making. By evaluating on the prediction of clinically relevant outcomes, we show that in a practical setting, unsupervised representation learning can offer clear performance benefits over end-to-end supervised architectures. We experiment with using sequence-to-sequence (Seq2Seq) models in two different ways, as an autoencoder and as a forecaster, and show that the best performance is achieved by a forecasting Seq2Seq model with an integrated attention mechanism, proposed here for the first time in the setting of unsupervised learning for medical time series.

Authors Xinrui Lyu, Matthias Hüser, Stephanie L. Hyland, George Zerveas, Gunnar Rätsch

Submitted Machine Learning for Health (ML4H) Workshop at NeurIPS 2018 - Spotlight


Abstract The deterioration of organ function in ICU patients requires swift response to prevent further damage to vital systems. Focusing on the circulatory system, we build a model to predict if a patient’s state will deteriorate in the near future. We identify circulatory system dysfunction using the combination of excess lactic acid in the blood and low mean arterial blood pressure or the presence of vasoactive drugs. Using an observational cohort of 45,000 patients from a Swiss ICU, we extract and process patient time series and identify periods of circulatory system dysfunction to develop an early warning system. We train a gradient boosting model to perform binary classification every five minutes on whether the patient will deteriorate during an increasingly large window into the future, up to the duration of a shift (8 hours). The model achieves an AUROC between 0.952 and 0.919 across the prediction windows, and an AUPRC between 0.223 and 0.384 for events with positive prevalence between 0.014 and 0.042. We also show preliminary results from a recurrent neural network. These results show that contemporary machine learning approaches combined with careful preprocessing of raw data collected during routine care yield clinically useful predictions in near real time. [Workshop Abstract]

Authors Stephanie Hyland, Matthias Hüser, Xinrui Lyu, Martin Faltys, Tobias Merz, Gunnar Rätsch

Submitted Proceedings of the First Joint Workshop on AI in Health
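
The labelling scheme described in the abstract above (dysfunction flagged from lactate together with low mean arterial pressure or vasoactive drugs, then a forward-looking binary label every five minutes) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the published pipeline: the function names, the numeric thresholds, and the choice to leave time steps already in dysfunction unlabelled are all hypothetical and are not taken from the abstract.

```python
from typing import List, Optional, Sequence

STEP_MINUTES = 5  # one prediction every five minutes, as stated in the abstract


def dysfunction_flags(
    lactate: Sequence[float],
    mean_arterial_pressure: Sequence[float],
    on_vasoactive_drugs: Sequence[bool],
    lactate_threshold: float = 2.0,  # hypothetical placeholder value
    map_threshold: float = 65.0,     # hypothetical placeholder value
) -> List[bool]:
    """Flag each time step as circulatory dysfunction:
    elevated lactate AND (low blood pressure OR vasoactive drugs)."""
    return [
        lac > lactate_threshold and (bp < map_threshold or drug)
        for lac, bp, drug in zip(lactate, mean_arterial_pressure, on_vasoactive_drugs)
    ]


def future_event_labels(
    flags: Sequence[bool], horizon_minutes: int
) -> List[Optional[int]]:
    """Label step t with 1 if any dysfunction occurs within the next
    `horizon_minutes`, else 0. Steps already in dysfunction get None
    (an assumption of this sketch: there is nothing left to predict)."""
    horizon_steps = horizon_minutes // STEP_MINUTES
    labels: List[Optional[int]] = []
    for t, in_dysfunction in enumerate(flags):
        if in_dysfunction:
            labels.append(None)
        else:
            # Look only at future steps; the window is shorter near the end.
            window = flags[t + 1 : t + 1 + horizon_steps]
            labels.append(int(any(window)))
    return labels
```

Calling `future_event_labels(flags, horizon_minutes=8 * 60)` would correspond to the longest window mentioned in the abstract (one 8-hour shift, i.e. 96 five-minute steps); shorter horizons give the smaller prediction windows.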


Abstract In this work, we propose a framework, dubbed Union-of-Subspaces SVM (US-SVM), to learn linear classifiers as sparse codes over a learned dictionary. In contrast to discriminative sparse coding with a learned dictionary, it is not the data but the classifiers that are sparsely encoded. Experiments in visual categorization demonstrate that, at training time, the joint learning of the classifiers and of the over-complete dictionary allows the discovery and sharing of mid-level attributes. The resulting classifiers further have a very compact representation in the learned dictionaries, offering substantial performance advantages over standard SVM classifiers for a fixed representation sparsity. This high degree of sparsity of our classifier also provides computational gains, especially in the presence of numerous classes. In addition, the learned atoms can help identify several intra-class modalities.

Authors Xinrui Lyu, Joaquin Zepeda and Patrick Perez

Submitted Proceedings of the British Machine Vision Conference (BMVC)

Link DOI
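
To make the idea in the abstract above more concrete, the sketch below shows only the representation it describes: each linear classifier is a sparse combination of dictionary atoms, so scoring many classes can reuse one set of atom responses. It does not implement the joint learning of classifiers and dictionary. The names `D`, `A`, the helper functions, and the random values are illustrative assumptions, not the notation or code of the paper.

```python
import numpy as np


def make_sparse_classifiers(n_features, n_atoms, n_classes, nonzeros, seed=0):
    """Build a random dictionary D (features x atoms) and a code matrix A
    (atoms x classes) with at most `nonzeros` active atoms per class."""
    rng = np.random.default_rng(seed)
    D = rng.standard_normal((n_features, n_atoms))
    A = np.zeros((n_atoms, n_classes))
    for c in range(n_classes):
        support = rng.choice(n_atoms, size=nonzeros, replace=False)
        A[support, c] = rng.standard_normal(nonzeros)
    return D, A


def scores_dense(D, A, x):
    """Score x against every class using the explicit classifiers W = D A."""
    W = D @ A  # one dense weight vector per class
    return W.T @ x


def scores_via_dictionary(D, A, x):
    """Same scores, computed by projecting x onto the atoms once and then
    combining the few active atom responses for each class."""
    atom_responses = D.T @ x  # shared across all classes
    return A.T @ atom_responses


# Over-complete dictionary (more atoms than features) and many classes.
D, A = make_sparse_classifiers(n_features=64, n_atoms=128, n_classes=1000, nonzeros=3)
x = np.random.default_rng(1).standard_normal(64)
assert np.allclose(scores_dense(D, A, x), scores_via_dictionary(D, A, x))
```

With a sparse storage format for `A`, the second route costs one projection onto the atoms plus a handful of multiply-adds per class, which is where a computational gain with numerous classes would come from; the dense arrays here are kept only for brevity.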

Abstract This paper presents an approach for using hierarchically structured multi-view features for mobile visual search. We utilize a graph model to describe the feature correspondences between multi-view images. To add features of images from new viewpoints, we design a level raising algorithm and the associated multi-view geometric verification, which are based on the properties of the hierarchical structure. With this approach, features from new viewpoints can be recursively added in an incremental fashion. Additionally, we design a query matching strategy which utilizes the advantage of the hierarchical structure. The experimental results show that our structure of the multi-view feature database can efficiently improve the performance of mobile visual search.

Authors X. Lyu, H. Li and M. Flierl

Submitted 2014 Data Compression Conference

Link DOI