Francesco Locatello, MSc. ETH Computer Science

“R2D2, you know better than to trust a strange computer!” - cit. C3PO

Alumni

E-Mail: locatelf@get-your-addresses-elsewhere.ethz.ch

I am interested in representation learning and causality. Previously, I also worked on convex optimization and Bayesian inference.

I am a doctoral fellow of the Max Planck-ETH Center for Learning Systems and ELLIS, supervised by Gunnar Rätsch and Bernhard Schölkopf. Previously, I graduated cum laude from the University of Padua in Information Engineering and then joined ETH for my master's in Computer Science. I hold a Google PhD Fellowship in Machine Learning and received the best paper award at the International Conference on Machine Learning (ICML) 2019 and the ISBA@NIPS award at the Advances in Approximate Bayesian Inference Workshop at NIPS 2017. I was part of the organizing team of the NeurIPS 2019 challenge "Disentanglement: From Simulation to Real-World". During my PhD I worked part-time as a research consultant for ETH and MPI (in collaboration with Google Research, Brain Team, Zurich) and interned at Google Research, Brain Team, Amsterdam.

Gideon Dresdner, Maria-Luiza Vladarean, Gunnar Rätsch, Francesco Locatello, Volkan Cevher, Alp Yurtsever. Faster One-Sample Stochastic Conditional Gradient Method for Composite Convex Minimization. Proceedings of The 25th International Conference on Artificial Intelligence and Statistics (AISTATS-22).

Abstract We propose a stochastic conditional gradient method (CGM) for minimizing convex finite-sum objectives formed as a sum of smooth and non-smooth terms. Existing CGM variants for this template either suffer from slow convergence rates, or require carefully increasing the batch size over the course of the algorithm’s execution, which leads to computing full gradients. In contrast, the proposed method, equipped with a stochastic average gradient (SAG) estimator, requires only one sample per iteration. Nevertheless, it guarantees fast convergence rates on par with more sophisticated variance reduction techniques. In applications we put special emphasis on problems with a large number of separable constraints. Such problems are prevalent among semidefinite programming (SDP) formulations arising in machine learning and theoretical computer science. We provide numerical experiments on matrix completion, unsupervised clustering, and sparsest-cut SDPs.

Authors Gideon Dresdner, Maria-Luiza Vladarean, Gunnar Rätsch, Francesco Locatello, Volkan Cevher, Alp Yurtsever

Submitted Proceedings of The 25th International Conference on Artificial Intelligence and Statistics (AISTATS-22)

Hugo Yèche, Gideon Dresdner, Francesco Locatello, Matthias Hüser, Gunnar Rätsch. Neighborhood Contrastive Learning Applied to Online Patient Monitoring. ICML 2021.

Abstract Intensive care units (ICU) are increasingly looking towards machine learning for methods to provide online monitoring of critically ill patients. In machine learning, online monitoring is often formulated as a supervised learning problem. Recently, contrastive learning approaches have demonstrated promising improvements over competitive supervised benchmarks. These methods rely on well-understood data augmentation techniques developed for image data which do not apply to online monitoring. In this work, we overcome this limitation by supplementing time-series data augmentation techniques with a novel contrastive learning objective which we call neighborhood contrastive learning (NCL). Our objective explicitly groups together contiguous time segments from each patient while maintaining state-specific information. Our experiments demonstrate a marked improvement over existing work applying contrastive methods to medical time-series.

Authors Hugo Yèche, Gideon Dresdner, Francesco Locatello, Matthias Hüser, Gunnar Rätsch

Submitted ICML 2021

Gideon Dresdner, Saurav Shekhar, Fabian Pedregosa, Francesco Locatello, Gunnar Rätsch. Boosting Variational Inference With Locally Adaptive Step-Sizes. International Joint Conference on Artificial Intelligence (IJCAI-21).

Abstract Variational Inference makes a trade-off between the capacity of the variational family and the tractability of finding an approximate posterior distribution. Instead, Boosting Variational Inference allows practitioners to obtain increasingly good posterior approximations by spending more compute. The main obstacle to widespread adoption of Boosting Variational Inference is the amount of resources necessary to improve over a strong Variational Inference baseline. In our work, we trace this limitation back to the global curvature of the KL-divergence. We characterize how the global curvature impacts time and memory consumption, address the problem with the notion of local curvature, and provide a novel approximate backtracking algorithm for estimating local curvature. We give new theoretical convergence rates for our algorithms and provide experimental validation on synthetic and real-world datasets.

Authors Gideon Dresdner, Saurav Shekhar, Fabian Pedregosa, Francesco Locatello, Gunnar Rätsch

Submitted International Joint Conference on Artificial Intelligence (IJCAI-21)

Stefan G Stark, Joanna Ficek-Pascual, Francesco Locatello, Ximena Bonilla, Stéphane Chevrier, Franziska Singer, Tumor Profiler Consortium, Gunnar Rätsch, Kjong-Van Lehmann. SCIM: universal single-cell matching with unpaired feature sets. Bioinformatics.

Abstract Motivation: Recent technological advances have led to an increase in the production and availability of single-cell data. The ability to integrate a set of multi-technology measurements would allow the identification of biologically or clinically meaningful observations through the unification of the perspectives afforded by each technology. In most cases, however, profiling technologies consume the used cells and thus pairwise correspondences between datasets are lost. Due to the sheer size single-cell datasets can acquire, scalable algorithms that are able to universally match single-cell measurements carried out in one cell to its corresponding sibling in another technology are needed. Results: We propose Single-Cell data Integration via Matching (SCIM), a scalable approach to recover such correspondences in two or more technologies. SCIM assumes that cells share a common (low-dimensional) underlying structure and that the underlying cell distribution is approximately constant across technologies. It constructs a technology-invariant latent space using an autoencoder framework with an adversarial objective. Multi-modal datasets are integrated by pairing cells across technologies using a bipartite matching scheme that operates on the low-dimensional latent representations. We evaluate SCIM on a simulated cellular branching process and show that the cell-to-cell matches derived by SCIM reflect the same pseudotime on the simulated dataset. Moreover, we apply our method to two real-world scenarios, a melanoma tumor sample and a human bone marrow sample, where we pair cells from a scRNA dataset to their sibling cells in a CyTOF dataset achieving 90% and 78% cell-matching accuracy for each one of the samples, respectively.

Authors Stefan G Stark, Joanna Ficek-Pascual, Francesco Locatello, Ximena Bonilla, Stéphane Chevrier, Franziska Singer, Tumor Profiler Consortium, Gunnar Rätsch, Kjong-Van Lehmann

Submitted Bioinformatics

Latest Publications