Francesco Locatello, MSc ETH in Computer Science

“R2D2, you know better than to trust a strange computer!” (C-3PO)

PhD Student

E-Mail
locatelf@get-your-addresses-elsewhere.ethz.ch
Phone
+41 44 632 23 74
Address
ETH Zürich
Department of Computer Science
Biomedical Informatics Group
Universitätsstrasse 6
CAB F52.2
8092 Zürich
Room
CAB F52.2

My research focuses on constrained greedy optimization and boosting.

I graduated cum laude from the University of Padua in Information Engineering and then joined ETH Zürich for my Master's degree in Computer Science. During that time I became passionate about machine learning, and I am now part of the Max Planck ETH Center for Learning Systems, supervised by Gunnar Rätsch and Bernhard Schölkopf. My main research interest is optimization for machine learning.

Abstract Variational Inference is a popular technique to approximate a possibly intractable Bayesian posterior with a more tractable one. Recently, Boosting Variational Inference has been proposed as a new paradigm that approximates the posterior with a mixture of densities, greedily adding components to the mixture. In the present work, we study the convergence properties of this approach from a modern optimization viewpoint by establishing connections to the classic Frank-Wolfe algorithm. Our analysis yields novel theoretical insights into Boosting Variational Inference regarding the sufficient conditions for convergence, explicit sublinear/linear rates, and algorithmic simplifications.
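
For intuition, here is a minimal sketch of the Frank-Wolfe template this analysis connects to (generic Python, not the paper's implementation; grad_f, lmo, and the simplex toy problem are illustrative placeholders):

    import numpy as np

    def frank_wolfe(grad_f, lmo, x0, n_iters=200):
        # Classic Frank-Wolfe: one greedy atom per iteration, combined convexly,
        # so the iterate always remains a finite mixture of atoms.
        x = x0
        for t in range(n_iters):
            s = lmo(grad_f(x))           # linear minimization oracle over the atom set
            gamma = 2.0 / (t + 2.0)      # standard step size, gives the O(1/t) rate
            x = (1.0 - gamma) * x + gamma * s
        return x

    # Toy usage: project b onto the probability simplex (atoms = unit vectors).
    b = np.array([0.1, 0.7, 0.5])
    grad_f = lambda x: x - b                          # gradient of 0.5 * ||x - b||^2
    lmo = lambda g: np.eye(len(g))[np.argmin(g)]      # best simplex vertex for gradient g
    x_star = frank_wolfe(grad_f, lmo, x0=np.ones(3) / 3)

In Boosting Variational Inference the atoms correspond to densities and the convex coefficients to mixture weights; the sketch above only mirrors the optimization template.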

Authors Francesco Locatello, Rajiv Khanna, Joydeep Ghosh, Gunnar Rätsch

Submitted under review

Link DOI

Abstract Greedy optimization methods such as Matching Pursuit (MP) and Frank-Wolfe (FW) algorithms regained popularity in recent years due to their simplicity, effectiveness, and theoretical guarantees. MP and FW address optimization over the linear span and the convex hull of a set of atoms, respectively. In this paper, we consider the intermediate case of optimization over the convex cone, parametrized as the conic hull of a generic atom set, leading to the first principled definitions of non-negative MP algorithms for which we give explicit convergence rates and demonstrate excellent empirical performance. In particular, we derive sublinear (O(1/t)) convergence on general smooth and convex objectives, and linear convergence (O(e^{-t})) on strongly convex objectives, in both cases for general sets of atoms. Furthermore, we establish a clear correspondence of our algorithms to known algorithms from the MP and FW literature. Our novel algorithms and analyses target general atom sets and general objective functions, and hence are directly applicable to a large variety of learning settings.
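
As a rough illustration, a basic non-negative matching pursuit step for least squares can be sketched as follows (one simple variant in Python, not the corrective algorithms analyzed in the paper; A and y are placeholder data):

    import numpy as np

    def non_negative_mp(A, y, n_iters=100):
        # Greedy pursuit over the conic hull of the columns of A:
        # only non-negative weight updates, so A @ c stays in the cone.
        c = np.zeros(A.shape[1])
        for _ in range(n_iters):
            r = y - A @ c                 # residual = negative gradient of 0.5*||y - A c||^2
            scores = A.T @ r              # alignment of each atom with the residual
            i = int(np.argmax(scores))
            if scores[i] <= 1e-12:        # no atom makes progress under the sign constraint
                break
            c[i] += scores[i] / (A[:, i] @ A[:, i])  # exact line search along atom i (step > 0)
        return c

    # Placeholder usage with random data.
    rng = np.random.default_rng(0)
    A = rng.standard_normal((20, 50))
    y = rng.standard_normal(20)
    coeffs = non_negative_mp(A, y)

Starting from c = 0 and adding only positive increments keeps the coefficients non-negative throughout, which is what distinguishes this setting from plain MP over the linear span.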

Authors Francesco Locatello, Michael Tschannen, Gunnar Rätsch, Martin Jaggi

Submitted NIPS 2017

Link DOI

Abstract Two of the most fundamental prototypes of greedy optimization are the matching pursuit and Frank-Wolfe algorithms. In this paper, we take a unified view on both classes of methods, leading to the first explicit convergence rates of matching pursuit methods in an optimization sense, for general sets of atoms. We derive sublinear (O(1/t)) convergence for both classes on general smooth objectives, and linear convergence on strongly convex objectives, as well as a clear correspondence of algorithm variants. Our presented algorithms and rates are affine invariant, and do not need any incoherence or sparsity assumptions.
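
To make the contrast with Frank-Wolfe concrete, a generalized matching pursuit update can be sketched like this (Python; the atom selection and the step size from the L-smoothness bound follow the usual template and are illustrative, not the paper's exact pseudocode):

    import numpy as np

    def generalized_mp(grad_f, atoms, x0, L, n_iters=100):
        # Matching pursuit over the linear span of a finite atom set (rows of `atoms`):
        # unlike Frank-Wolfe's convex combination, the step along the atom is unconstrained.
        x = x0
        for _ in range(n_iters):
            g = grad_f(x)
            scores = atoms @ g
            z = atoms[np.argmax(np.abs(scores))]   # atom most aligned with +/- the gradient
            gamma = -(g @ z) / (L * (z @ z))       # minimizes the L-smooth quadratic upper bound
            x = x + gamma * z                      # additive update over the span
        return x

    # Placeholder usage: minimize 0.5*||x - b||^2 (L = 1) over the span of random atoms.
    rng = np.random.default_rng(1)
    atoms = rng.standard_normal((10, 5))           # 10 atoms in R^5
    b = rng.standard_normal(5)
    x_hat = generalized_mp(lambda x: x - b, atoms, x0=np.zeros(5), L=1.0)

The only structural difference from the Frank-Wolfe sketch above is the update rule: an additive step over the span instead of a convex combination over the hull.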

Authors Francesco Locatello, Rajiv Khanna, Michael Tschannen, Martin Jaggi

Submitted AISTATS 2017

Link DOI
