Francesco Locatello, Rajiv Khanna, Joydeep Ghosh, Gunnar Rätsch
Boosting variational inference is a recent approach that can be cast as a convex optimization problem over a function space. We explore this connection and show how to provably solve it using the classical Frank-Wolfe algorithm.
Abstract: Variational inference is a popular technique to approximate a possibly intractable Bayesian posterior with a more tractable one. Recently, boosting variational inference [20, 4] has been proposed as a new paradigm that approximates the posterior with a mixture of densities, greedily adding components to the mixture. However, as is the case with many other variational inference algorithms, its theoretical properties have not been studied. In the present work, we study the convergence properties of this approach from a modern optimization viewpoint by establishing connections to the classic Frank-Wolfe algorithm. Our analysis yields novel theoretical insights regarding the sufficient conditions for convergence, explicit rates, and algorithmic simplifications. Since much of the previous work on variational inference has focused on tractability, our work is a much-needed attempt to bridge the gap between probabilistic models and their corresponding theoretical properties.
(presented at AISTATS)
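To make the Frank-Wolfe connection concrete, below is a minimal sketch of a Frank-Wolfe-style boosting loop that greedily adds components to a mixture. It is not the setup analyzed in the paper: as an assumption for readability, it minimizes a squared L2 distance to a hypothetical target density on a one-dimensional grid rather than the KL divergence, and the linear minimization oracle searches over a fixed dictionary of Gaussian candidates.

```python
import numpy as np

# Discretize the problem on a 1-D grid (a simplification for illustration).
grid = np.linspace(-6.0, 6.0, 1201)
dx = grid[1] - grid[0]

def gaussian(mu, sigma):
    """Normalized Gaussian density evaluated on the grid."""
    return np.exp(-0.5 * ((grid - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

# Hypothetical bimodal target density the mixture should approximate.
target = 0.6 * gaussian(-2.0, 0.7) + 0.4 * gaussian(2.0, 1.0)

# Fixed dictionary of candidate components over which the oracle searches.
candidates = [gaussian(mu, sigma)
              for mu in np.arange(-4.0, 4.5, 0.5)
              for sigma in (0.5, 1.0, 2.0)]

approx = np.zeros_like(grid)          # current mixture q_t (empty at t = 0)
for t in range(25):
    residual = approx - target        # gradient of 0.5 * ||q - target||^2 w.r.t. q
    # Linear minimization oracle: pick the candidate most aligned with the
    # negative gradient, i.e. the greedily chosen next mixture component.
    scores = [np.sum(residual * c) * dx for c in candidates]
    s = candidates[int(np.argmin(scores))]
    # Standard Frank-Wolfe step size; a line search could be used instead.
    gamma = 2.0 / (t + 2.0)
    approx = (1.0 - gamma) * approx + gamma * s

err = np.sqrt(np.sum((approx - target) ** 2) * dx)
print(f"L2 error after {t + 1} boosting steps: {err:.4f}")
```

Because every candidate is a normalized density and each update is a convex combination, the iterate remains a valid mixture density after the first step, mirroring how boosting variational inference builds up the posterior approximation one component at a time.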