MCMC provides a powerful set of tools for inference in Bayesian models. However, for many applications to large scale problems, MCMC methods can be relatively inefficient compared to new deterministic approximations developed in the machine learning community. I will describe several modern and generally applicable deterministic algorithms for approximate inference, and mention their advantages and disadvantages compared to MCMC. To focus the talk, I will describe all algorithms in the context of inference in Dirichlet process mixtures (DPMs), a classical non-parametric Bayesian statistical model used to define mixture models with countably infinitely many components. In particular, I will cover the following algorithms for inference in DPMs: (1) the traditional Gibbs sampling MCMC algorithm, (2) Variational Bayesian (VB) approximations, (3) the Expectation Propagation (EP) algorithm, and (4) a new approximate inference method for DPMs based on Bayesian hierarchical clustering (BHC). All these algorithms provide different speed / accuracy tradeoffs for large-scale problems, and the underlying concepts can be applied to virtually any statistical model. My conclusion is the following: MCMC is but one of many ways of approaching intractable inference problems, and modern statistics is likely to benefit by broadening the toolbox to include novel inference methods arising from other communities.
Joint work with: Matthew Beal (U Buffalo), Katherine Heller (UCL), and Tom Minka (Microsoft).
- http://learning.eng.cam.ac.uk/zoubin/ - Zoubin Ghahramani's website