An Automated Statistician which learns Bayesian nonparametric models of time series data
Seminar Room 1, Newton Institute
Abstract
I will describe the "Automated Statistician", a project which aims to automate the exploratory analysis and modelling of data. Our approach starts by defining a large space of related probabilistic models via a grammar over models, and then uses Bayesian marginal likelihood computations to search over this space for one or a few good models of the data. The aim is to find models which both have good predictive performance and are somewhat interpretable. Our initial work has focused on learning unknown nonparametric regression functions and on learning models of time series data, both using Gaussian processes. Once a good model has been found, the Automated Statistician generates a natural language summary of the analysis: a 10-15 page report with plots and tables. I will focus in particular on the modelling of time series, including how we handle change points in Gaussian process models. I will also discuss challenges such as how to trade off predictive performance against interpretability, how to translate complex statistical concepts into natural language text that a numerate non-statistician can understand, and how to integrate model checking.
This is joint work with James Lloyd and David Duvenaud (Cambridge) and Roger Grosse and Josh Tenenbaum (MIT).
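The search described in the abstract (a grammar that composes simple models, scored by Bayesian marginal likelihood) can be sketched roughly as follows. This is a minimal illustration and not the project's implementation: the choice of base kernels, the fixed hyperparameters, the noise level, the greedy search strategy, the synthetic data, and all function names (`expand`, `greedy_search`, and so on) are assumptions made here for the example. A real system would also optimise the hyperparameters of each candidate rather than fixing them.

```python
import numpy as np

# Base kernels with fixed, hypothetical hyperparameters (assumption of this sketch).
def se(x, y, ell=1.0):
    d = x[:, None] - y[None, :]
    return np.exp(-0.5 * (d / ell) ** 2)            # squared exponential

def per(x, y, period=1.0, ell=1.0):
    d = x[:, None] - y[None, :]
    return np.exp(-2.0 * np.sin(np.pi * d / period) ** 2 / ell ** 2)  # periodic

def lin(x, y):
    return x[:, None] * y[None, :]                   # linear

BASE = {"SE": se, "PER": per, "LIN": lin}

# Grammar rules: sums and products of valid kernels are valid kernels.
def add(k1, k2):
    return lambda x, y: k1(x, y) + k2(x, y)

def mul(k1, k2):
    return lambda x, y: k1(x, y) * k2(x, y)

def log_marginal_likelihood(kernel, x, y, noise=0.1):
    """log p(y | x, kernel) for a zero-mean GP with Gaussian observation noise."""
    n = len(x)
    K = kernel(x, x) + (noise ** 2 + 1e-8) * np.eye(n)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))   # K^{-1} y
    return (-0.5 * y @ alpha
            - np.sum(np.log(np.diag(L)))                  # equals 0.5 * log|K|
            - 0.5 * n * np.log(2.0 * np.pi))

def expand(name, kernel):
    """Apply each grammar rule once: combine the current kernel with a base kernel."""
    for b_name, b in BASE.items():
        yield f"({name} + {b_name})", add(kernel, b)
        yield f"({name} * {b_name})", mul(kernel, b)

def greedy_search(x, y, depth=2):
    """Start from the best base kernel, then greedily expand while the score improves."""
    best = max(BASE.items(), key=lambda c: log_marginal_likelihood(c[1], x, y))
    best_score = log_marginal_likelihood(best[1], x, y)
    for _ in range(depth):
        scored = [(log_marginal_likelihood(k, x, y), n, k) for n, k in expand(*best)]
        s, n, k = max(scored, key=lambda t: t[0])
        if s <= best_score:
            break
        best, best_score = (n, k), s
    return best[0], best_score

# Synthetic time series: a periodic component plus a linear trend plus noise.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 4.0, 40)
y = np.sin(2.0 * np.pi * x) + 0.5 * x + 0.1 * rng.standard_normal(x.size)

name, score = greedy_search(x, y)
print(name, score)
```

The returned expression string (for example a sum or product of `SE`, `PER`, and `LIN`) is the kind of compositional structure that lends itself to a component-by-component description in natural language, which is one reason to search over a grammar rather than over a single flexible kernel.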