# Seminars (MCM)

Videos and presentation materials from other INI events are also available.

Search seminar archive

Event When Speaker Title Presentation Material
MCMW01 22nd April 2014
09:15 to 10:15
tba
MCMW01 22nd April 2014
10:30 to 11:05
Sequential Quasi-Monte Carlo
Co-author: Mathieu Gerber (Université de Lausanne and CREST)

We develop a new class of algorithms, SQMC (Sequential Quasi-Monte Carlo), as a variant of SMC (Sequential Monte Carlo) based on low-discrepancy points. The complexity of SQMC is O(N\log N), where N is the number of simulations at each iteration, and its error rate is smaller than the Monte Carlo rate O(N^{-1/2}). The only requirement to implement SQMC is the ability to write the simulation of particle x^n_t given x^n_{t-1} as a deterministic function of x^n_{t-1} and uniform variates. We show that SQMC is amenable to the same extensions as standard SMC, such as forward smoothing, backward smoothing, unbiased likelihood evaluation, and so on. In particular, SQMC may replace SMC within a PMCMC (particle Markov chain Monte Carlo) algorithm. We establish several convergence results. We provide numerical evidence in several difficult scenarios than SQMC significantly outperforms SMC in terms of approximation error.

MCMW01 22nd April 2014
11:05 to 11:40
E Moulines On the uniform ergodicity of the particle Gibbs sampler
Co-authors: Randal Douc (Telecom SudParis), Fred Lindsten (Cambridge)

The particle Gibbs sampler is a systematic way of using a particle filter within Markov chain Monte Carlo (MCMC). This results in an off-the-shelf Markov kernel on the space of state trajectories, which can be used to simulate from the full joint smoothing distribution for a state space model in an MCMC scheme. We show that the PG Markov kernel is uniformly ergodic under rather general assumptions, that we will carefully review and discuss. In particular, we provide an explicit rate of convergence which reveals that: (i) for fixed number of data points, the convergence rate can be made arbitrarily good by increasing the number of particles, and (ii) under general mixing assumptions, the convergence rate can be kept constant by increasing the number of particles superlinearly with the number of observations. We illustrate the applicability of our result by studying in detail two common state space models with non-compact state spaces.

MCMW01 22nd April 2014
11:40 to 12:15
Speeding-up Pseudo-marginal MCMC using a surrogate model
The pseudo-marginal MCMC algorithm is a powerful tool for exploring the posterior distribution when only unbiased stochastic estimates of the target density are available. In many situations, although there is no closed-form expression for this density, computationally cheap deterministic estimates are also available; one can then use a delayed-acceptance strategy for exploiting these cheap approximations.

As powerful as they are, the use of such algorithms are difficult in practice: it involves tuning the MCMC proposals and choosing the computational budget that one is willing to invest in the creation of the unbiased estimates while taking into account the quality of the cheap deterministic approximations. In this talk we discuss how high-dimensional asymptotic results can help in the tuning of these delayed-acceptance pseudo-marginal MCMC algorithms.

This is joint work with Chris Sherlock and Andrew Golightly.

MCMW01 22nd April 2014
13:45 to 14:20
M Pitt Efficient implementation of Markov chain Monte Carlo when using an unbiased likelihood estimator
When an unbiased estimator of the likelihood is used within an Metropolis-Hastings scheme, it is necessary to tradeoff the number of samples used to evaluate the likelihood against the computing time. Many samples will result in a scheme which has similar properties to the case where the likelihood is exactly known but will be expensive. Few samples will result in faster estimation but at the expense of slower mixing of the Markov chain. We explore the relationship between the number of samples and the efficiency of the resulting Metropolis-Hastings estimates. Under the assumption that the distribution of the additive noise introduced by the log-likelihood estimator is independent of the point at which this log-likelihood is evaluated and other relatively mild assumptions, we provide guidelines on the number of samples to select for a general Metropolis-Hastings proposal. We illustrate on a complex stochastic volatility model that these assumptions are approximately satisfied experimentally and that the theoretical insights with regards to inefficiency and computational time hold true.

Keywords: Bayesian inference; Estimated likelihood; Metropolis-Hastings; Particle filtering.

MCMW01 22nd April 2014
14:20 to 14:55
S Reich Particle filters for infinite-dimensional systems: combining localization and optimal transportation
Co-author: Yuan Cheng (University of Potsdam)

Particle filters or sequential Monte Carlo methods are powerful tools for adjusting model state to data. However they suffer from the curse of dimensionality and have not yet found wide-spread application in the context of spatio-temporal evolution models. On the other hand, the ensemble Kalman filter with its simple Gaussian approximation has successfully been applied to such models using the concept of localization. Localization allows one to account for a spatial decay of correlation in a filter algorithm. In my talk, I will propose novel particle filter implementations which are suitable for localization and, as the ensemble Kalman filter, fit into the broad class of linear transform filters. In case of a particle filter this transformation will be determined by ideas from optimal transportation while in case of the ensemble Kalman filter one essentially relies on the linear Kalman update formulas. This common framework also allows for a mixture of particle and ensemble Kalman filters. Numerical results will be provided for the Lorenz-96 model which is a crude model for nonlinear advection.

MCMW01 22nd April 2014
14:55 to 15:30
The Filtering Distribution For Partially Observed Chaotic Dynamical Systems
Co-author: Daniel Sanz (University of Warwick)

Many physical systems can be successfully modelled by a deterministic dynamical system for which, however, the initial conditions may contain uncertainty. In the presence of chaos this can lead to undesirable growth of uncertainty over time. However, when noisy observations of the system are present these may be used to compensate for the uncertainty in the initial state. This scenario is naturally modelled by viewing the initial state as given by a probability distribution, and to then condition this probability distribution on the noisy observations, thereby reducing uncertainty. Filtering refers to the situation where the conditional distribution on the system state is updated sequentially, at the time of each observation. In this talk we investigate the asymptotic behaviour of this filtering distribution for large time. We focus on a class of dissipative systems that includes the Lorenz '63 and '96 models, and the Navier-Stokes equations on a 2D torus. We first st udy the behaviour of a variant on the 3DVAR filter, creating a unified analysis which subsumes the existing work in [1,2] which, itself, builds on [3]. The optimality property of the true filtering distribution is then used, when combined with this modified 3DVAR analysis, to provide general conditions on the observation of our wide class of chaotic dissipative systems which ensure that the filtering distributions concentrate around the true state of the underlying system in the long-time asymptotic regime.

[1] C.E.A. Brett, K.F. Lam, K.J.H. Law, D.S. McCormick, M.R. Scott and A.M. Stuart, Accuracy and stability of filters for dissipative PDEs.'' Physica D 245(2013).

[2] K.J.H. Law, A. Shukla and A.M. Stuart, Analysis of the 3DVAR Filter for the Partially Observed Lorenz '63 Model.'' Discrete and Continuous Dynamical Systems A, 34(2014).

[3] K. Hayden, E. Olsen and E.S. Titi, Discrete data assimilation in the Lorenz and 2D Navier-Stokes equations.'' Physica D 240(2011).

MCMW01 22nd April 2014
15:50 to 16:25
Co-authors: Christophe Andrieu (University of Bristol), Arnaud Doucet (University of Oxford)

In various situations of interest, natural implementations of Monte Carlo algorithms such as Markov chain Monte Carlo and sequential Monte Carlo can perform poorly due to uneven performance in different parts of the space in which they operate. For example, in Markov chain Monte Carlo a Markov kernel may behave increasingly poorly in the tails of the target distribution of interest and in sequential Monte Carlo the quality of associated estimates may plummet if too few particles are used at a particular time. We overview a particular strategy, local adaptation, that seeks to overcome some of these phenomena in practice.

MCMW01 22nd April 2014
16:25 to 17:00
Particle filtering subject to interaction constraints
Co-authors: Kari Heine (Bristol), Taylan Cemgil (Bogazici), Anthony Lee (Warwick)

The potential benefits of parallel or distributed implementation motivates study of the interaction structure of particle filters. Can we do away with resampling, or at least re-structure it in such a way as to be more naturally suited to non-serial implementation? What role does resampling really play in endowing these algorithms with attractive properties? This talk will introduce some new algorithms in this context and discuss their theoretical properties.

MCMW01 23rd April 2014
09:15 to 10:15
ABC methods for Bayesian model choice
Approximate Bayesian computation (ABC), also known as likelihood-free methods, have become a standard tool for the analysis of complex models, primarily in population genetics. The development of new ABC methodology is undergoing a rapid increase in the past years, as shown by multiple publications, conferences and even software. While one valid interpretation of ABC based estimation is connected with nonparametrics, the setting is quite different for model choice issues. We examined in Grelaud et al. (2009, Bayesian Analysis) the use of ABC for Bayesian model choice in the specific of Gaussian random fields (GRF), relying on a sufficient property only enjoyed by GRFs to show that the approach was legitimate. Despite having previously suggested the use of ABC for model choice in a wider range of models in the DIYABC software (Cornuet et al., 2008, Bioinformatics), we present in Robert et al. (2011, PNAS) evidence that the general use of ABC for model choice can be a real problem. Finally, in Marin et al. (2014, JRSS B), we derive necessary and sufficient conditions on summary statistics for the corresponding Bayes factor to be convergent, namely to asymptotically select the true model. In this talk, we will present these different results.

Marin, Pillai, Robert and Rousseau (2014) Relevant statistics for Bayesian model choice, to appear in the Journal of the Royal Statistical Society, Series B

Robert, Cornuet, Marin and Pillai (2011) Lack of confidence in approximate Bayesian computation model choice, Proceedings of the National Academy of Science, 108(37), 15112-15117

Grelaud, Robert, Marin, Rodolphe and Taly (2009) ABC likelihood-free methods for model choice in Gibbs random fields, Bayesian Analysis, 4(2), 317-336

Cornuet, Santos, Beaumont, Robert, Marin, Balding, Guillemaud and Estoup (2008) Inferring population history with DIY ABC: a user-friendly approach Approximate Bayesian Computation, Bioinformatics, 24(23), 2713-2719

MCMW01 23rd April 2014
10:30 to 11:05
Speeding up MCMC by Efficient Data Subsampling
Co-authors: Chris carter (University of New South wales ), Eduardo Mendes (University of New South wales )

The computing time for Markov Chain Monte Carlo (MCMC) algorithms can be prohibitively large for datasets with many observations, especially when the data density for each observation is costly to evaluate. We propose a framework based on a Pseudo-marginal MCMC where the likelihood function is unbiasedly estimated from a random subset of the data, resulting in substantially fewer density evaluations. The subsets are selected using efficient sampling schemes, such as Probability Proportional-to-Size (PPS) sampling where the inclusion probability of an observation is proportional to an approximation of its contribution to the likelihood function. We illustrate the method on a large dataset of Swedish firms containing half a million observations.

MCMW01 23rd April 2014
11:05 to 11:40
Establishing some order amongst exact approximation MCMCs
Co-author: Christophe Andrieu (University of Bristol)

Exact approximation Markov chain Monte Carlo (MCMC) algorithms are a general class of algorithms for Bayesian inference in complex models. We discover a general sufficient condition which allows to order two implementations of such algorithms in terms of mean acceptance probability and asymptotic variance. The key condition is convex order between the weight distributions, which emerges naturally when the weight distributins stem from importance sampling approximations with different number of samples.

MCMW01 23rd April 2014
11:40 to 12:15
The Bernoulli Factory, extensions and applications
The celebrated Bernoulli Factory problem can be phrased as following: Given a black box to simulate a p-coin with unknown p, and given known function f, how to design an algorithm for sampling an f(p)-coin? Solving the Bernoulli Factory involves some elegant and enjoyable mathematics and somewhat surprisingly the problem shows up as a major burden in many computational settings, such as executing certain types of MCMC algorithms. I will discuss some of these applications as well as extensions of the Bernoulli Factory to more general random variables.
MCMW01 23rd April 2014
13:45 to 14:20
Sequential Monte Carlo methods for applications in Data Assimilation
Sequential Monte Carlo (SMC) methods are nowadays routinely applied in a variety of complex applications: hidden Markov models, dynamical systems, target tracking, control problems, just to name a few. Whereas SMC have been greatly refined in the last decades and are now much better understood, they are still known to suffer from the curse of dimensionality: algorithms can sometimes break down exponentially fast with the dimension of the state space. As a consequence, practitioners in high-dimensional Data Assimilation applications in atmospheric sciences, oceanography and elsewhere will typically use 3D-Var or Kalman-filter-type approximations that will provide biased estimates in the presence of non-linear model dynamics.

The talk will concentrate on a class of SMC algorithms and will look at ways to reduce the cost of the algorithms as a function of the dimension of the state space. Explicit asymptotic results will clarify the effect of the dimension at the properties of the algorithm and could provide a platform for algorithmic optimisation in high dimensions. Applications will be shown in the context of Data Assimilation, in a problem where the objective is to target the posterior distribution of the initial condition of the Navier-Stokes equation given a Gaussian and noisy observations at different instances and locations of the spatial field. The dimension of the signal is in theory infinite-dimensional - in practice 64x64 or more depending on the resolution – thus posing great challenges for the development and efficiency of SMC methods.

MCMW01 23rd April 2014
14:20 to 14:55
Statistical Methods for Ambulance Fleet Management
We introduce statistical methods to address two problems arising in the management of ambulance fleets: (1) predicting the distribution of ambulance driving time between arbitrary start and end locations in a road network; and (2) space-time forecasting of ambulance demand. These predictions are critical for deciding how many ambulances should be deployed at a given time and where they should be stationed, which ambulance should be dispatched to an emergency, and whether and how to schedule ambulances for non-urgent patient transfers. We demonstrate the accuracy and operational impact of our methods using ambulance data from Toronto Emergency Medical Services.

For driving time prediction the relevant data are Global Positioning System (GPS) recordings from historical lights-and-sirens ambulance trips. Challenges include the typically large size of the road network and dataset (70,000 network links and 160,000 historical trips for Toronto); the lack of trips in the historical data that follow precisely the route of interest; and the strong temporal, weather, and other effects on driving time. We introduce a model of the travel time at the trip level, as well as a computationally efficient procedure for two-stage estimation in that model. For space-time forecasting of demand we develop integer time-series factor models and spatio-temporal mixture models, which capture the complex weekly and daily patterns in demand as well as changes in the spatial demand density over time.

Related Links: •http://people.orie.cornell.edu/woodard/publications.html - Links to Woodard's publications on ambulance fleet management

MCMW01 23rd April 2014
14:55 to 15:30
Adaptive delayed-acceptance pseudo-marginal random walk Metropolis
Co-authors: Andrew Golightly (Newcastle University) and Daniel Henderson (Newcastle University)

Delayed-acceptance (DA) pseudo-marginal Metropolis-Hastings (MH) algorithms can be applied when it is computationally expensive to calculate an unbiased estimate of the true posterior, but a computationally cheap approximation is available. A first accept-reject stage is applied, with the cheap approximation substituted for the true posterior in the MH acceptance ratio. Only for those proposals which pass through the first stage is the computationally expensive true posterior (or unbiased estimate thereof) evaluated, with a second accept-reject stage ensuring that detailed balance is satisfied with respect to the intended posterior. A weighted average of all previous unbiased estimates of the true posterior provides a generic approximation to the true posterior. If only the k-nearest neighbours are used in the average then evaluation of the approximate posterior can be made computationally cheap provided that the points at which the posterior has been estimated unbiasedly are stored in a multi-dimensional binary tree, similar to a KD-tree.

MCMW01 23rd April 2014
15:50 to 16:25
R Douc Identifiability conditions for partially-observed Markov chains
Co-authors: François ROUEFF (LTCI, UMR 5141, Telecom Paristech, France), Tepmony SIM (LTCI, UMR 5141, Telecom Paristech, France)

This paper deals with a parametrized family of partially-observed bivariate Markov chains. We establish that the limit of the normalized log-likelihood is maximized when the parameter belongs to the equivalence class of the true parameter, which is a key feature for obtaining consistency of the Maximum Likelihood Estimators in well-specified models. A novel aspect of this work is that geometric ergodicity of the Markov chain associated to the complete data, or exponential separation on measures are no more needed provided that the invariant distribution is assumed to be unique, regardless its rate of convergence to the equilibrium. The result is presented in a general framework including both fully dominated or partially dominated models as Hidden Markov models or Observation-Driven times series of counts.

MCMW01 23rd April 2014
16:25 to 17:00
Stochastic Gradient Langevin Dynamics for Large Scale Bayesian Inference
The Bayesian approach to statistical machine learning is a theoretically well-motivated framework to learning from data. It provides a coherent framework to reasoning about uncertainties, and an inbuilt protection against overfitting. However, computations in the framework can be expensive, and most approaches to Bayesian computations do not scale well to the big data setting. In this talk we propose a new computational approach for Bayesian learning from large scale datasets based on iterative learning from small mini-batches. By adding the right amount of noise to a standard stochastic gradient optimization algorithm we show that the iterates will converge to samples from the true posterior distribution as we anneal the stepsize. We apply the method to logistic regression and latent Dirichlet allocation, showing state-of-the-art performance.

Joint work with Max Welling and Sam Patterson.

MCMW01 24th April 2014
09:15 to 10:15
C Holmes Robust statistical decisions under model misspecification by re-weighted Monte Carlo samplers
Large complex data sets typically demand approximate models at some level of specification. In such situations it is important for the analyst to examine the robustness of conclusions to approximate predictions. Recent developments in optimal control and econometrics have established formal methods for linear quadratic state space models (see e.g. Hansen and Sargent 2008) by considering the local-minimax outcome within an information divergence (Kullback-Leibler) neighbourhood around the approximating model. Here we show how these approaches can be extended to arbitrary probability models using Monte Carlo methods. We derive theoretical results establishing the uniqueness of the Kullback-Leibler criteria, as well as Bayesian non-parametric methods to sample from the space of probability distributions within a fixed divergence constraint.
MCMW01 24th April 2014
10:30 to 11:05
Particle islands and archipelagos: some large sample theory
Co-authors: Christelle Vergé, Pierre Del Moral, and Eric Moulines

This talk discusses parallelisation of sequential Monte Carlo methods via the particle island framework (Vergé et al., 2013) and presents some novel convergence results for methods of this sort. More specifically, we introduce the concept of weighted archipelagos (i.e. sets of weighted particle islands, where each island is itself a weighted sample of particles) and define three different operations on such archipelagos, namely: selection on the island level, selection on the particle level, and mutation. We then establish that these operations preserve a set of convergence properties, including asymptotic normality, of the archipelago as the number of islands as well as the number of particles of each island tend jointly to infinity. Moreover, we provide recursive formulas for the asymptotic variance associated with each operation. As our results allow arbitrary compositions of the mentioned operations to be analysed, we may use the same for establishing the convergence properties of not only the double bootstrap algorithm but also generalisations of this algorithm.

MCMW01 24th April 2014
11:05 to 11:40
Bayesian inference for sparsely observed diffusions
Co-authors: Chris Sherlock (Lancaster University)

We consider Bayesian inference for parameters governing nonlinear multivariate diffusion processes using data that may be incomplete, subject to measurement error and observed sparsely in time. We adopt a high frequency imputation approach to inference, by introducing additional time points between observations and working with the Euler-Maruyama approximation over the induced discretisation. We assume that interest lies in the marginal parameter posterior and sample this target via particle MCMC. A vanilla implementation based on a bootstrap filter is eschewed in favour of an auxiliary particle filter where the latent path is extended by sampling a discretisation of a conditioned diffusion. This conditioned diffusion should be carefully constructed to allow for nonlinear dynamics between observations.

MCMW01 24th April 2014
11:40 to 12:15
Particle filters and curse of dimensionality
Co-author: Ramon van Handel (Princeton University)

A problem that arises in many applications is to compute the conditional distributions of stochastic models given observed data. While exact computations are rarely possible, particle filtering algorithms have proved to be very useful for approximating such conditional distributions. Unfortunately, the approximation error of particle filters grows exponentially with dimension, a phenomenon known as curse of dimensionality. This fact has rendered particle filters of limited use in complex data assimilation problems that arise, for example, in weather forecasting. In this talk I will argue that it is possible to develop “local” particle filtering algorithms whose approximation error is dimension-free. By exploiting conditional decay of correlations properties of high-dimensional models, we prove for the simplest possible algorithm of this type an error bound that is uniform both in time and in the model dimension. (Joint work with R. van Handel)

MCMW01 24th April 2014
13:45 to 14:20
Pre-Processing for Approximate Bayesian Computation in Image Analysis
Co-authors: Matthew Moores (QUT, Australia), Christian Robert (U. Paris Dauphine, France)

Existing algorithms for approximate Bayesian computation (ABC) assume that it is feasible to simulate pseudo-data from the model at each iteration. However, the computational cost of these simulations can be prohibitive for high dimensional data. An important example is the Potts model, which is commonly used in image analysis. The dimension of the state vector in this model is equal to the size of the data, which can be millions of pixels. We introduce a preparatory computation step before model fitting to improve the scalability of ABC. The output of this precomputation can be reused across multiple datasets. We illustrate this method by estimating the smoothing parameter for satellite images, demonstrating that the pre-computation step can sufficiently reduce the average runtime required for model fitting to enable analysis in realistic, if not yet real, time.

MCMW01 24th April 2014
14:20 to 14:55
D Prangle Lazy ABC
In approximate Bayesian computation (ABC) algorithms, parameter proposals are accepted if corresponding simulated datasets are sufficiently close to the observations. Producing the large quantity of model simulations needed requires considerable computer time. However, it is often clear early on in a simulation that it is unlikely to produce a close match. This talk is on an ABC algorithm which saves time by abandoning such simulations early. A probabilistic stopping rule is used which leaves the target distribution unchanged from that of standard ABC. Applications of this idea beyond ABC are also discussed.
MCMW01 24th April 2014
14:55 to 15:30
Parallel Markov Chain Monte Carlo
Co-author: Doug VanDerwerken (Duke University)

Markov chain Monte Carlo is an inherently serial algorithm. Although the likelihood calculations for individual steps can sometimes be parallelized, the serial evolution of the process is widely viewed as incompatible with parallization, offering no speedup for sampling algorithms which require large numbers of iterations to converge to equilibrium. We provide a methodology for parallelizing Markov chain Monte Carlo across large numbers of independent, asynchronous processors. The method is originally motivated by sampling multimodal target distributions, where we see an exponential speed-up in running time. However we show that the approach is general purpose and applicable to all Markov chain Monte Carlo simulations, and demonstrate speed-ups proportional to the number of available processors on slowly mixing chains with unimodal target distributions. The approach is simple and easy to implement, and suggests additional directions for further research.

MCMW01 24th April 2014
15:50 to 16:25
S Vollmer Consistency and CLTs for stochastic gradient Langevin dynamics based on subsampled data
Co-authors: Alexandre Thiery (National University of Singapore), Yee-Whye Teh (University of Oxford)

Applying MCMC to large data sets is expensive. Both calculating the acceptance probability and creating informed proposals depending on the likelihood require an iteration through the whole data set. The recently proposed Stochastic Gradient Langevin Dynamics (SGLD) circumvents this problem by generating proposals based on only a subset of the data and skipping the accept-reject step. In order to heuristically justify the latter, the step size converges to zero in a non-summable way.

Under appropriate Lyapunov conditions, we provide a rigorous foundation for this algorithm by showing consistency of the weighted sample average and proving a CLT for it. Surprisingly, the fraction of the data subset selection does not have an influence on the asymptotic variance.

MCMW01 24th April 2014
16:25 to 17:00
Optimal filtering and the dual process
Co-author: Matteo Ruggiero (Turin)

We link optimal filtering for hiddenMarkov models to the notion of duality forMarkov processes.We show that when the signal is dual to a process that has two components, one deterministic and one a pure death process, and with respect to functions that define changes of measure conjugate to the emission density, the filtering distributions evolve in the family of finite mixtures of such measures and the filter can be computed at a cost that is polynomial in the number of observations. Special cases of our framework include the Kalman filter, and computable filters for the Cox–Ingersoll–Ross process and the one-dimensional Wright– Fisher process, which have been investigated before in the literature. The dual we obtain for the Cox– Ingersoll–Ross process appears to be new in the literature.

MCMW01 25th April 2014
09:50 to 10:25
M Stumpf Approximate Bayesian Inference for Stochastic Processes
Co-authors: Paul Kirk (Imperial College London), Angelique Ale (Imperial College London), Ann Babtie (Imperial College London), Sarah Filippi (Imperial College London), Eszter Lakatos (Imperial College London), Daniel Silk (Imperial College London), Thomas Thorn (University of Edinburgh)

We consider approximate Bayesian computation (ABC) approaches to model the dynamics and evolution of molecular networks. Initially conceived to cope with problems with intractable likelihoods, ABC has gained popularity over the past decade. But there are still considerable problems in applying ABC to real-world problems, some of which are shared with exact Bayesian inference, but some are due to the nature of ABC. Here we will present some recent advances that allow us to apply ABC sequential Monte Carlo (SMC) to real biological problems. The rate of convergence of ABC-SMC depends crucially on the schedule of thresholds, ?t, t=1,2,…,T, and the perturbation kernels used to generate proposals from the previous population of parameters. We show how both of these can be tuned individually, and jointly. Careful calibration of the ABC-SMC approach can result in a 10-fold reduction in the computational burden (or more). I will also provide an overview of an alternative approach where, rather than approximating the likelihood in an ABC framework, we provide approximations to the master equation that describes the evolution of the stochastic system, that go beyond the conventional linear noise approximation (LNA). This allows us to tackle systems with interesting dynamics", that are typically beyond the scope of the LNA, and we will show how to use such approaches in exact Bayesian inference procedures (including nested sampling and SMC).

MCMW01 25th April 2014
10:40 to 11:15
Sequential Monte Carlo methods for graphical models
Co-authors: Christian A. Naesseth (Linkoping University), Fredrik Lindsten (University of Cambridge)

We develop a sequential Monte Carlo (SMC) algorithm for inference in general probabilistic graphical model. Via a sequential decomposition of the PGM we find a sequence of auxiliary distributions defined on a monotonically increasing sequence of probability spaces. By targeting these auxiliary distributions using purpose built SMC samplers we are able to approximate the full joint distribution defined by the graphical model. Our SMC sampler also provides an unbiased estimate of the partition function (normalization constant) and we show how it can be used within a particle Markov chain Monte Carlo framework. This allows for better approximations of the marginals and for unknown parameters to be estimated. The proposed inference algorithms can deal with an arbitrary graph structure and the domain of the random variables in the graph can be discrete or continuous.

MCMW01 25th April 2014
11:15 to 11:50
Sequential Monte Carlo with Highly Informative Observations
Co-author: Pierre Del Moral (University of New South Wales)

We introduce a sequential Monte Carlo (SMC) method for sampling the state of continuous-time state-space models when observations are highly informative, a situation in which standard SMC methods can perform poorly. The most extreme case is where the observations are exact---of the state itself---and the problem is that of simulating diffusion bridges between given starting and ending states. The basic idea is to introduce a sequence of intermediate weighting and resampling steps between observation times, guiding particles towards the ending state. A few designs that have been useful in practice are given, and demonstrated on some applied problems that feature complex models.

MCMW01 25th April 2014
11:50 to 12:25
Generalised Particle Filters with Gaussian Mixtures
Stochastic filtering is defined as the estimation of a partially observed dynamical system. A massive scientific and computational effort has been dedicated to the development of numerical methods for approximating the solution of the filtering problem. Approximating with Gaussian mixtures has been very popular since the 1970s, however the existing work is only based on the success of the numerical implementation and is not theoretically justified.

We fill this gap and conduct a rigorous analysis of a new Gaussian mixture approximation to the solution of the filtering problem. In particular, we construct the corresponding approximating algorithm, deduce the L2-convergence rate and prove a central limit type theorem for the approximating system. In addition, we show a numerical example to illustrate some features of this algorithm. This is joint work with Dan Crisan (Imperial College London).

References: [1] D. Crisan, K. Li, “A central limit type theorem for Gaussian mixture approximations to the nonlinear filtering problem”, ArXiv1401:6592, (2014).

[2] D. Crisan, K. Li, “Generalised particle filters with Gaussian mixtures”, accepted by Stochastic Processes and their Applications, ArXiv1306:0255, (2013).

[3] D. Crisan, K. Li, “Generalised particle filters with Gaussian measures”, Proceedings of 19th European Signal Processing Conference, Barcelona, Spain, pp. 659-663, (2011).

MCMW01 25th April 2014
14:00 to 14:45
T Sharia Truncated stochastic approximation with moving bounds
A wide class of truncated stochastic approximation procedures with moving random bounds will be discussed. This class of procedures has three main characteristics: truncations with random moving bounds, a matrix-valued random step-size sequence, and a dynamically changing random regression function. While we believe that the proposed class of procedures will find its way to a wider range of applications, the main motivation is to accommodate applications to parametric statistical estimation theory. The proposed method allows for incorporation of auxiliary information into the estimation process, and is consistent and asymptotically efficient under certain regularity conditions.

MCMW01 25th April 2014
14:45 to 15:15
Asymptotic Properties of Recursive Maximum Likelihood Estimation in State-Space Models
Co-author: Arnaud Doucet (University of Oxford)

Recursive maximum likelihood algorithm for state-space models (i.e., for continuous state hidden Markov models) is an iterative estimation method based on particle filter and stochastic gradient search. In this talk, resent results on its asymptotic properties are presented. These results are focused on the asymptotic bias and the asymptotic variance. They also involve diffusion approximation, almost-sure and mean-square convergence of the recursive maximum likelihood algorithm. Some auxiliary (yet, rather interesting) results on the asymptotic properties of the particle filter and the log-likelihood are presented in the talk, too.

MCM 30th April 2014
11:30 to 12:30
A nested particle filter for online Bayesian parameter estimation in state-space systems
We address the problem of approximating the probability measure of the fixed parameters of a state-space dynamic system using a sequential Monte Carlo method (SMC). The proposed approach relies on a nested structure that employs two layers of particle filters to approximate the posterior probability law of the static parameters and the dynamic variables of the system of interest, in the vein of the recent SMC^2 algorithm. However, different from the SMC^2 scheme, the proposed algorithm operates in a purely recursive manner and the scheme for the rejuvenation of the particles in the parameter space is simpler. We show analytical results on the approximation of integrals of real bounded functions with respect to the posterior distribution of the system parameters computed via the proposed scheme. For a finite time horizon and under mild assumptions, we prove that the approximation errors vanish with the usual 1/?N rate, where N is the number of particles in the parameter space. Under a set of stronger assumptions related to (i) the stability of the optimal filter for the model, (ii) the compactness of the parameter and state spaces and (iii) certain bounds on the family of likelihood functions, we prove that the convergence of the approximation errors is uniform over time, and provide an explicit rate function. The uniform convergence result has some relevant consequences. One of them is that the proposed scheme can asymptotically identify the parameter values for a class of state-space models. A subset of the assumptions that yield uniform convergence also lead to a positive lower bound, uniform over time and the number of particles, on the normalized effective sample size the filter. We conclude with a simple numerical example that illustrates some of the theoretical findings
MCM 12th May 2014
14:00 to 15:00
Stochastic filtering - a brief historical account
MCM 13th May 2014
10:00 to 11:00
Bayesian Uncertainty Quantification for Differential Equations
This talk will make a case for expansion of the role of Bayesian statistical inference when formally quantifying uncertainty in computer models defined as systems of ordinary or partial differential equations by adopting the perspective that implicitly defined infinite dimensional functions representing model states are objects to be inferred probabilistically. I describe a general methodology for the probabilistic "integration" of differential equations via model based updating of a joint prior measure on the space of functions, their temporal and spatial derivatives. This results in a measure over functions reflecting how well they satisfy the system of differential equations and corresponding initial and boundary values. This measure can be naturally incorporated within the Kennedy and O'Hagan framework for uncertainty quantification and provides a fully Bayesian approach to model calibration and predictive analysis. By taking this probabilistic viewpoint, the full force of Bayesian inference can be exploited when seeking to coherently quantify and propagate epistemic uncertainty in computer models of complex natural and physical systems. A broad variety of examples are provided to illustrate the potential of this framework for characterising discretization uncertainty, including initial value, delay, and boundary value differential equations, as well as partial differential equations. I will also demonstrate the methodology on a large scale system, by modeling discretization uncertainty in the solution of the Navier-Stokes equations of fluid flow, reduced to over 16,000 coupled and stiff ordinary differential equations.
MCM 13th May 2014
14:00 to 15:00
Zero-Variance Hamiltonian MCMC
Interest is in evaluating, by Markov chain Monte Carlo (MCMC) simulation, the expected value of a function with respect to a, possibly unnormalized, probability distribution. A general purpose variance reduction technique for the MCMC estimator, based on the zero-variance principle introduced in the physics literature, is proposed. The main idea is to construct control variates based on the score function. Conditions for asymptotic unbiasedness of the zero-variance estimator are derived. A central limit theorem is also proved under regularity conditions. The potential of the zero-variance strategy is illustrated with real applications to probit, logit and GARCH Bayesian models. The Zero-Variance principle is efficiently combined with Hamiltonian Monte Carlo and Metropolis adjusted Langevin algorithms without exceeding the computational requirements since its main ingredient (namely the score function) is exploited twice: once to guide the Markov chain towards relevant portion of the state space via a clever proposal, that exploits the geometry of the target and achieves convergence in fewer iterations, and then to post-process the simulated path of the chain to reduce the variance of the resulting estimators.
MCM 15th May 2014
11:00 to 12:00
The MOP - Particles without Resampling
Under certain assumptions, it is demonstrated that it is possible to derive a particle filter for arbitrarily high dimensions where the weights on the particles are all equal and there is therefore never any need for resampling. Relaxing these assumptions (eg by using ideas from "particle flow" implemented using piecewise linear approximations to a non-linear function) results in particle filters that only require very infrequent resampling. Results are demonstrated on non-linear non-Gaussian problems.
MCM 15th May 2014
13:30 to 14:30
J Kominiarczuk Acyclic Monte Carlo: where sequential Monte Carlo meets renormalisation
AMC is a method of constructing the sequence of probability densities for SMC using ideas from renormalization, which can be seen as SMC done in the reverse.