Recent Advances in Monte Carlo Based Inference
Monday 30th October 2006 to Friday 3rd November 2006
08:30 to 10:00  Registration  INI 1  
10:00 to 10:05 
D Wallace ([Newton Institute]) Welcome address 
INI 1  
10:05 to 11:00 
S Tavaré (University of Cambridge) Now you know your ABCs: examples and problems 
INI 1  , 
11:00 to 11:30  Coffee  
11:30 to 12:30 
T Pettitt ([Queensland]) From doubly intractable distributions via auxiliary variables to likelihood free inference 
INI 1  , 
12:30 to 13:30  Lunch at Churchill College  
14:00 to 15:15 
D Balding ([Imperial]) Some developments of ABC Approximate Bayesian computation based on parameter sets simulated under the prior that are accepted or rejected according to whether a simulated dataset resembles the observed data, has become a widelyused tool in population genomic studies since Pritchard et al (1999), and its use is growing in other areas. Developments of the basic idea have involved regression adjustment of the accepted values to mitigate the effects of discrepancies between simulated and observed datasets (Beaumont et al 2002) and embedding the approximation within a MetropolisHastings algorithm to create "likelihoodfree" MCMC. We review these and more recent developments, for example based on sequential Monte Carlo and various adaptive simulation schemes. 
INI 1  , 
15:15 to 15:45  Tea  
15:45 to 17:00 
A Frigessi ([Norwegian Computing Center]) Estimating functions in indirect inference There are models for which the evaluation of the likelihood is infeasible in practice. For these models the MetropolisHastings acceptance probability cannot be easily computed. This is the case, for instance, when only departure times from a G/G/1 queue are observed and inference on the arrival and service distributions are required. Indirect inference is a method to estimate a parameter theta in models whose likelihood function does not have an analytical closed form, but from which random samples can be drawn for fixed values of theta. First an auxiliary model is chosen whose parameter beta can be directly estimated. Next, the parameters in the auxiliary model are estimated for the original data, leading to an estimate betahat. The parameter beta is also estimated by using several sampled data sets, simulated from the original model for different values of the original parameter theta. Finally, the parameter theta which leads to the best match to betahat is chosen as the indirect inference estimate. We analyse which properties an auxiliary model should have to give satisfactory indirect inference. We look at the situation where the data are summarized in a vector statistic T, and the auxiliary model is chosen so that inference on â is drawn from T only. Under appropriate assumptions the asymptotic covariance matrix of the indirect estimators is proportional to the asymptotic covariance matrix of T and componentwise inversely proportional to the square of the derivative, with respect to theta, of the expected value of T. We discuss how these results can be used in selecting good estimating functions. We apply our findings to the queuing problem. Related Links 
INI 1  , 
17:00 to 18:00  Welcome Wine Reception  
18:15 to 18:45  Dinner at New Hall (Residents Only) 
09:00 to 10:00 
C Holmes (University of Oxford) Populationbased MC for sampling transdimensional Bayesian regression models 
INI 1  
10:00 to 11:00 
A Doucet ([British Columbia]) Sequentially interacting Markov Chain Monte Carlo We introduce a methodology to sample from a sequence of probability distributions and estimate their unknown normalizing constants. This problem is traditionally addressed using Sequential Monte Carlo (SMC) methods which rely on importance sampling/resampling ideas. We design here an alternative iterative algorithm. This algorithm is based on a sequence of interacting Markov chain Monte Carlo (MCMC) algorithms. We establish the convergence of this nonMarkovian scheme and demonstrate this methodology on various examples arising in Bayesian inference. (This is a joint work with Anthony E. Brockwell, Department of Statistics, Carnegie Mellon) 
INI 1  
11:00 to 11:30  Coffee  
11:30 to 12:30 
O Cappe ([ENSTTSI]) Adaptive population Monte Carlo In (Cappé et al., 2004) we proposed a simulation scheme termed Population Monte Carlo which may be viewed as an Iterated Importance Sampling approach, with resampling and Markovian instrumental simulations. This scheme also is a particular case of the Sequential Monte Carlo Sampling approach of (Delmoral et al., 2006). I will discuss the Population Monte Carlo approach, focussing on the case where the target distribution is held fixed and the importance kernels are adapted during the iterations in order to optimize a performance criterion. In the case where the importance kernel is composed of a mixture of fixed kernels, the mixture weights can be adapted using update rules which are remarkably stable and have interesting connections with the ExpectationMaximization algorithm. This talk is based on work done with (or by) several of my colleague in Paris  Arnaud Guillin, JeanMichel Marin, Christian P. Robert (Cérémade) and Randal Douc (École Polytechnique)  as well as on ideas related to the ECOSSTAT project. 
INI 1  , 
12:30 to 13:30  Lunch at Churchill College  
14:00 to 15:15 
Z Ghahramani (University of Cambridge) Deterministic alternatives to MCMC MCMC provides a powerful set of tools for inference in Bayesian models. However, for many applications to large scale problems, MCMC methods can be relatively inefficient compared to new deterministic approximations developed in the machine learning community. I will describe several modern and generally applicable deterministic algorithms for approximate inference, and mention their advantages and disadvantages compared to MCMC. To focus the talk, I will describe all algorithms in the context of inference in Dirichlet process mixtures (DPMs), a classical nonparametric Bayesian statistical model used to define mixture models with countably infinitely many components. In particular, I will cover the following algorithms for inference in DPMs: (1) the traditional Gibbs sampling MCMC algorithm, (2) Variational Bayesian (VB) approximations, (3) the Expectation Propagation (EP) algorithm, and (4) a new approximate inference method for DPMs based on Bayesian hierarchical clustering (BHC). All these algorithms provide different speed / accuracy tradeoffs for largescale problems, and the underlying concepts can be applied to virtually any statistical model. My conclusion is the following: MCMC is but one of many ways of approaching intractable inference problems, and modern statistics is likely to benefit by broadening the toolbox to include novel inference methods arising from other communities. Joint work with: Matthew Beal (U Buffalo), Katherine Heller (UCL), and Tom Minka (Microsoft). Related Links

INI 1  , 
15:15 to 15:45  Tea  
15:45 to 17:00  Poster session  
18:15 to 18:45  Dinner at New Hall (Residents Only) 
09:00 to 10:00 
Y Iba ([ISM, Tokyo]) Applications of extended ensemble Monte Carlo "Extended Ensemble Monte Carlo" is proposed as a generic term which indicates methods such as Parallel Tempering and Multicanonical Monte Carlo. These methods sample a composition or extension of given distributions, and provide a simple and practical way to attack slow mixing and ergodicity problems. The purpose of the talk is not to present a novel algorithm, but explore variations and applications of the algorithms, including their use in statistical physics, combinatorics, and rare event sampling. It is stressed that overcoming slow mixing is a key for the extension of application fields of MCMC. 
INI 1  , 
10:00 to 11:00 
D Leslie ([Bristol]) Sequential Monte Carlo for Generalized Linear Mixed Models Sequential Monte Carlo methods for static problems combine the advantages of importance sampling and Markov chain based methods. We demonstrate how to use these exciting new techniques to fit generalised linear mixed models. A normal approximation to the likelihood is used to generate an initial sample, then transition kernels, reweighting and resampling result in evolution to a sample from the full posterior distribution. Since the technique does not rely on any ergodicity properties of the transition kernels, we can modify these kernels adaptively, resulting in a more efficient sampler. 
INI 1  , 
11:00 to 11:30  Coffee  
11:30 to 12:30 
T Johnson ([Edinburgh]) A sequential importance sampler for reconstructing genetic pedigrees 
INI 1  , 
12:30 to 13:30  Lunch at Churchill College  
14:00 to 15:15 
G Roberts ([Lancaster]) Retrospective sampling Fuelled by the proliferation of large and complex data sets, statistical modelling has become increasingly ambitious. In fact, the use of models parameterised by an infinite number of parameters or latent variables is increasingly common. This is particularly appealing when models are most naturally formulated within an infinite dimensional setting, for instance continuous timeseries. It is wellunderstood that to obtain widely applicable statistical methodology for flexible families of complex models, requires powerful computational techniques (such as MCMC, SMC, etc). Unfortunately, infinite dimensional simulation is not usually feasible, so that the use of these computational methods for infinitedimensional models is often only possible by adopting some kind of finite dimensional approximation to the chosen model. This is unappealing since the impact of the approximation is often difficult to assess, and the procedure often involves essentially using an approximate finitedimensional model. The talk will discuss a general technique for simulation which attempts to work directly with infinite dimensional random variables without truncation nor approximation. The talk is illustrated by concrete examples from the simulation of diffusion processes and Bayesian inference for Dirichlet mixture models. One surprising feature of the methodology is that exact infinitedimensional algorithms are commonly far more efficient than approximate models 
INI 1  
15:15 to 15:45  Tea  
15:45 to 17:00 
C Andrieu ([Bristol]) The expected auxiliary variable method for Monte Carlo simulation The expected auxiliary variable method is a general framework for Monte Carlo simulation in situations where quantities of interest are untractable and prevent the implementation of classical methods. The method finds application in situations where marginal computations are of interest, transdimensional move design is difficult in model selection setups, when the normalising constant of a particular distribution is unknown but required for exact computations. I will present several examples of applications of this principle as well as some theoretical results that we have recently obtained in some specific scenarios. 
INI 1  
18:15 to 18:45  Dinner at New Hall (Residents Only) 
09:00 to 10:00 
N Chopin ([Bristol]) Extensions of the CE method for statistical analysis The CrossEntropy method is a new Monte Carlo paradigm pioneered by Rubinstein (1999) in Operation Research. Its primary applications are (i) the calculation of probability of rare events, and (ii) the optimisation of irregular, multimodal functions. While these two objectives seem to have a little in common, the CE approach manages to express them in a similar framework. In this talk, we will explain how Statistics can benefit from the CE method, and how the CE method can also benefit in turn from Statistical methodology. We will discuss the following particular applications in Statistics: MonteCarlo pvalues, simulation of truncated distributions, variable selection, and mixture estimation. We will see that in each case CE provides significant improvements over current methods. Interestingly, we will see also vanilla CE rarely works directly, but tandard tools from Statistical Inference allow for developing more efficient algorithms. In particular, we will discuss a CEEM algorithm for mixture estimation, which outperform any straight CE or EM algorithm, in terms, of finding higher modes of the likelihood function. 
INI 1  , 
10:00 to 11:00 
D Mackay (University of Cambridge) Nested sampling Nested sampling is a new Monte Carlo algorithm invented by John Skilling. Whereas most Monte Carlo methods aim to generate samples from a posterior or to estimate posterior expectations, nested sampling's central aim is to evaluate the evidence (the normalizing constant, also known as the marginal likelihood or partition function). This important quantity can be computed by standard Monte Carlo methods (such as Gibbs sampling) only by adding extra computations (such as reversiblejump Monte Carlo or thermodynamic integration) which require careful tuning. I will review nested sampling and describe tests of the method on graphical models. (Joint work with Iain Murray and John Skilling) Related Links

INI 1  , 
11:00 to 11:30  Coffee  
11:30 to 12:30 
D Crisan ([Imperial]) Sequential Monte Carlo methods: can we replace the resampling step? Sequential Monte Carlo methods provide reliable approximations of the conditional ditribution of a certain signal process given the data obtined from an associated observation process. The generic SMC method involves sampling from the prior distribution of the signal and then using a weighted bootstrap technique (or equivalent) with weights defined by the likelihood of the most recent observation data. If the number of updating stages becomes large, the repeated application of the the weighted bootstrap may lead to what the literature describes as "impoverished sample" or "sample attrition". That means that the sample being carried forward will have fewer and fewer distict values. In this talk, I propose a method that attempts to solve this problem for the continuous time filtering problem. The method replaces the sampling from the prior step with sampling from a distribution that depends on the entire (existing) sample and the most recent observation data and it does not contain the weighted bootstrap step. The method is motivated by recent advances in the area of McKeanVlasov representations for solutions of stochastic PDEs and their application to solving the filtering problem in a continuous time framework. 
INI 1  , 
12:30 to 13:30  Lunch at Churchill College  
14:00 to 15:15 
O Papaspiliopoulos ([Lancaster]) Importance sampling for diffusion processes In this talk I will discuss various techniques for constructing importance sampling estimators which approximate expectations (e.g the likelihood function, filtering densities, etc) when modelling with diffusion processes. 
INI 1  , 
15:15 to 15:45  Tea  
15:45 to 17:00 
D Wilkinson ([Newcastle]) Bayesian inference for nonlinear multivariate diffusion processes In this talk I will give an overview of the problem of conducting Bayesian inference for the fixed parameters of nonlinear multivariate diffusion processes observed partially, discretely, and possibly with error. I will present a sequential strategy based on either SIR or MCMCbased filtering for approximate diffusion bridges, and a "global" MCMC algorithm that does not degenerate as the degree of data augmentation increases. The relationship of these techniques to methods of approximate Bayesian computation will be highlighted. 
INI 1  , 
18:00 to 19:30  CUP Bookshop Reception  
20:00 to 18:00  Conference Dinner at Peterhouse (Dining Hall) 
09:00 to 10:00 
D Frenkel ([FOMInstitute]) ConfigurationallyBiased MC and Virtualmove parallel tempering Parallel tempering is a Monte Carlo scheme that was introduced by Geyer in 1991 as a tool to boost the efficiency of Monte Carlo sampling Mendelian genetics. The technique has found widespread application in statistical mechanics. I will describe how, in some cases, the efficiency of the method may be increased by including information about MC trial moves that are excluded from the Markov chain. If time allows, I will discuss Biased Monte Carlo schemes that allow us to sample the conformations of composite objects, such as polymers. 
INI 1  , 
10:00 to 11:00 
P Green ([Bristol]) Branching process Monte Carlo This talk is an exploration of the possible role for branching processes and related models in Monte Carlo simulation from a complex distribution, such as a Bayesian posterior. The motivation is that branching processes can support antithetic behaviour in a natural way by making offspring negatively correlated, and also that branching paths may assist in navigating past slowlymixing parts of the state space. The basic theory of branching processes as used for sampling is established, including the appropriate analogue of global balance with respect to the target distribution, evaluation of moments, in particular asymptotic variances, and a start on the spectral theory. Although our model is a kind of 'population Monte Carlo', it should be noted that it has virtually nothing to do with particle filters, etc. Our target is not sequentially evolving, and we rely on ergodicity for convergence of functionals of the target distribution, rather than using importance sampling. This is joint work with Antonietta Mira (University of Insubria, Varese, Italy). 
INI 1  , 
11:00 to 11:30  Coffee  
12:30 to 13:30  Lunch at Churchill College  
14:00 to 15:15 
D Wales (University of Cambridge) Sampling the energy landscape: thermodynamics and rates Stationary points of the potential energy surface provide a natural way to coarsegrain calculations of thermodynamics and kinetics, as well as a framework for basinhopping global optimisation. Thermodynamic properties can be obtained from samples of local minima using the basinsampling approach, and kinetic information can be extracted if the samples are extended to include transition states. Using statistical rate theory a minimumtominimum rate constant can be associated with each transition state, and phenomenological rates between sets of local minima that define thermodynamic states of interest can be calculated using a new graph transformation approach. Since the number of stationary points grows exponentially with system size a sampling scheme is required to produce representative pathways. The discrete path sampling approach provides a systematic way to achieve this objective once a single connected path between products and reactants has been located. In large systems such paths may involve dozens of stationary points of the potential energy surface. New algorithms have been developed for both geometry optimisation and making connections between distant local minima, which have enabled rates to be calculated for a wide variety of systems. 
INI 1  , 
15:15 to 15:45  Tea  
15:45 to 17:00 
P L'Ecuyer ([Montreal]) Randomized quasiMonte Carlo for Markov Chains QuasiMonte Carlo (QMC) methods are numerical techniques for estimating largedimensional integrals, usually over the unit hypercube. They can be applied, at least in principle, to any stochastic simulation whose aim is to estimate a mathematical expectation. This covers a wide range of applications. Practical error bounds are hard to obtain with QMC but randomized quasiMonte Carlo (RQMC) permits one to compute an unbiased estimator of the integral, together with a confidence interval. RQMC can in fact be viewed as a variancereduction technique. In this talk, we review some key ideas of RQMC methods and provide concrete examples of their application to simulate systems modeled as Markov chains. We also present a new RQMC method, called arrayRQMC, recently introduced to simulate Markov chains over a large number of steps. Our numerical illustrations indicate that RQMC can dramatically reduce the variance compared with standard Monte Carlo. 
INI 1  , 
18:15 to 18:45  Dinner at New Hall (Residents Only) 