09:00 to 10:00 J-L Wang (UC Davis)
Covariate adjusted functional principal component analysis for longitudinal data
Chair: Sasha Tsybakov

Classical multivariate principal component analysis has been extended to functional data and termed functional principal component analysis (FPCA). Much progress has been made, but most existing FPCA approaches do not accommodate covariate information. The goal of this talk is to develop alternative approaches that incorporate covariate information in FPCA, especially for irregular or sparse functional data. Two approaches are studied: the first incorporates covariate effects only through the mean response function, whereas the second adjusts for covariate effects in both the mean and covariance functions of the response. Both new approaches can accommodate measurement errors and allow data to be sampled at regular or irregular time grids. Asymptotic results are developed and numerical support is provided through simulations and a data example. A comparison of the two approaches will also be discussed.
INI 1

10:00 to 11:00 V Koltchinskii (Georgia Institute of Technology)
Penalized empirical risk minimization and sparse recovery problems
Chair: Sasha Tsybakov

A number of problems in regression and classification can be stated as penalized empirical risk minimization over a linear span or a convex hull of a given dictionary with a convex loss and a convex complexity penalty, such as, for instance, the $\ell_1$-norm. We will discuss several oracle inequalities showing how the error of the solution of such problems depends on the "sparsity" of the problem and the "geometry" of the dictionary.
INI 1

11:00 to 11:30 Coffee

11:30 to 12:30 P Wolfe (Harvard)
The Nystrom extension and spectral methods in learning: low-rank approximation of quadratic forms and products
Chair: Sasha Tsybakov

Spectral methods are of fundamental importance in statistics and machine learning, as they underlie algorithms from classical principal components analysis to more recent approaches that exploit manifold structure. In most cases, the core technical problem can be reduced to computing a low-rank approximation to a positive-definite kernel. Motivated by such applications, we present here two new algorithms for the approximation of positive semi-definite kernels, together with error bounds that improve upon known results. The first of these, based on sampling, leads to a randomized algorithm whereupon the kernel induces a probability distribution on its set of partitions, whereas the latter approach, based on sorting, provides for the selection of a partition in a deterministic way. After detailing their numerical implementation and verifying performance via simulation results for representative problems in statistical data analysis, we conclude with an extension of these results to the sparse representation of linear operators and the efficient approximation of matrix products.
INI 1

12:30 to 13:30 Lunch at Wolfson Court

14:00 to 14:20 G Pan (EURANDOM)
Limiting theorems for large dimensional sample means, sample covariance matrices and Hotelling's $T^2$ statistics
Chair: Doug Nychka

It is well known that sample means and sample covariance matrices are independent if the samples are i.i.d. from a Gaussian distribution.
In this talk, by investigating random quadratic forms involving sample means and sample covariance matrices, we propose the conjecture that the sample means and the sample covariance matrices under general distribution functions are asymptotically independent in the large dimensional case, when the dimension of the vectors and the sample size both go to infinity with their ratio being a positive constant. As a byproduct, the central limit theorem for the Hotelling $T^2$ statistic in the large dimensional case is established.
INI 1

14:20 to 14:40 JQ Shi (Newcastle)
Generalised Gaussian process functional regression model
Chair: Doug Nychka

In this talk, I will discuss a functional regression problem with a non-Gaussian functional (longitudinal) response and functional predictors. This type of problem includes, for example, binomial and Poisson response data, which occur in many biomedical and engineering experiments. We propose a generalised Gaussian process functional regression model for such regression situations. We suppose that there exists an underlying latent process between the inputs and the response. The latent process is defined by a Gaussian process functional regression model, which is connected with the stepwise response data by means of a link function.
INI 1

14:40 to 15:00 Y Wang (NSF)
Estimation of large volatility matrix for high-frequency financial data
Chair: Doug Nychka

Statistical theory for estimating large covariance matrices shows that, even for noiseless synchronized high-frequency financial data, the existing realized volatility based estimators of the integrated volatility matrix of p assets are inconsistent for large p (the number of assets) and large n (the sample size for high-frequency data). This paper proposes new types of estimators of the integrated volatility matrix for noisy non-synchronized high-frequency data.
We show that when both n and p go to infinity with p/n approaching a constant, the proposed estimators are consistent with good convergence rates. Our simulations demonstrate the excellent performance of the proposed estimators under complex stochastic volatility matrices. We have applied the methods to high-frequency data with over 600 stocks.
INI 1

15:00 to 15:30 Tea

15:30 to 16:30 D Barber (UC London)
Graph decomposition for community identification and covariance constraints
Chair: Doug Nychka

One application in large databases is finding well-connected clusters of nodes in an undirected graph where a link represents interaction between objects. Examples include finding tight-knit communities in social networks, identifying related product clusters in collaborative filtering, and finding genes which collaborate in different biological functions. Unlike graph partitioning, in this scenario an object may belong to more than one community: for example, a person might belong to more than one group of friends, or a gene may be active in more than one gene network. I'll discuss an approach to identifying such overlapping communities based on extending the incidence matrix decomposition of a graph to a clique decomposition. Clusters are then identified by approximate variational (mean-field) inference in a related probabilistic model. The resulting decomposition has the side-effect of enabling a parameterisation of positive definite matrices under zero-constraints on entries in the matrix. Provided the graph corresponding to the constraints is decomposable, all such matrices are reachable by this parameterisation. In the non-decomposable case, we show how the method forms an approximation of the space and relate it to more standard latent variable parameterisations of zero-constrained covariances.
INI 1

16:30 to 17:30 E Levina (Michigan)
Permutation-invariant covariance regularisation in high dimensions
Chair: Doug Nychka

Estimation of covariance matrices has a number of applications, including principal component analysis, classification by discriminant analysis, and inferring independence and conditional independence between variables. However, the sample covariance matrix has many undesirable features in high dimensions unless regularized. Recent research has mostly focused on regularization in situations where variables have a natural ordering. When no such ordering exists, regularization must be performed in a way that is invariant under variable permutations. This talk will discuss several new sparse covariance estimators that are invariant to variable permutations. We obtain convergence rates that make explicit the trade-offs between the dimension, the sample size, and the sparsity of the true model, and illustrate the methods on simulations and real data. We will also discuss a method for finding a "good" ordering of the variables when it is not provided, based on Isomap, a manifold projection algorithm. The talk includes joint work with Adam Rothman, Amy Wagaman, Ji Zhu (University of Michigan) and Peter Bickel (UC Berkeley).
INI 1

19:30 to 23:00 Conference Dinner - Lucy Cavendish College (Hall)