Learning in high dimensions, noise, sparsity and treelets
Seminar Room 1, Newton Institute
In recent years there has been a growing practical need to perform learning (classification, regression, etc.) in high-dimensional settings where $p \gg n$. Consequently, instead of the standard limit $n\to\infty$, learning algorithms are typically analyzed in the joint limit $p,n\to\infty$. In this talk we present a different approach that keeps $p,n$ fixed but treats the noise level as a small parameter. The resulting perturbation analysis reveals the importance of a robust low-dimensional representation of the noise-free signals, the possible failure of simple variable selection methods, and the key role of sparsity for the success of learning in high dimensions. We also discuss sparsity in an a priori unknown basis and a possible data-driven adaptive construction of such a basis, called treelets. We present a few applications of our analysis, mainly to errors-in-variables linear regression problems, principal component analysis, and rank determination.