Presented by:
Thomas Steinke
Date:
Tuesday 22nd November 2016 - 15:30 to 16:30
Venue:
INI Seminar Room 2
Abstract:
Adaptivity is an important aspect of data analysis --
that is, the choice of questions to ask about a dataset is often informed by
previous use of the same dataset. However, statistical validity is typically
only guaranteed in a non-adaptive model, in which the questions must be specified
before the dataset is collected. A recent line of work initiated by Dwork et
al. (STOC 2015) provides a formal model for studying the power of adaptive data
analysis.
This talk will show that there are sophisticated
techniques -- using tools from information theory and differential privacy --
that enable us to ensure that adaptive data analysis provides statistically
valid answers that generalise to the overall population from which the dataset
was drawn. The talk will also discuss how adaptive data analysis is inherently
more powerful than non-adaptive data analysis: there is an exponential
separation between the number of adaptive queries needed to overfit a dataset
and the number of non-adaptive queries needed.
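As a rough illustration of why adaptivity can cause overfitting, the following toy sketch (not the construction from the talk) draws a dataset in which the features and the label are independent coin flips, so no query can truly predict the label. It first asks one non-adaptive query per feature (the empirical correlation with the label), then asks a single adaptive query built from those answers. The function name `run_demo` and the parameters `n`, `d`, `n_fresh` and `seed` are hypothetical choices made for this example.

```python
import random


def sign(v):
    # Ties are broken towards +1 so the output is always in {-1, +1}.
    return 1 if v >= 0 else -1


def run_demo(n=100, d=1000, n_fresh=1000, seed=0):
    """Toy demo: an adaptively chosen query overfits pure-noise data.

    All names and parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)

    def draw(m):
        # Features and labels are independent fair coin flips in {-1, +1},
        # so the true accuracy of any classifier is exactly 1/2.
        xs = [[rng.choice((-1, 1)) for _ in range(d)] for _ in range(m)]
        ys = [rng.choice((-1, 1)) for _ in range(m)]
        return xs, ys

    xs, ys = draw(n)

    # Round 1: d non-adaptive queries, one empirical correlation per feature.
    corr = [sum(x[j] * y for x, y in zip(xs, ys)) / n for j in range(d)]

    # Round 2: one adaptive query. Each feature votes with the sign of its
    # empirical correlation, which depends on the answers from round 1.
    votes = [sign(c) for c in corr]

    def accuracy(data_x, data_y):
        hits = 0
        for x, y in zip(data_x, data_y):
            pred = sign(sum(v * xj for v, xj in zip(votes, x)))
            hits += pred == y
        return hits / len(data_y)

    train_acc = accuracy(xs, ys)      # measured on the reused dataset
    fresh_x, fresh_y = draw(n_fresh)  # fresh draw from the same population
    return train_acc, accuracy(fresh_x, fresh_y)


train_acc, fresh_acc = run_demo()
print(f"accuracy on reused dataset: {train_acc:.2f}")
print(f"accuracy on fresh data:     {fresh_acc:.2f}")
```

Under these assumptions the adaptive query should look highly accurate on the reused dataset while staying close to chance on fresh data, which is the kind of generalisation failure the techniques in the talk are designed to prevent.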