
Candidates vs. Noises Estimation for Large Multi-Class Classification Problem

Presented by: Tong Zhang
Thursday 28th June 2018 - 09:00 to 09:45
INI Seminar Room 1
In practice, there has been significant interest in multi-class classification problems where the number of classes is large. Computationally, such applications require statistical methods whose run time is sublinear in the number of classes. A number of methods, such as Noise-Contrastive Estimation (NCE) and its variations, have been proposed in recent years to address this problem. However, the existing methods are not statistically efficient compared to multi-class logistic regression, which is the maximum likelihood estimate. In this talk, I will describe a new method called Candidates vs. Noises Estimation (CANE) that selects a small subset of candidate classes and samples the remaining classes. We show that CANE is always consistent and computationally efficient. Moreover, the resulting estimator has low statistical variance, approaching that of the maximum likelihood estimator, when the observed label belongs to the selected candidates with high probability. Extensive experimental results show that CANE achieves better prediction accuracy than a number of state-of-the-art tree classifiers, while gaining a significant speedup over standard multi-class logistic regression.
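The cost argument in the abstract can be illustrated with a small sketch. It is not the CANE objective from the talk, which the abstract does not spell out. It only contrasts the full multi-class logistic regression log-likelihood, which touches every class score, with a plain softmax restricted to the observed label, a few candidate classes, and a handful of uniformly sampled noise classes. The function names, the uniform noise sampler, and the restricted-softmax form are all assumptions made for illustration. Unlike the estimator described above, this uncorrected restricted softmax is not claimed to be consistent.

```python
import math
import random


def full_log_likelihood(scores, label):
    """Multi-class logistic regression log-likelihood; reads all K scores."""
    m = max(scores)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return scores[label] - log_z


def subset_log_likelihood(scores, label, candidates, num_noise, rng):
    """Illustrative restricted softmax (an assumption, not the CANE objective).

    The normaliser runs only over the label, the candidate classes, and
    num_noise uniformly sampled other classes.
    Returns (log-likelihood, number of classes touched).
    """
    num_classes = len(scores)
    chosen = set(candidates) | {label}
    # Never ask for more noise classes than actually remain.
    num_noise = min(num_noise, num_classes - len(chosen))
    noise = set()
    while len(noise) < num_noise:
        # Rejection sampling avoids enumerating all K classes.
        k = rng.randrange(num_classes)
        if k not in chosen:
            noise.add(k)
    subset = sorted(chosen | noise)
    m = max(scores[k] for k in subset)
    log_z = m + math.log(sum(math.exp(scores[k] - m) for k in subset))
    return scores[label] - log_z, len(subset)


if __name__ == "__main__":
    rng = random.Random(0)
    scores = [rng.gauss(0.0, 1.0) for _ in range(100_000)]  # hypothetical scores
    full = full_log_likelihood(scores, label=3)
    sub, touched = subset_log_likelihood(scores, 3, [1, 2, 3], 20, rng)
    print(f"full: {full:.3f} over {len(scores)} classes")
    print(f"subset: {sub:.3f} over {touched} classes")
```

Because the restricted normaliser sums over fewer classes, the subset value is never smaller than the full log-likelihood. Closing that gap without paying the full cost is the kind of statistical-efficiency question the talk addresses.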