Sequential stopping for high-throughput experiments
Seminar Room 1, Newton Institute
In high-throughput experiments, sample size is typically chosen informally. Although formal sample size calculations have been proposed, they depend critically on prior knowledge. We propose a sequential strategy that, by updating knowledge as new data become available, depends less critically on prior assumptions. Compared to fixed sample size approaches, our sequential strategy stops experiments early when enough evidence has accumulated, and recommends continuation when additional data are likely to provide valuable information. The approach is based on a decision-theoretic framework, which guarantees that the chosen design proceeds in a coherent fashion. We propose a utility function based on the number of true positives, which is straightforward to specify and intuitively appealing. As for most sequential design problems, an exact solution is computationally prohibitive. To address the computational challenge, and also to limit the dependence on an arbitrarily chosen utility function, we propose instead a simulation-based approximation with decision boundaries. The approach allows us to determine good designs within reasonable computing time and is characterized by intuitively appealing decision boundaries. We apply the method to next-generation sequencing, microarray and reverse-phase protein array studies. We show that it can lead to substantial increases in posterior expected utility. An implementation of the proposed approach is available in the Bioconductor package gaga.
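To make the idea of simulation-based sequential stopping concrete, the following is a minimal Python sketch under strong simplifying assumptions. It is not the method of the talk and not the interface of the gaga package. Every name and setting in it is hypothetical: each feature has mean either 0 or a known value mu, observations have unit variance, features are called when their posterior probability exceeds a fixed threshold, and the stopping rule is a myopic one-step look-ahead that stops once the simulated gain in expected true positives falls below a per-step sampling cost.

```python
import math
import random


def posterior_prob(xbar, n, mu=1.0, pi=0.1):
    """P(feature is non-null | mean xbar of n unit-variance observations).

    Assumed toy model: the feature mean is 0 (null) or mu (non-null),
    with prior probability pi of being non-null.
    """
    log_odds = math.log(pi / (1 - pi)) + n * (mu * xbar - mu * mu / 2)
    log_odds = max(-700.0, min(700.0, log_odds))  # avoid overflow in exp
    return 1 / (1 + math.exp(-log_odds))


def expected_true_positives(probs, threshold=0.5):
    """Posterior expected number of true positives among called features."""
    return sum(p for p in probs if p > threshold)


def one_step_gain(sums, n, rng, mu=1.0, pi=0.1, n_sim=50):
    """Monte Carlo estimate of the change in expected true positives
    from collecting one more observation per feature."""
    probs = [posterior_prob(s / n, n, mu, pi) for s in sums]
    current = expected_true_positives(probs)
    total = 0.0
    for _ in range(n_sim):
        future = []
        for s, p in zip(sums, probs):
            # Draw the feature's state from its posterior, then a new
            # observation from the corresponding sampling distribution.
            mean = mu if rng.random() < p else 0.0
            x_new = rng.gauss(mean, 1.0)
            future.append(posterior_prob((s + x_new) / (n + 1), n + 1, mu, pi))
        total += expected_true_positives(future)
    return total / n_sim - current


def sequential_experiment(truth, cost, max_n=20, mu=1.0, pi=0.1, seed=0):
    """Simulate an experiment and stop when one more step is not worth its cost.

    truth is a list of booleans marking the truly non-null features; it is
    used only to generate the simulated data, never by the stopping rule.
    """
    rng = random.Random(seed)
    sums = [0.0] * len(truth)
    for n in range(1, max_n + 1):
        # Collect one new observation per feature.
        sums = [s + rng.gauss(mu if t else 0.0, 1.0) for s, t in zip(sums, truth)]
        if one_step_gain(sums, n, rng, mu, pi) < cost:
            break
    probs = [posterior_prob(s / n, n, mu, pi) for s in sums]
    return n, probs
```

For example, `sequential_experiment([True] * 20 + [False] * 180, cost=0.5)` returns the sample size at which the toy rule stops together with the final posterior probabilities. A full decision-theoretic solution would look more than one step ahead; the one-step rule here is only a stand-in chosen to keep the sketch short.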