An Isaac Newton Institute Workshop

Stochastic Computation for the Analysis of Ecological and Epidemiological Data

Estimating mixing between subpopulations using respondent driven sampling

Authors: Art F.Y. Poon (UCSD), Sergei L. Kosakovsky Pond (UCSD), Simon D.W. Frost (UCSD)

Abstract

It is widely acknowledged that the level of mixing within a population plays an important role in the transmission dynamics of infectious diseases. However, information on mixing is notoriously difficult to obtain. Respondent-driven sampling (RDS), a form of chain-referral sampling, is an increasingly popular approach to sampling 'hidden' populations, such as injection drug users and men who have sex with men. In RDS, study participants receive a small number of coupons to pass on to friends or acquaintances who are potential participants. As a side effect of the recruitment process, RDS provides information on mixing between different subpopulations and, when individuals are asked about their relationship to the person who recruited them, on the extent of overlap between social and sexual networks. Current analytical techniques treat the recruitment process as a Markov chain, which is inappropriate because a participant may recruit more than one individual, so recruitment forms a branching tree rather than a single chain. We show how stochastic context-free grammars (SCFGs) can be used to model the tree-like recruitment process, which allows us to test for non-random mixing between subpopulations (e.g. infected/uninfected), for independence of characteristics among the recruitees of a given recruiter, and for differences in patterns of mixing between populations. We discuss the similarity of the recruitment process to a multitype branching process and to a stochastic susceptible-infected epidemiological model.
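
To make the tree-like recruitment process concrete, the following is a minimal illustrative sketch in Python of recruitment simulated as a two-type branching process, with recruiter-to-recruit counts tabulated into an empirical mixing matrix. It is not the authors' SCFG method. The function names (`simulate_recruitment`, `estimate_mixing`), the coupon limit, the number of waves, and the uniform draw for the number of recruits are all assumptions made for illustration.

```python
import random
from collections import Counter


def simulate_recruitment(mixing, seed_type, max_coupons=3, max_waves=5, rng=None):
    """Simulate one recruitment tree as a multitype branching process.

    mixing[a][b] is the assumed probability that a recruiter of type a
    recruits a person of type b. Each participant recruits between 0 and
    max_coupons people (drawn uniformly; an assumption, not from the source).
    Returns a list of (recruiter_type, recruit_type) edges.
    """
    rng = rng or random.Random()
    types = list(mixing)
    edges = []
    current_wave = [seed_type]
    for _wave in range(max_waves):
        next_wave = []
        for recruiter in current_wave:
            n_recruits = rng.randint(0, max_coupons)
            weights = [mixing[recruiter][t] for t in types]
            for _ in range(n_recruits):
                recruit = rng.choices(types, weights=weights)[0]
                edges.append((recruiter, recruit))
                next_wave.append(recruit)
        current_wave = next_wave
    return edges


def estimate_mixing(edges, types):
    """Row-normalise recruiter -> recruit counts into an empirical mixing matrix."""
    counts = Counter(edges)
    estimate = {}
    for a in types:
        total = sum(counts[(a, b)] for b in types)
        estimate[a] = {
            b: (counts[(a, b)] / total if total else float("nan")) for b in types
        }
    return estimate


if __name__ == "__main__":
    # Hypothetical mixing probabilities, chosen only to show assortative mixing.
    mixing = {
        "infected": {"infected": 0.8, "uninfected": 0.2},
        "uninfected": {"infected": 0.3, "uninfected": 0.7},
    }
    edges = simulate_recruitment(mixing, "infected", rng=random.Random(1))
    print(estimate_mixing(edges, list(mixing)))
```

Because each participant can have several recruits, the edges form a tree, and a single chain of states would discard the sibling structure that the abstract's independence test relies on.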