Analyzing infectious disease data presents unique challenges for epidemiologists and biostatisticians. Whereas a person’s risk of chronic disease depends only on their own exposures and risk factors, the risk of acquiring an infection inherently depends on whether others in the population are infected. Consequently, traditional statistical methods that assume outcomes are independent cannot always be applied to infectious disease data, and novel statistical methods are often needed. Furthermore, to understand the potential impact of interventions, one must account for the non-linear feedbacks that give rise to population patterns of infection and disease. This requires blending epidemiologic methods with ecologic principles to develop models for transmission dynamics, models that play an integral role in our understanding of infectious disease epidemiology (Figure).
An excellent illustration of the synthesis of data collection, analysis, and transmission dynamic models is provided by Lipsitch and colleagues.1 The authors analyze longitudinal data on Streptococcus pneumoniae nasopharyngeal carriage among children from Kilifi, Kenya. The study design and data collection themselves represent a remarkable feat, with baseline assessment of the carrier status of 2840 children and multiple follow-up assessments of 1868 children who were positive at baseline.1–3 Analyzing these data to obtain unbiased estimates of the serotype-specific rates of acquisition, clearance, and resistance to competition for 27 unique pneumococcal serotypes presents a considerable added challenge. In addition to individual risk factors such as age, a child’s ability to clear colonization could depend on the resident serotype, how often the child is exposed to someone shedding a different serotype, how resistant the resident serotype is to competition from other serotypes, and how effectively those other serotypes compete. Thus, all the rates of interest are dependent on one another.
To address this challenge, Lipsitch et al. apply a Markov transition model.1 This approach allows them to estimate the competing risks of clearance of the baseline serotype and switching to each of the other serotypes. It also allows for estimation of the rate of acquisition for each of the 27 serotypes from the baseline prevalence distribution. This approach has considerable advantages over previous attempts to estimate serotype-specific parameters for S. pneumoniae, which were limited because they grouped vaccine serotypes and non-vaccine serotypes, estimated parameters for only a limited number of serotypes, assumed certain parameters were the same for all serotypes, or defined separate models for each serotype.3–9 The large, longitudinal cohort study conducted in Kilifi presents a unique opportunity to estimate these parameters, providing data that are essential for understanding competition and coexistence among pneumococcal serotypes.
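The competing-risks idea behind such a Markov transition model can be illustrated with a minimal sketch. It is not the parameterization used by Lipsitch et al.; it simply assumes constant hazards, so that a child carrying a given serotype either clears it or switches to another serotype, and it reports only the first event within the follow-up interval. The function name, the serotype labels, and all rate values are hypothetical and chosen for illustration, not estimates from the Kilifi data.

```python
import math


def first_event_probabilities(clearance_rate, switch_rates, t):
    """Probabilities of the first event within time t for a carrier.

    Assumes constant (exponential) hazards: one hazard for clearing the
    resident serotype and one hazard per competing serotype for switching.
    Only the first event is considered; later transitions are ignored.
    """
    total_rate = clearance_rate + sum(switch_rates.values())
    if total_rate == 0:
        # No hazard at all: the child keeps the resident serotype.
        return {"stay": 1.0, "clear": 0.0,
                "switch": {s: 0.0 for s in switch_rates}}

    # Probability that any event has occurred by time t.
    p_any_event = 1.0 - math.exp(-total_rate * t)

    # Each competing risk takes a share of that probability
    # proportional to its hazard.
    return {
        "stay": 1.0 - p_any_event,
        "clear": clearance_rate / total_rate * p_any_event,
        "switch": {s: r / total_rate * p_any_event
                   for s, r in switch_rates.items()},
    }


# Hypothetical daily rates, for illustration only.
probs = first_event_probabilities(
    clearance_rate=0.02,
    switch_rates={"19F": 0.01, "6A": 0.01},
    t=30,
)
print(probs)
```

Because each hazard takes a share of the same total, raising the switching rate for one serotype lowers the probability of every other outcome. This is a simple way to see why the rates of interest cannot be estimated independently of one another.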