
Results 1-11 (11)

1.  Revealing the Complexity of Health Determinants in Resource-poor Settings 
American Journal of Epidemiology  2012;176(11):1051-1059.
An epidemiologic systems analysis of diarrhea in children in Pakistan is presented. Application of additive Bayesian network modeling to 2005–2006 data from the Pakistan Social and Living Standards Measurement Survey reveals the complexity of child diarrhea as a disease system. The key distinction between standard analytical approaches, such as multivariable regression, and Bayesian network analyses is that the latter attempt to not only identify statistically associated variables but also, additionally and empirically, separate these into those directly and indirectly dependent upon the outcome variable. Such discrimination is vastly more ambitious but has the potential to reveal far more about key features of complex disease systems. Additive Bayesian network analyses across 41 variables from the Pakistan Social and Living Standards Measurement Survey identified 182 direct dependencies but with only 3 variables: 1) access to a dry pit latrine (protective; odds ratio = 0.67); 2) access to an atypical water source (protective; odds ratio = 0.49); and 3) no formal garbage collection (unprotective; odds ratio = 1.32), supported as directly dependent with the presence of diarrhea. All but 2 of the remaining variables were also, in turn, directly or indirectly dependent upon these 3 key variables. These results are contrasted with the use of a standard approach (multivariable regression).
PMCID: PMC3571241  PMID: 23139247
Bayesian network; diarrhea; epidemiologic determinants; graphical model; socioeconomic factors
2.  T-cell reprogramming through targeted CD4-coreceptor and T-cell receptor expression on maturing thymocytes by latent Circoviridae family member porcine circovirus type 2 cell infections in the thymus 
Although porcine circovirus type 2 (PCV2)-associated diseases have been evaluated for known immune evasion strategies, the pathogenicity of these viruses remained concealed for decades. Surprisingly, the same viruses that cause panzootics in livestock are widespread in young, unaffected animals. Recently, evidence has emerged that circovirus-like viruses are also linked to complex diseases in humans, including children. We detected PCV2 genome-carrying cells in fetal pig thymi. To elucidate virus pathogenicity, we developed a new pig infection model by in vivo transfection of recombinant PCV2 and the immunosuppressant cofactor cyclosporine A. Using flow cytometry, immunofluorescence and fluorescence in situ hybridization, we found evidence that PCV2 dictates positive and negative selection of maturing T cells in the thymus. We show for the first time that PCV2-infected cells reside at the corticomedullary junction of the thymus. In diseased animals, we found polyclonal deletion of single positive cells (SPs) that may result from a loss of major histocompatibility complex class-II expression at the corticomedullary junction. The percentage of PCV2 antigen-presenting cells correlated with the degree of viremia and, in turn, the severity of the defect in thymocyte maturation. Moreover, the reversed T-cell receptor/CD4-coreceptor expression dichotomy on thymocytes at the CD4+CD8interm and CD4SP cell stage is viremia-dependent, resulting in a specific hypo-responsiveness of T-helper cells. We compare our results with the only other better-studied member of Circoviridae, chicken anemia virus. Our data show that PCV2 infection leads to thymocyte selection dysregulation, adding a valuable dimension to our understanding of virus pathogenicity.
PMCID: PMC4355439  PMID: 26038767
adaptive immune response failure; CD4+ thymocyte maturation diversion; Circoviridae porcine circovirus type 2 pathogenicity; dendritic cell feedback; in vivo anergy; polyclonal negative selection; T-helper cell hypo-responsiveness; thymic kinetic signaling model
3.  Dynamics of the Force of Infection: Insights from Echinococcus multilocularis Infection in Foxes 
Characterizing the force of infection (FOI) is an essential part of planning cost-effective control strategies for zoonotic diseases. Echinococcus multilocularis is the causative agent of alveolar echinococcosis in humans, a serious disease with a high fatality rate and an increasing global spread. Red foxes are high-prevalence hosts of E. multilocularis. Through a mathematical modelling approach, using field data collected from in and around the city of Zurich, Switzerland, we find compelling evidence that the FOI is periodic with highly variable amplitude, and, while this amplitude is similar across habitat types, the mean FOI differs markedly between urban and periurban habitats, suggesting a considerable risk differential. The FOI, during an annual cycle, ranges from (0.1, 0.8) insults (95% CI) in urban habitat in the summer to (9.4, 9.7) (95% CI) in periurban (rural) habitat in winter. Such large temporal and spatial variations in FOI suggest that control strategies are optimal when tailored to local FOI dynamics.
Author Summary
Human alveolar echinococcosis (AE) is caused by the fox tapeworm E. multilocularis and has a high fatality rate if untreated. The frequency of the tapeworm in foxes can be reduced through the regular distribution of anthelmintic baits, thereby decreasing the risk of zoonotic transmission. Here, we estimate the force of infection to foxes using a mathematical model and data from necropsied foxes. The results suggest that the frequency of anthelmintic baiting of foxes can be optimised to local variations in transmission that depend upon season and type of fox habitat.
PMCID: PMC3961194  PMID: 24651596
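The idea of a periodic force of infection whose mean differs by habitat can be illustrated with a minimal sketch. This is not the model fitted in the paper: the cosine form, the function name `periodic_foi`, its parameters, and any values passed to it are assumptions chosen only for illustration.

```python
import math


def periodic_foi(t_years, mean, amplitude, peak_time=0.0):
    """Hypothetical periodic force of infection with a one-year period.

    t_years   : time in years (fractions give the position within the year)
    mean      : average FOI over a full annual cycle (assumed to vary by habitat)
    amplitude : size of the seasonal swing around the mean
    peak_time : time of year (in years) at which the FOI is highest
    """
    value = mean + amplitude * math.cos(2.0 * math.pi * (t_years - peak_time))
    # A force of infection cannot be negative, so truncate at zero.
    return max(0.0, value)
```

With the same amplitude but different means, two habitats share the seasonal swing while differing in overall level, which is the qualitative pattern the abstract describes.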
4.  Improving epidemiologic data analyses through multivariate regression modelling 
Regression modelling is one of the most widely utilized approaches in epidemiological analyses. It provides a method of identifying statistical associations, from which potential causal associations relevant to disease control may then be investigated. Multivariable regression – a single dependent variable (outcome, usually disease) with multiple independent variables (predictors) – has long been the standard model. Generalizing multivariable regression to multivariate regression – all variables potentially statistically dependent – offers a far richer modelling framework. Through a series of simple illustrative examples we compare and contrast these approaches. The technical methodology used to implement multivariate regression is well established – Bayesian network structure discovery – and, while a relative newcomer to the epidemiological literature, it has a long history in computing science. Applications of multivariate analysis in epidemiological studies can provide a greater understanding of disease processes at the population level, leading to the design of better disease control and prevention programs.
PMCID: PMC3691873  PMID: 23683753
5.  A tutorial in estimating the prevalence of disease in humans and animals in the absence of a gold standard diagnostic 
Epidemiological methods for estimating disease prevalence in humans and other animals in the absence of a gold standard diagnostic test are well established. Despite this, reporting apparent prevalence is still standard practice in public health studies and disease control programmes, even though apparent prevalence may differ greatly from the true prevalence of disease. Methods for estimating true prevalence are summarized and reviewed. A computing appendix is also provided, which contains a brief guide on how to implement some of the methods presented using freely available software.
PMCID: PMC3558341  PMID: 23270542
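To make the gap between apparent and true prevalence concrete, here is a minimal sketch of the classical Rogan-Gladen correction. It is offered as one well-known estimator, not necessarily a method covered in the tutorial, and it assumes the test's sensitivity and specificity are known, which is exactly what is uncertain when no gold standard exists. The function name `rogan_gladen` is hypothetical.

```python
def rogan_gladen(apparent_prevalence, sensitivity, specificity):
    """Estimate true prevalence from apparent prevalence (Rogan-Gladen).

    Assumes sensitivity and specificity are known constants, which is a
    simplification when no gold standard test is available.
    """
    denominator = sensitivity + specificity - 1.0
    if denominator <= 0.0:
        raise ValueError("test is uninformative: sensitivity + specificity must exceed 1")
    estimate = (apparent_prevalence + specificity - 1.0) / denominator
    # Sampling noise can push the raw estimate outside [0, 1]; clamp it.
    return min(1.0, max(0.0, estimate))
```

For an imperfect test the corrected estimate can differ noticeably from the apparent prevalence, and for a perfect test (sensitivity and specificity both 1) the two coincide.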
6.  Identifying associations between pig pathologies using a multi-dimensional machine learning methodology 
Abattoir-detected pathologies are of crucial importance to both pig production and food safety. Usually, more than one pathology coexists in a pig herd, although it often remains unknown how these different pathologies interrelate. Identification of the associations between different pathologies may facilitate an improved understanding of their underlying biological linkage, and support veterinarians in encouraging control strategies aimed at reducing the prevalence of not just one, but two or more conditions simultaneously.
Multi-dimensional machine learning methodology was used to identify associations between ten typical pathologies in 6485 batches of slaughtered finishing pigs, assisting the comprehension of their biological association. Pathologies potentially associated with septicaemia (e.g. pericarditis, peritonitis) appear interrelated, suggesting on-going bacterial challenges by pathogens such as Haemophilus parasuis and Streptococcus suis. Furthermore, hepatic scarring appears interrelated with both milk spot livers (Ascaris suum) and bacteria-related pathologies, suggesting a potential multi-pathogen nature for this pathology.
The application of novel multi-dimensional machine learning methodology provided new insights into how typical pig pathologies are potentially interrelated at batch level. The methodology presented is a powerful exploratory tool to generate hypotheses, applicable to a wide range of studies in veterinary research.
PMCID: PMC3483212  PMID: 22937883
7.  Network modeling of BVD transmission 
Veterinary Research  2012;43(1):11.
Endemic diseases of cattle, such as bovine viral diarrhea, have a significant impact on the production efficiency of food of animal origin, with consequences for animal welfare and climate change reduction targets. Many modeling studies focus on the local scale, examining the on-farm dynamics of this infectious disease. However, insight into prevalence and control across a network of farms ultimately requires a network-level approach. Here, we implement understanding of infection dynamics, gained through these detailed on-farm modeling studies, to produce a national-scale model of bovine viral diarrhea virus transmission. The complex disease epidemiology and on-farm dynamics are approximated using SIS dynamics with each farm treated as a single unit. Using a top-down approach, we estimate on-farm parameters associated with contraction and subsequent clearance from infection at herd level. We examine possible control strategies associated with animal movements between farms and find measures targeted at a small number of high-movement farms efficient for rapid and sustained prevalence reduction.
PMCID: PMC3295666  PMID: 22325043
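The farm-level SIS approximation can be sketched as a single discrete time step on a movement network. This is an illustrative toy, not the authors' model: the function name `sis_step`, the per-movement infection probability `beta`, and the per-step clearance probability `gamma` are all assumptions.

```python
import random


def sis_step(infected, movements, beta, gamma, rng):
    """Advance farm infection states by one time step (hypothetical sketch).

    infected  : set of farm ids that are currently infected
    movements : list of (source, destination) animal movements this step
    beta      : probability that a movement from an infected farm
                infects a susceptible destination farm
    gamma     : probability that an infected farm clears infection this step
    rng       : random.Random instance, passed in for reproducibility
    """
    newly_infected = {
        dst
        for src, dst in movements
        if src in infected and dst not in infected and rng.random() < beta
    }
    # Iterate in sorted order so a seeded rng gives reproducible results.
    cleared = {farm for farm in sorted(infected) if rng.random() < gamma}
    # Cleared farms return to the susceptible state (the second S in SIS).
    return (infected - cleared) | newly_infected
```

Restricting or removing entries in `movements` that originate from high-movement farms is one way to experiment with the kind of targeted control the abstract mentions.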
8.  Spidermonkey: rapid detection of co-evolving sites using Bayesian graphical models 
Bioinformatics  2008;24(17):1949-1950.
Spidermonkey is a new component of the Datamonkey suite of phylogenetic tools that provides methods for detecting coevolving sites from a multiple alignment of homologous nucleotide or amino acid sequences. It reconstructs the substitution history of the alignment by maximum likelihood-based phylogenetic methods, and then analyzes the joint distribution of substitution events using Bayesian graphical models to identify significant associations among sites.
Availability: Spidermonkey is publicly available both as a web application and as a stand-alone component of the phylogenetic software package HyPhy, which is freely distributed on the web as precompiled binaries and open source.
PMCID: PMC2732215  PMID: 18562270
9.  Bayesian inference for within-herd prevalence of Leptospira interrogans serovar Hardjo using bulk milk antibody testing 
Biostatistics (Oxford, England)  2009;10(4):719-728.
Leptospirosis is the most widespread zoonosis throughout the world and human mortality from severe disease forms is high even when optimal treatment is provided. Leptospirosis is also one of the most common causes of reproductive losses in cattle worldwide and is associated with significant economic costs to the dairy farming industry. Herds are tested for exposure to the causal organism either through serum testing of individual animals or through testing bulk milk samples. Using serum results from a commonly used enzyme-linked immunosorbent assay (ELISA) test for Leptospira interrogans serovar Hardjo (L. hardjo) on samples from 979 animals across 12 Scottish dairy herds and the corresponding bulk milk results, we develop a model that predicts the mean proportion of exposed animals in a herd conditional on the bulk milk test result. The data are analyzed through use of a Bayesian latent variable generalized linear mixed model to provide estimates of the true (but unobserved) level of exposure to the causal organism in each herd in addition to estimates of the accuracy of the serum ELISA. We estimate 95% confidence intervals for the accuracy of the serum ELISA of (0.688, 0.987) and (0.975, 0.998) for test sensitivity and specificity, respectively. Using a percentage positivity cutoff in bulk milk of at most 41% ensures that there is at least a 97.5% probability of less than 5% of the herd being exposed to L. hardjo. Our analyses provide strong statistical evidence in support of the validity of interpreting bulk milk samples as a proxy for individual animal serum testing. The combination of validity and cost-effectiveness of bulk milk testing has the potential to reduce the risk of human exposure to leptospirosis in addition to offering significant economic benefits to the dairy industry.
PMCID: PMC2742498  PMID: 19628639
Bayesian; Latent class analysis; Leptospirosis
10.  An Evolutionary-Network Model Reveals Stratified Interactions in the V3 Loop of the HIV-1 Envelope 
PLoS Computational Biology  2007;3(11):e231.
The third variable loop (V3) of the human immunodeficiency virus type 1 (HIV-1) envelope is a principal determinant of antibody neutralization and progression to AIDS. Although it is undoubtedly an important target for vaccine research, extensive genetic variation in V3 remains an obstacle to the development of an effective vaccine. Comparative methods that exploit the abundance of sequence data can detect interactions between residues of rapidly evolving proteins such as the HIV-1 envelope, revealing biological constraints on their variability. However, previous studies have relied implicitly on two biologically unrealistic assumptions: (1) that founder effects in the evolutionary history of the sequences can be ignored, and (2) that statistical associations between residues occur exclusively in pairs. We show that comparative methods that neglect the evolutionary history of extant sequences are susceptible to a high rate of false positives (20%–40%). Therefore, we propose a new method to detect interactions that relaxes both of these assumptions. First, we reconstruct the evolutionary history of extant sequences by maximum likelihood, shifting focus from extant sequence variation to the underlying substitution events. Second, we analyze the joint distribution of substitution events among positions in the sequence as a Bayesian graphical model, in which each branch in the phylogeny is a unit of observation. We perform extensive validation of our models using both simulations and a control case of known interactions in HIV-1 protease, and apply this method to detect interactions within V3 from a sample of 1,154 HIV-1 envelope sequences. Our method greatly reduces the number of false positives due to founder effects, while capturing several higher-order interactions among V3 residues. By mapping these interactions to a structural model of the V3 loop, we find that the loop is stratified into distinct evolutionary clusters. We extend our model to detect interactions between the V3 and C4 domains of the HIV-1 envelope, and account for the uncertainty in mapping substitutions to the tree with a parametric bootstrap.
Author Summary
The third variable loop (V3) of the human immunodeficiency virus type 1 (HIV-1) envelope is a principal determinant of viral growth characteristics and an important target for the immune system. Interactions between residues of V3 allow the virus to shift between combinations of residues to escape the immune system while retaining its structure and functions. Comparative study of HIV-1 V3 sequences can detect such interactions by the covariation of sites in the sequence, which can then be used to inform vaccine development, but current methods for detecting such associations rely on biologically unrealistic assumptions. We demonstrate that these assumptions cause an excessive number of spurious associations, and present a new approach that couples phylogenetic and Bayesian network models, and greatly reduces this number while retaining the ability to detect real associations. Our analysis reveals that the V3 loop is stratified into discrete layers of interacting residues, suggesting a partition of functions along this viral structure with implications for vaccine development.
PMCID: PMC2082504  PMID: 18039027
11.  Evolutionary Interactions between N-Linked Glycosylation Sites in the HIV-1 Envelope 
PLoS Computational Biology  2007;3(1):e11.
The addition of asparagine (N)-linked polysaccharide chains (i.e., glycans) to the gp120 and gp41 glycoproteins of human immunodeficiency virus type 1 (HIV-1) envelope is not only required for correct protein folding, but also may provide protection against neutralizing antibodies as a “glycan shield.” As a result, strong host-specific selection is frequently associated with codon positions where nonsynonymous substitutions can create or disrupt potential N-linked glycosylation sites (PNGSs). Moreover, empirical data suggest that the individual contribution of PNGSs to the neutralization sensitivity or infectivity of HIV-1 may be critically dependent on the presence or absence of other PNGSs in the envelope sequence. Here we evaluate how glycan–glycan interactions have shaped the evolution of HIV-1 envelope sequences by analyzing the distribution of PNGSs in a large-sequence alignment. Using a “covarion”-type phylogenetic model, we find that the rates at which individual PNGSs are gained or lost vary significantly over time, suggesting that the selective advantage of having a PNGS may depend on the presence or absence of other PNGSs in the sequence. Consequently, we identify specific interactions between PNGSs in the alignment using a new paired-character phylogenetic model of evolution, and a Bayesian graphical model. Despite the fundamental differences between these two methods, several interactions are jointly identified by both. Mapping these interactions onto a structural model of HIV-1 gp120 reveals that negative (exclusive) interactions occur significantly more often between colocalized glycans, while positive (inclusive) interactions are restricted to more distant glycans. Our results imply that the adaptive repertoire of alternative configurations in the HIV-1 glycan shield is limited by functional interactions between the N-linked glycans. This represents a potential vulnerability of rapidly evolving HIV-1 populations that may provide useful glycan-based targets for neutralizing antibodies.
Author Summary
Many viruses exploit the complex machinery of the host cell to modify their own proteins, by the enzymatic addition of sugar molecules to specific amino acids. These sugars, or “glycans,” play several important roles in the infective cycle of the virus. The envelope of the human immunodeficiency virus type 1 (HIV-1), for example, becomes coated with so many glycans that the virus can become invisible to the protein-specific immune response of the host. Although some glycans are evolutionarily conserved, many others may be present within some hosts but absent in others, and may even appear or disappear over the course of an infection in a single host. To understand this variability, we have analyzed HIV-1 envelope sequences to identify cases where the presence of one glycan was dependent on the presence or absence of another (called glycan–glycan interactions). We used two newly developed computational methods to detect these interactions, thereby providing conclusive evidence of a new fundamental pattern: the glycans that exclude each other tend to occur near the same spot on the envelope, whereas glycans that occur together tend to be far apart.
PMCID: PMC1779302  PMID: 17238283
