Motivation: For samples of unrelated individuals, we propose a general analysis framework in which hundreds of thousands of genetic loci can be tested simultaneously for association with complex phenotypes. The approach is built on spatial-clustering methodology, assuming that genetic loci that are associated with the target phenotype cluster in certain genomic regions. In contrast to standard methodology for multilocus analysis, which has focused on the dimension reduction of the data, our multilocus association-clustering test profits from the availability of large numbers of genetic loci by detecting clusters of loci that are associated with the phenotype.
Results: The approach is computationally fast and powerful, enabling the simultaneous association testing of large genomic regions. Even the entire genome or certain chromosomes can be tested simultaneously. Using simulation studies, the properties of the approach are evaluated. In an application to a genome-wide association study for chronic obstructive pulmonary disease, we illustrate the practical relevance of the proposed method by simultaneously testing all genotyped loci of the genome-wide association study and by testing each chromosome individually. Our findings suggest that statistical methodology that incorporates spatial-clustering information will be especially useful in whole-genome sequencing studies in which millions or billions of base pairs are recorded and grouped by genomic regions or genes, and are tested jointly for association.
Availability and implementation: Implementation of the approach is available upon request.
Supplementary data are available at Bioinformatics online.
Targeted and stringent measures of tuberculosis prevention are necessary to achieve the goal of tuberculosis elimination in countries of low tuberculosis incidence.
We ascertained the knowledge about tuberculosis risk factors and stringency of tuberculosis prevention measures by a standardized questionnaire among physicians in Germany involved in the care of individuals from classical risk groups for tuberculosis.
510 physicians responded to the online survey. Among 16 risk factors, immunosuppressive therapy, HIV infection and treatment with TNF antagonists were thought to be the most important risk factors for the development of tuberculosis in Germany. Exposure to a patient with tuberculosis ranked 10th. In the event of a positive tuberculin skin test or interferon-γ release assay, only 50%, 40%, 36% and 25% of physicians found that preventive chemotherapy was indicated for individuals undergoing tumor necrosis factor-antagonist therapy, close contacts of tuberculosis patients, HIV-infected individuals and migrants, respectively.
A remarkably low proportion of individuals with latent infection with Mycobacterium tuberculosis belonging to classical risk groups for tuberculosis are considered candidates for preventive chemotherapy in Germany. Better knowledge about the risk for tuberculosis in different groups and more stringent and targeted preventive interventions will probably be necessary to achieve tuberculosis elimination in Germany.
Genetic association studies of longitudinal cognitive phenotypes are an alternate approach to discovering genetic risk factors for Alzheimer’s disease. However, the standard linear mixed model approach is limited in the face of multidimensional longitudinal data and multiple genotypes. In this setting, the principal components of heritability (PCH) approach may increase efficiency by deriving a linear combination of phenotypes to maximize the heritability attributable to a particular genetic locus. The current study investigated the performance of two PCH methods, the Principal Components of Heritability Association Test (PCHAT) and C2BAT, in detecting association of the known Alzheimer’s disease susceptibility allele APOE-ε4 with cognitive function at baseline and decline in cognition over time.
PCHAT, C2BAT, and standard linear mixed models were used to test for association between APOE-ε4 allele and performance on 19 neuropsychological tests using subjects without dementia at baseline from the Religious Orders Study (ROS) (n=693) and Memory and Aging Project (MAP) (n=778). Analyses were conducted across the three methods for three nested phenotype definitions (all 19 measures, executive function and episodic memory measures, and episodic memory only), and for baseline data only vs. longitudinal change.
In all cases, APOE-ε4 was significantly associated with baseline level of and change over time in cognitive function, and PCHAT and C2BAT yielded evidence of association comparable to or stronger than conventional methods.
PCHAT, C2BAT, and other PCH methods may have utility for genetic association studies of multidimensional cognitive and other phenotypes by maximizing genetic information while limiting multiple comparisons.
Principal components of heritability; multidimensional longitudinal data; cognitive decline; neuropsychological tests
Despite the numerous successful applications of GWASs, there has been much difficulty in discovering disease susceptibility loci (DSLs). This is due to the fact that the GWAS approach is an indirect mapping technique, often identifying markers. For the identification of DSLs, which is required for the understanding of the genetic pathways for complex diseases, sequencing data that examine every genetic locus directly are necessary. Yet there is currently a lack of methodology targeted at the identification of the DSLs in sequencing data: existing methods localize the causal variant to a region, but not to a single variant, and therefore do not allow one to identify unique loci that cause the phenotype association. Here, we have developed such a method to determine if there is evidence that an individual locus affects case-control status with sequencing data. This methodology differs from other rare variant approaches: rather than testing an entire region comprised of many loci for association with the phenotype, we can identify the individual genetic locus that causes the association between the phenotype and the genetic region. For each variant, the test determines if the pattern of LD across the other variants coincides with the pattern expected if that variant were a DSL. Power simulations show that the method successfully detects the causal variant, distinguishing it from other nearby variants (in high LD with the causal variant), and outperforms the standard tests. The efficiency of the method is especially apparent with small samples, which are currently realistic for studies due to sequence data costs. The practical relevance of the approach is illustrated by an application to a sequence dataset for nonsyndromic cleft lip with or without cleft palate. The proposed method implicated one variant (p=0.002, 0.062 after Bonferroni correction), which was not found by standard analyses. Code for implementation is available.
To study whether in vivo recruitment of dendritic cells (DCs) in response to antigen administration in the skin is altered during HIV-1 infection.
Skin punch biopsies were collected from HIV-1+ as well as seronegative individuals at 48 hours post intradermal injection of inactivated antigens of mumps virus, Candida albicans or purified protein derivative (PPD) from Mycobacterium tuberculosis.
Cryosections were analyzed by in situ staining and computerized imaging.
Control skin biopsies showed that there was no difference in the number of skin-resident DCs between seronegative and HIV-1+ individuals. Antigen injection resulted in substantial infiltration of DCs compared to the frequencies found in donor-matched control skin. In HIV-1+ individuals, CD123+/CD303+ plasmacytoid DCs and CD11c+ myeloid DCs, including the CD141+ cross-presenting subset, were recruited at lower levels compared to healthy controls in response to PPD and mumps but not C. albicans. The level of DC recruitment correlated with the frequencies of T cells infiltrating the respective antigen sites. Ki67+ cycling T cells at the injection sites were much more frequent in response to each of the antigens in the HIV-1+ individuals, including those with AIDS, compared to healthy controls.
Multiple DC subsets infiltrate the dermis in response to antigen exposure. There was no obvious depletion or deficiency in mobilization of DCs in response to antigen skin tests during chronic HIV-1 infection. Instead, the levels of antigen-specific memory T cells that accumulate at the antigen site may determine the level of DC infiltration.
HIV-1; dendritic cells; plasmacytoid; skin; skin test; delayed-type hypersensitivity reaction; Ki67
Antigen-specific release of IP-10 is the most promising alternative marker to IFN-γ for infection with M. tuberculosis. Compared to interferon-γ release assays (IGRA), IP-10 is released in high levels, enabling novel approaches such as field-friendly dried blood spots (DBS) and molecular detection.
To develop a robust IP-10 based molecular assay for the diagnosis of infection with M. tuberculosis from whole blood and DBS.
We developed a one-step probe-based multiplex RT-qPCR assay for detecting IP-10 and IFN-γ mRNA expression from whole blood and DBS samples. The assay was validated and applied for the diagnosis of M. tuberculosis infection in DBS samples from 43 patients with confirmed TB, 13 patients with latent TB and 96 presumed uninfected controls. In parallel, IP-10 and IFN-γ levels were measured in Quantiferon (QFT-TB) plasma supernatants.
IP-10 mRNA upregulation was detectable at 4 hours after stimulation (6 fold upregulation) peaking at 8 hours (108 fold upregulation). IFN-γ expression occurred in concert but levels were lower (peak 6.7 fold upregulation). IP-10 gene expression level was significantly higher in patients with tuberculosis (median 31.2, IQR 10.7–67.0) and persons with latent tuberculosis infection (LTBI) (41.2, IQR 9.8–64.9) compared to healthy controls (1.6, IQR 1.1–2.4; p<0.0001). The IP-10 mRNA and protein based tests had comparable diagnostic accuracy to QFT-TB, sensitivity (85% and 88% vs 85%) and specificity (96% and 96% vs 97%, p = ns.).
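The accuracy figures above follow the usual definitions of sensitivity and specificity. A minimal sketch of that arithmetic is given here; the function name and the counts in the usage lines are hypothetical and chosen only to illustrate the calculation, since the abstract does not report the underlying two-by-two table.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    tp/fn: infected persons with a positive/negative test result.
    tn/fp: uninfected persons with a negative/positive test result.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one infected and one uninfected person")
    sensitivity = tp / (tp + fn)  # share of infected persons detected
    specificity = tn / (tn + fp)  # share of uninfected persons correctly negative
    return sensitivity, specificity


# Hypothetical counts, NOT taken from the study.
sens, spec = sensitivity_specificity(tp=17, fn=3, tn=48, fp=2)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```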
We developed a rapid, robust and accurate molecular immunodiagnostic test for M. tuberculosis infection. By combining DBS-based sample acquisition, mail or courier-based sample transport with centralized molecular detection, this immunodiagnostic test concept can reduce the local technological requirements everywhere and make it possible to offer highly accurate immunodiagnostic tests in low resource settings.
The identification of pathogens directly from blood cultures by matrix-assisted laser desorption ionization–time of flight mass spectrometry (MALDI-TOF MS) can be a valuable tool for improving the treatment of patients with sepsis and bacteremia. However, the increasing incidence of multidrug-resistant Gram-negative bacteria makes it difficult to predict resistance patterns based only on pathogen identification. Most therapy regimens for sepsis caused by Gram-negative rods consist of at least one β-lactam antibiotic. Thus, it would be of great benefit to have an early marker of resistance against these drugs. In the current study, we tested 100 consecutive blood cultures containing Enterobacteriaceae for resistance against 3rd-generation cephalosporins in a MALDI-TOF MS β-lactamase assay. Escherichia coli was also tested for resistance against aminopenicillins. The results of the β-lactamase assay were compared with those of conventional methods. The assay permitted discrimination between E. coli strains that were resistant or susceptible to aminopenicillins with a sensitivity and a specificity of 100%. The same was true for resistance to 3rd-generation cephalosporins in Enterobacteriaceae that constitutively produced class C β-lactamases. Discrimination was more difficult in species expressing class A β-lactamases, as these enzymes can generate false-positive results. Thus, the sensitivity and specificity for this group were 100% and 91.5%, respectively. The test permitted the prediction of resistance within 2.5 h after the blood culture was flagged as positive.
Sputum smear microscopy is widely used for tuberculosis diagnosis and treatment monitoring. We evaluated the correlation between smear microscopy and time to liquid culture positivity during early tuberculosis treatment. The study included patients with smear-positive pulmonary tuberculosis hospitalized at a tuberculosis reference centre in Germany between 01/2012 and 05/2013. Patient records were reviewed and clinical, radiological and microbiological data were analysed. Sputum samples were collected before treatment initiation and weekly thereafter. A total of 310 sputum samples from 30 patients were analysed. Time to liquid culture positivity inversely correlated with smear grade (Spearman's rho −0.439, p<0.001). There was a better correlation within the first two months vs. after two months of therapy (−0.519 vs. −0.416) with a trend to a more rapid increase in time to positivity between baseline and week 2 in patients who culture-converted within the first two months (5.9 days vs. 9.4 days, p = 0.3). In conclusion, the numbers of acid-fast bacilli in sputum smears of patients with pulmonary tuberculosis and time to culture positivity for M. tuberculosis cultures from sputum are correlated before and during tuberculosis treatment. A considerable proportion of patients with culture conversion after two months of therapy continued to have detectable acid-fast bacilli on sputum smears.
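The rank correlation reported above can be illustrated with a short sketch of Spearman's rho, computed as the Pearson correlation of tie-averaged ranks. The helper names and the six data points are hypothetical and are not study data; the sketch also assumes neither variable is constant, since the correlation is undefined in that case.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation applied to the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical smear grades and days to culture positivity (illustration only).
smear_grade = [0, 1, 1, 2, 3, 3]
days_to_positivity = [21, 14, 16, 10, 6, 5]
print(spearman_rho(smear_grade, days_to_positivity))
```

A negative value indicates that higher smear grades go with shorter times to positivity, which is the direction of the association described in the abstract.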
Reversibility of airway obstruction in response to β2-agonists is highly variable among asthmatics, which is partially attributed to genetic factors. In a genome-wide association study of acute bronchodilator response (BDR) to inhaled albuterol, 534,290 single nucleotide polymorphisms (SNPs) were tested in 403 white trios from the Childhood Asthma Management Program using five statistical models to determine the most robust genetic associations. The primary replication phase included 1397 polymorphisms in three asthma trials (pooled n=764). The second replication phase tested 13 SNPs in three additional asthma populations (n=241, n=215, and n=592). An intergenic SNP on chromosome 10, rs11252394, proximal to several excellent biological candidates, significantly replicated (p=1.98×10−7) in the primary replication trials. An intronic SNP (rs6988229) in the collagen (COL22A1) locus also provided strong replication signals (p=8.51×10−6). This study applied a robust approach for testing the genetic basis of BDR and identified novel loci associated with this drug response in asthmatics.
pharmacogenetics; asthma; bronchodilator response; genome-wide association study; albuterol
Even in large-scale genome-wide association studies, only a fraction of the true associations are detected at the genome-wide significance level. When few or no associations reach the significance threshold, one strategy is to follow up on the most promising candidates, i.e. the single nucleotide polymorphisms (SNPs) with the smallest association-test p-values, by genotyping them in additional studies. In this communication, we propose an overall test for genome-wide association studies that analyzes the SNPs with the most promising p-values simultaneously and thereby allows an early assessment of whether the follow-up of the selected SNPs is likely promising. We theoretically derive the properties of the proposed overall test under the null hypothesis and assess its power based on simulation studies. An application to a GWAS for chronic obstructive pulmonary disease suggests that there are true association signals among the top SNPs and that an additional follow-up study is promising.
genome wide association studies; snps association tests; chronic obstructive pulmonary disease; statistical genetics; multiple testing
Motivated by the challenges associated with accounting for the ascertainment when analyzing secondary phenotypes that are correlated with case-control status, Lin and Zeng have proposed a method that properly reflects the case-control sampling (Lin and Zeng, 2009). The Lin and Zeng method has the advantage of accurately estimating effect sizes for secondary phenotypes that are normally distributed or dichotomous. This method can be computationally intensive in practice under the null hypothesis when the likelihood surface that needs to be maximized can be relatively flat. We propose an extension of the Lin and Zeng method for hypothesis testing that uses proportional odds logistic regression to circumvent these computational issues. Through simulation studies, we compare the power and type-1 error rate of our method to standard approaches and Lin and Zeng's approach.
secondary phenotype; case-control study; ascertainment; genetic association; proportional odds logistic regression
In the present study, an integrated hierarchical approach was applied to: (1) identify pathways associated with susceptibility to schizophrenia; (2) detect genes that may be potentially affected in these pathways since they contain an associated polymorphism; and (3) annotate the functional consequences of such single-nucleotide polymorphisms (SNPs) in the affected genes or their regulatory regions. The Global Test was applied to detect schizophrenia-associated pathways using discovery and replication datasets comprising 5,040 and 5,082 individuals of European ancestry, respectively. Information concerning functional gene-sets was retrieved from the Kyoto Encyclopedia of Genes and Genomes, Gene Ontology, and the Molecular Signatures Database. Fourteen of the gene-sets or pathways identified in the discovery dataset were confirmed in the replication dataset. These include functional processes involved in transcriptional regulation and gene expression, synapse organization, cell adhesion, and apoptosis. For two genes, i.e. CTCF and CACNB2, evidence for association with schizophrenia was available (at the gene-level) in both the discovery study and published data from the Psychiatric Genomics Consortium schizophrenia study. Furthermore, these genes mapped to four of the 14 presently identified pathways. Several of the SNPs assigned to CTCF and CACNB2 have potential functional consequences, and a gene in close proximity to CACNB2, i.e. ARL5B, was identified as a potential gene of interest. Application of the present hierarchical approach thus allowed: (1) identification of novel biological gene-sets or pathways with potential involvement in the etiology of schizophrenia, as well as replication of these findings in an independent cohort; (2) detection of genes of interest for future follow-up studies; and (3) the highlighting of novel genes in previously reported candidate regions for schizophrenia.
Large-scale genetic studies of complex diseases such as schizophrenia have identified a variety of susceptibility loci. Since many of the respective variants have only a weak influence on disease risk, pathophysiological interpretation of the results is problematic. Investigation of the joint effects of multiple functionally related genes or pathways increases the power to detect disease related genes, and provides insights into the etiology of the disease in question. In the present study, an integrated hierarchical approach was applied to: (i) identify pathways associated with the complex neuropsychiatric disease schizophrenia; (ii) detect potentially affected genes in these pathways; and (iii) annotate the functional consequences of genetic markers in the affected genes or their regulatory regions. Two samples comprising >10,000 individuals of European ancestry as well as data from the Psychiatric Genomics Consortium schizophrenia study were examined. Pathways representing transcriptional regulation and gene expression, cell adhesion, apoptosis, and synapse organization showed significant association with schizophrenia. In particular, CTCF, CACNB2, and ARL5B, i.e. genes involved in chromatin modulation, calcium channel signaling and membrane transport, respectively, were highlighted as candidate genes for schizophrenia risk.
Maternal fish intake during pregnancy may influence risk of child asthma and allergic rhinitis, yet evidence is conflicting on its association with these outcomes.
We examined associations of maternal fish intake during pregnancy with child asthma and allergic rhinitis. Mothers in the Danish National Birth Cohort (N=28,936) reported their fish intake at 12 and 30 weeks of gestation. Using multivariate logistic regression, we examined associations of fish intake with child wheeze, asthma, and rhinitis assessed at several time points: ever wheeze, recurrent wheeze (>3 episodes), ever asthma and allergic rhinitis, and current asthma, assessed at 18 months (N~22,000) and 7 years (N~17,000) using self-report and registry data on hospitalizations and prescribed medications.
Compared to consistently high fish intake during pregnancy (fish as a sandwich or hot meal ≥2–3 times/week), never eating fish was associated with higher risk of child asthma diagnosis at 18 months (1.30, 95% CI: 1.05, 1.63, P=0.02), and ever asthma by hospitalization (1.46, 95% CI: 0.99, 2.13, P=0.05) and medication prescription (1.37, 95% CI: 1.10, 1.71, P=0.01). A dose-response was present for asthma at 18 months only (P for trend: 0.001). We found no associations with wheeze or recurrent wheeze at 18 months or with allergic rhinitis.
Our results suggest that high (vs. no) maternal fish intake during pregnancy is protective against both early and ever asthma in 7-year-old children.
fish; cohort study; asthma; allergic rhinitis
Against the background of increasing numbers of resistant microorganisms, the fast and cost-efficient detection of microbial resistance is an important clinical requirement for optimal therapeutic intervention. Current routine assays take at least 5 h, but in most cases an overnight incubation is necessary to identify resistant isolates. The usage of matrix-assisted laser desorption ionization–time of flight mass spectrometry (MALDI-TOF MS) profiling in combination with growth media containing isotopically labeled amino acids facilitates the detection of resistant microorganisms after 3 h or less directly from the profile spectrum. Growing microorganisms incorporate isotopically labeled amino acids, increasing protein masses and thereby leading to mass shifts of their corresponding peaks in the profile spectra. In the presence of antibiotics, only resistant microorganisms are able to grow and to incorporate the labeled amino acids. This leads to a difference in the mass spectra of susceptible and resistant isolates, allowing their differentiation. In the presented study, we demonstrated the applicability of this novel approach for the detection of methicillin-resistant Staphylococcus aureus and tested different bioinformatics approaches for automated data interpretation.
Cigarette smoking is the major environmental risk factor for chronic obstructive pulmonary disease (COPD). Genome-wide association studies have provided compelling associations for three loci with COPD. In this study, we aimed to estimate direct, i.e., independent from smoking, and indirect effects of those loci on COPD development using mediation analysis. We included a total of 3,424 COPD cases and 1,872 unaffected controls with data on two smoking-related phenotypes: lifetime average smoking intensity and cumulative exposure to tobacco smoke (pack years). Our analysis revealed that effects of two linked variants (rs1051730 and rs8034191) in the AGPHD1/CHRNA3 cluster on COPD development are significantly, yet not entirely, mediated by the smoking-related phenotypes. Approximately 30 % of the total effect of variants in the AGPHD1/CHRNA3 cluster on COPD development was mediated by pack years. Simultaneous analysis of modestly (r2 = 0.21) linked markers in CHRNA3 and IREB2 revealed that an even larger (~42 %) proportion of the total effect of the CHRNA3 locus on COPD was mediated by pack years after adjustment for an IREB2 single nucleotide polymorphism. This study confirms the existence of direct effects of the AGPHD1/CHRNA3, IREB2, FAM13A and HHIP loci on COPD development. While the association of the AGPHD1/CHRNA3 locus with COPD is significantly mediated by smoking-related phenotypes, IREB2 appears to affect COPD independently of smoking.
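The "proportion mediated" quoted above can be read as the share of a total effect that passes through the mediator. The sketch below uses the simple difference definition, (total − direct) / total, on a common effect scale; this is an assumption made for illustration, since the abstract does not specify the estimator, and the effect sizes in the usage lines are hypothetical.

```python
def proportion_mediated(total_effect, direct_effect):
    """Share of the total effect that is indirect (mediated).

    Both effects must be on the same scale (for example, log odds ratios).
    The indirect effect is taken as total minus direct, which is an
    approximation for non-linear models such as logistic regression.
    """
    if total_effect == 0:
        raise ValueError("total effect must be non-zero")
    return (total_effect - direct_effect) / total_effect


# Hypothetical log odds ratios, NOT estimates from the study.
total = 0.30   # variant -> disease, not adjusted for the mediator
direct = 0.21  # variant -> disease, adjusted for pack years
print(f"{proportion_mediated(total, direct):.0%} of the effect is mediated")
```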
For the analysis of rare-variant data in population-based designs, we propose a method to detect study subjects that may create population substructure in the study sample. Our approach is computationally fast and simple, permitting applications to whole-genome sequencing studies. The method does not require the variants to be in linkage equilibrium and can be applied to all the genetic loci that are available in the study. For both rare and common variants, we assess the performance of our approach by its application to the 1000 Genomes Project data, and in simulation studies. The results are compared to the commonly used outlier detection algorithm based on principal component analysis (PCA). The statistical power of both approaches to detect outliers is comparable in most of the scenarios, but the power of PCA to detect outliers is lower than that of the novel approach in the presence of linkage disequilibrium and for subpopulations that are genetically similar. The data analysis and the simulation studies suggest that the number of false-positive results differs between the two approaches. Our approach maintains the type I error rate while the outlier detection approach based on PCA does not. Taking additionally into account the minimal computational requirements of our approach and the ability to incorporate all the marker information, the proposed method will have important application in sequencing studies and genome-wide association studies.
population substructure; outlier detection; GWAS; sequence data
Increased susceptibility to tuberculosis following HIV-1 seroconversion contributes significantly to the tuberculosis epidemic in sub-Saharan Africa. Lung specific mechanisms underlying the interaction between HIV-1 and Mycobacterium (M.) tuberculosis infection are incompletely understood. This study addressed the effect of HIV-1 and latent M. tuberculosis infection on viral-entry receptors and ligands in bronchoalveolar lavage (BAL). Median fluorescence intensity (MFI) of entry receptor expression was measured by multiparameter flow cytometry and chemokine expression by multiplex bead array.
Irrespective of HIV-1 status, BAL T-cells expressed higher MFI for the beta-chemokine receptor (CCR)5 than peripheral blood T-cells (p<0.001); in particular, the CD8+ T-cells of HIV-1 infected persons showed elevated CCR5 expression (p=0.026). The concentrations of the BAL CCR5 ligands regulated upon activation normal T-cell expressed and secreted (RANTES; p<0.001) and macrophage inflammatory protein (MIP)-1β (p=0.004) were elevated in the BAL of HIV-1 infected persons compared to controls. CCR5 expression and RANTES concentration correlated strongly with HIV-1 viral load in BAL. By contrast, these alterations were not associated with M. tuberculosis sensitization in vivo, nor did M. tuberculosis infection of BAL cells ex vivo change RANTES expression.
These data suggest ongoing HIV-1 replication predominantly drives local pulmonary CCR5+ T-cell activation in HIV/latent M. tuberculosis co-infection.
BAL; CCR5; RANTES; TB; viral load
In genome-wide association studies (GWAS), family-based studies tend to have less power to detect genetic associations than population-based studies, such as case-control studies. This can be an issue when testing if genes in a family-based GWAS have a direct effect on the phenotype of interest over and above their possible indirect effect through a secondary phenotype. When multiple SNPs are tested for a direct effect in the family-based study, a screening step can be used to minimize the burden of multiple comparisons in the causal analysis. We propose a 2-stage screening step that can be incorporated into the family-based association test (FBAT) approach similar to the conditional mean model approach in the Van Steen-algorithm (Van Steen et al., 2005). Simulations demonstrate that the type 1 error is preserved and this method is advantageous when multiple markers are tested. This method is illustrated by an application to the Framingham Heart Study.
family-based association analysis; causal inference; genetic pathway; mediation; pleiotropy
The advent of genome-wide association studies has led to many novel disease-SNP associations, opening the door to focused study on their biological underpinnings. Because of the importance of analyzing these associations, numerous statistical methods have been devoted to them. However, fewer methods have attempted to associate entire genes or genomic regions with outcomes, which is potentially more useful knowledge from a biological perspective and those methods currently implemented are often permutation-based.
One property of some permutation-based tests is that their power varies as a function of whether significant markers are in regions of linkage disequilibrium (LD) or not, which we show from a theoretical perspective. We therefore develop two methods for quantifying the degree of association between a genomic region and outcome, both of whose power does not vary as a function of LD structure. One method uses dimension reduction to “filter” redundant information when significant LD exists in the region, while the other, called the summary-statistic test, controls for LD by scaling marker Z-statistics using knowledge of the correlation matrix of markers. An advantage of this latter test is that it does not require the original data, but only their Z-statistics from univariate regressions and an estimate of the correlation structure of markers, and we show how to modify the test to protect the type 1 error rate when the correlation structure of markers is misspecified. We apply these methods to sequence data of oral cleft and compare our results to previously proposed gene tests, in particular permutation-based ones. We evaluate the versatility of the modification of the summary-statistic test since the specification of correlation structure between markers can be inaccurate.
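The idea of scaling marker Z-statistics by their correlation matrix can be sketched with a generic quadratic-form statistic, Q = zᵀR⁻¹z, which is chi-square distributed with k degrees of freedom under the null when the k Z-statistics are jointly normal with correlation matrix R. This is a standard construction shown for illustration only; it is not claimed to be the authors' summary-statistic test or its modification for a misspecified R. All function names and the numbers in the usage lines are hypothetical, and the p-value helper covers even degrees of freedom only to avoid external libraries.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("correlation matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        rest = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - rest) / m[r][r]
    return x


def ld_adjusted_statistic(z, corr):
    """Quadratic form z' R^{-1} z for Z-statistics z and correlation matrix R."""
    return sum(zi * xi for zi, xi in zip(z, solve_linear(corr, z)))


def chi2_sf_even_df(q, df):
    """Upper-tail chi-square probability; closed form valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this helper supports positive even df only")
    half, term, total = q / 2, 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


# Hypothetical Z-statistics for two markers in strong LD (r = 0.8).
z = [2.0, 2.0]
q_ld = ld_adjusted_statistic(z, [[1.0, 0.8], [0.8, 1.0]])
q_indep = ld_adjusted_statistic(z, [[1.0, 0.0], [0.0, 1.0]])
print(q_ld, chi2_sf_even_df(q_ld, 2))
print(q_indep, chi2_sf_even_df(q_indep, 2))
```

In this toy case the two correlated signals count for less than two independent ones, which is the sense in which the statistic accounts for LD.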
We find a significant association in the sequence data between the 8q24 region and oral cleft using our dimension reduction approach and a borderline significant association using the summary-statistic based approach. We also implement the summary-statistic test using Z-statistics from an already-published GWAS of Chronic Obstructive Pulmonary Disease (COPD) and correlation structure obtained from HapMap. We experiment with the modification of this test because the correlation structure is assumed imperfectly known.
Dimension reduction; Eigenvector; Gene-based testing; Permutation tests
Population stratification leads to a predictable phenomenon—a reduction in the number of heterozygotes compared to that calculated assuming Hardy-Weinberg Equilibrium (HWE). We show that population stratification results in another phenomenon—an excess in the proportion of spouse-pairs with the same genotypes at all ancestrally informative markers, resulting in ancestrally related positive assortative mating. We use principal components analysis to show that there is evidence of population stratification within the Framingham Heart Study, and show that the first principal component correlates with a North-South European cline. We then show that the first principal component is highly correlated between spouses (r=0.58, p=0.0013), demonstrating that there is ancestrally related positive assortative mating among the Framingham Caucasian population. We also show that the single nucleotide polymorphisms loading most heavily on the first principal component show an excess of homozygotes within the spouses, consistent with similar ancestry-related assortative mating in the previous generation. This nonrandom mating likely affects genetic structure seen more generally in the North American population of European descent today, and decreases the rate of decay of linkage disequilibrium for ancestrally informative markers.
population stratification; non-random mating; Hardy-Weinberg equilibrium
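The heterozygote deficit caused by stratification can be checked with a short calculation. The sketch below assumes a pooled sample made of subpopulations that are each in Hardy-Weinberg equilibrium but differ in allele frequency; the function name and the example frequencies are hypothetical and are not taken from the Framingham data.

```python
def heterozygote_deficit(freqs, weights=None):
    """Expected vs. actual heterozygosity in a pooled, stratified sample.

    freqs   -- allele frequency in each subpopulation (each in HWE)
    weights -- mixing proportions; equal if omitted
    """
    if weights is None:
        weights = [1.0 / len(freqs)] * len(freqs)
    p_bar = sum(w * p for w, p in zip(weights, freqs))
    # Heterozygosity predicted by HWE from the pooled allele frequency.
    expected = 2.0 * p_bar * (1.0 - p_bar)
    # Heterozygosity actually present: the mixture of within-group values.
    observed = sum(w * 2.0 * p * (1.0 - p) for w, p in zip(weights, freqs))
    return expected, observed, expected - observed


# Hypothetical marker with frequencies 0.2 and 0.8 in two equal-sized groups.
expected, observed, deficit = heterozygote_deficit([0.2, 0.8])
print(expected, observed, deficit)
```

The deficit equals twice the variance of the allele frequency across subpopulations, so it vanishes only when all groups share the same frequency.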
We have conducted the first meta-analyses for nonsyndromic cleft lip with or without cleft palate (NSCL/P) using data from the two largest genome-wide association studies published to date. We confirmed associations with all previously identified loci and identified six additional susceptibility regions (1p36, 2p21, 3p11.1, 8q21.3, 13q31.1 and 15q22). Analysis of phenotypic variability identified the first specific genetic risk factor for NSCLP (nonsyndromic cleft lip plus palate) (rs8001641; P_NSCLP = 6.51 × 10^−11; homozygote relative risk = 2.41, 95% confidence interval (CI) 1.84–3.16).
In infection experiments with genetically distinct Mycobacterium tuberculosis complex (MTBC) strains, we identified clade-specific virulence patterns in human primary macrophages and in mice infected by the aerosol route, both reflecting relevant model systems. Exclusively human-adapted M. tuberculosis lineages, also termed clade I, comprising “modern” lineages, such as Beijing and Euro-American Haarlem strains, showed a significantly enhanced capability to grow compared to that of clade II strains, which include “ancient” lineages, such as East African Indian (EAI) or M. africanum strains. However, a simple correlation of inflammatory response profiles with strain virulence was not apparent. Overall, our data reveal three different pathogenic profiles: (i) strains of the Beijing lineage are characterized by low uptake, low cytokine induction, and a high replicative potential, (ii) strains of the Haarlem lineage by high uptake, high cytokine induction, and high growth rates, and (iii) EAI strains by low uptake, low cytokine induction, and a low replicative potential. Our findings have significant implications for our understanding of host-pathogen interaction and factors that modulate the outcomes of infections. Future studies addressing the underlying mechanisms and clinical implications need to take into account the diversity of both the pathogen and the host.
Clinical strains of the Mycobacterium tuberculosis complex (MTBC) are genetically more diverse than previously anticipated. Our analysis of mycobacterial growth characteristics in primary human macrophages and aerogenically infected mice shows that the MTBC genetic differences translate into pathogenic differences in the interaction with the host. Our study reveals for the first time that “TB is not TB,” if put in plain terms. We are convinced that it is very unlikely that a single molecular mechanism may explain the observed effects. Our study refutes the hypothesis that there is a simple correlation between cytokine induction as a single functional parameter of host interaction and mycobacterial virulence. Instead, careful consideration of strain- and lineage-specific characteristics must guide our attempts to decipher what determines the pathological potential and thus the outcomes of infection with MTBC, one of the most important human pathogens.
Rationale: Variability in pulmonary disease severity is found in patients with cystic fibrosis (CF) who have identical mutations in the CF transmembrane conductance regulator (CFTR) gene. We hypothesized that one factor accounting for heterogeneity in pulmonary disease severity is variation in the family of genes affecting the biology of interleukin-1 (IL-1), which impacts acquisition and maintenance of Pseudomonas aeruginosa infection in animal models of chronic infection. Methods: We genotyped 58 single nucleotide polymorphisms (SNPs) in the IL-1 gene cluster in 808 CF subjects from the University of North Carolina and Case Western Reserve University (UNC/CWRU) joint cohort. All were homozygous for ΔF508, and categories of “severe” (cases) or “mild” (control subjects) lung disease were defined by the lowest or highest quartile of forced expired volume (FEV1) for age in the CF population. After adjustment for age and gender, genotypic data were tested for association with lung disease severity. Odds ratios (ORs) comparing severe versus mild CF were also calculated for each genotype (with the homozygote major allele as the reference group) for all 58 SNPs. From these analyses, nine SNPs with a moderate effect size, OR ≤ 0.5 or > 1.5, were selected for further testing. To replicate the case-control study results, we genotyped the same nine SNPs in a second population of CF parent-offspring trios (recruited from Children’s Hospital Boston), in which the offspring had similar pulmonary phenotypes. For the trio analysis, both family-based and population-based associations were performed. Results: SNPs rs1143634 and rs1143639 in the IL1B gene demonstrated a consistent association with lung disease severity categories (P < 0.10) and longitudinal analysis of lung disease severity (P < 0.10) in CF in both the case-control and family-based studies.
In females, there was a consistent association (false discovery rate adjusted joint P-value < 0.06 for both SNPs) in both the analysis of lung disease severity in the UNC/CWRU cohort and the family-based analysis of affection status. Conclusion: Our findings suggest that IL1β is a clinically relevant modulator of CF lung disease.
gene modifiers; cystic fibrosis; CFTR; IL-1 gene family
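The genotype-level odds ratio and the effect-size filter (OR ≤ 0.5 or > 1.5) described in the Methods can be sketched as follows. This is a crude, unadjusted calculation for illustration only: the study adjusted for age and gender, which this sketch does not attempt, and the function names and genotype counts are hypothetical.

```python
def odds_ratio(severe_geno, severe_ref, mild_geno, mild_ref):
    """Unadjusted odds ratio of severe vs. mild disease for one genotype,
    relative to the reference genotype (homozygous major allele)."""
    return (severe_geno * mild_ref) / (severe_ref * mild_geno)


def passes_effect_filter(or_value):
    """Selection rule quoted in the abstract: OR <= 0.5 or OR > 1.5."""
    return or_value <= 0.5 or or_value > 1.5


# Made-up counts: 30 severe and 20 mild subjects carry the genotype,
# 10 severe and 20 mild subjects carry the reference genotype.
or_example = odds_ratio(30, 10, 20, 20)
print(or_example, passes_effect_filter(or_example))
```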
IL10 is an anti-inflammatory cytokine that has been found to have lower production in macrophages and mononuclear cells from asthmatics. Since reduced IL10 levels may influence the severity of asthma phenotypes, we examined IL10 single-nucleotide polymorphisms (SNPs) for association with asthma severity and allergy phenotypes as quantitative traits. Utilizing DNA samples from 518 Caucasian asthmatic children from the Childhood Asthma Management Program (CAMP) and their parents, we genotyped six IL10 SNPs: 3 in the promoter, 2 in introns, and one in the 3′ UTR. Using family-based association tests, each SNP was tested for association with asthma and allergy phenotypes individually. Population-based association analysis was performed with each SNP locus, the promoter haplotypes and the 6-loci haplotypes. The 3′ UTR SNP was significantly associated with FEV1 as a percent of predicted (FEV1PP) (P=0.0002) in both the family and population analyses. The promoter haplotype GCC was positively associated with IgE levels and FEV1PP (P=0.007 and 0.012, respectively). The promoter haplotype ATA was negatively associated with lnPC20 and FEV1PP (P=0.008 and 0.043, respectively). Polymorphisms in IL10 are associated with asthma phenotypes in this cohort. Further studies of variation in the IL10 gene may help elucidate the mechanism of asthma development in children.
interleukin 10 (IL10); single nucleotide polymorphism (SNP); genetic association; family-based association test (FBAT); haplotype; promoter; 3′; untranslated region (3′UTR)
Many Genome-Wide Association Studies (GWAS) have signals with unknown etiology. This paper addresses the question — is such an association signal caused by rare or common variants that lead to increased disease risk? For a genomic region implicated by a GWAS, we use Single Nucleotide Polymorphism (SNP) data in a case-control setting to predict how many common or rare variants there are, using a Bayesian analysis. Our objective is to compute posterior probabilities for configurations of rare and/or common variants. We use an extension of coalescent trees — the Ancestral Recombination Graphs (ARG) — to model the genealogical history of the samples based on marker data. As we expect SNPs to be in Linkage Disequilibrium (LD) with common disease variants, we can expect the trees to reflect on the type of variants. To demonstrate the application, we apply our method to candidate gene sequencing data from a German case-control study on nonsyndromic cleft lip with or without cleft palate (NSCL/P).
Coalescent Tree; Genetic Association; Rare Variant; Common Variant; Ancestral Recombination Graphs; Bayesian Modeling