Schizophrenia is a highly heritable disorder. Genetic risk is conferred by a large number of alleles, including common alleles of small effect that might be detected by genome-wide association studies. Here, we report a multi-stage schizophrenia genome-wide association study of up to 36,989 cases and 113,075 controls. We identify 128 independent associations spanning 108 conservatively defined loci that meet genome-wide significance, 83 of which have not been previously reported. Associations were enriched among genes expressed in brain providing biological plausibility for the findings. Many findings have the potential to provide entirely novel insights into aetiology, but associations at DRD2 and multiple genes involved in glutamatergic neurotransmission highlight molecules of known and potential therapeutic relevance to schizophrenia, and are consistent with leading pathophysiological hypotheses. Independent of genes expressed in brain, associations were enriched among genes expressed in tissues that play important roles in immunity, providing support for the hypothesized link between the immune system and schizophrenia.
Large case/control genome-wide association studies (GWAS) often include groups of related individuals with known relationships. When testing for associations at a given locus, current methods incorporate only the familial relationships between individuals. Here, we introduce the chromosome-based Quasi Likelihood Score (cQLS) statistic that incorporates local Identity-By-Descent (IBD) to increase the power to detect associations. In studies robust to population stratification, such as those with case/control sibling pairs, simulations show that the study power can be increased by over 50%. In our example, a GWAS examining late-onset Alzheimer's disease, the p-values among the most strongly associated SNPs in the APOE gene tend to decrease, with the smallest p-value decreasing from 1.23 × 10⁻⁸ to 7.70 × 10⁻⁹. Furthermore, as a part of our simulations, we reevaluate our expectations about the use of families in GWAS. We show that, although adding only half as many unique chromosomes, genotyping affected siblings is more efficient than genotyping randomly ascertained cases. We also show that genotyping cases with a family history of disease will be less beneficial when searching for SNPs with smaller effect sizes.
cQLS; GWAS; related individuals; case-control
We tested the hypothesis that an altered community of gut microbes is associated with risk of colorectal cancer (CRC) in a study of 47 CRC case subjects and 94 control subjects. 16S rRNA genes in fecal bacterial DNA were amplified by universal primers, sequenced by 454 FLX technology, and aligned for taxonomic classification to microbial genomes using the QIIME pipeline. Taxonomic differences were confirmed with quantitative polymerase chain reaction and adjusted for false discovery rate. All statistical tests were two-sided. From 794217 16S rRNA gene sequences, we found that CRC case subjects had decreased overall microbial community diversity (P = .02). In taxonomy-based analyses, lower relative abundance of Clostridia (68.6% vs 77.8%) and increased carriage of Fusobacterium (multivariable odds ratio [OR] = 4.11; 95% confidence interval [CI] = 1.62 to 10.47) and Porphyromonas (OR = 5.17; 95% CI = 1.75 to 15.25) were found in case subjects compared with control subjects. Because of the potentially modifiable nature of the gut bacteria, our findings may have implications for CRC prevention.
Schizophrenia genome-wide association studies (GWAS) have identified common SNPs, rare copy number variants (CNVs) and a large polygenic contribution to illness risk, but biological mechanisms remain unclear. Bioinformatic analyses of significantly associated genetic variants point to a large role for regulatory variants. To identify gene expression abnormalities in schizophrenia, we generated whole-genome gene expression profiles using microarrays on lymphoblastoid cell lines (LCLs) from 413 cases and 446 controls. Regression analysis identified 95 transcripts differentially expressed by affection status at a genome-wide false discovery rate (FDR) of 0.05, while simultaneously controlling for confounding effects. These transcripts represented 89 genes with functions such as neurotransmission, gene regulation, cell cycle progression, differentiation, apoptosis, microRNA (miRNA) processing and immunity. This functional diversity is consistent with schizophrenia's likely significant pathophysiological heterogeneity. The overall enrichment of immune-related genes among those differentially expressed by affection status is consistent with hypothesized immune contributions to schizophrenia risk. The observed differential expression of extended major histocompatibility complex (xMHC) region histones (HIST1H2BD, HIST1H2BC, HIST1H2BH, HIST1H2BG and HIST1H4K) converges with the genetic evidence from GWAS, which find the xMHC to be the most significant susceptibility locus. Among the differentially expressed immune-related genes, B3GNT2 is implicated in autoimmune disorders previously tied to schizophrenia risk (rheumatoid arthritis and Graves’ disease), and DICER1 is pivotal in miRNA processing potentially linking to miRNA alterations in schizophrenia (e.g. MIR137, the second strongest GWAS finding). Our analysis provides novel candidate genes for further study to assess their potential contribution to schizophrenia.
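As an illustration of how an FDR threshold such as the 0.05 used above can be applied across many transcripts, the following is a minimal sketch of the Benjamini-Hochberg step-up procedure. The abstract does not state which FDR procedure was used, so the choice of procedure, the function name `benjamini_hochberg`, and the example p-values are all assumptions made for illustration.

```python
def benjamini_hochberg(pvalues, q=0.05):
    """Return a list of booleans: True where the hypothesis is rejected at FDR q."""
    m = len(pvalues)
    # Indices of the p-values, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest rank k (1-based) with p_(k) <= (k / m) * q.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * q:
            cutoff_rank = rank
    # Reject every hypothesis whose rank is at or below that cutoff.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected


# Hypothetical p-values for ten transcripts (not taken from the study).
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205, 0.212, 0.216]
print(benjamini_hochberg(example_p))
```

With these hypothetical inputs only the two smallest p-values pass, even though five are below 0.05, which shows how the procedure is stricter than an unadjusted threshold.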
Although CDKN2A is the most frequent high-risk melanoma susceptibility gene, the underlying genetic factors for most melanoma-prone families remain unknown. Using whole exome sequencing, we identified a rare variant that arose as a founder mutation in the telomere shelterin POT1 gene (g.7:124493086 C>T, Ser270Asn) in five unrelated melanoma-prone families from Romagna, Italy. Carriers of this variant had increased telomere length and elevated fragile telomeres suggesting that this variant perturbs telomere maintenance. Two additional rare POT1 variants were identified in all cases sequenced in two other Italian families, yielding a frequency of POT1 variants comparable to that of CDKN2A mutations in this population. These variants were not found in public databases or in 2,038 genotyped Italian controls. We also identified two rare recurrent POT1 variants in American and French familial melanoma cases. Our findings suggest that POT1 is a major susceptibility gene for familial melanoma in several populations.
Bacteria affect oral health, but few studies have systematically examined the role of bacterial communities in oral diseases. We examined this relationship in a large population-based Chinese cancer screening cohort.
Human Oral Microbe Identification Microarrays were used to test for the presence of 272 human oral bacterial species (97 genera) in upper digestive tract (UDT) samples collected from 659 participants. Oral health was assessed using US NHANES (National Health and Nutrition Examination Survey) protocols. We assessed both dental health (total teeth missing; tooth decay; and the decayed, missing, and filled teeth (DMFT) score) and periodontal health (bleeding on probing (BoP) extent score, loss of attachment extent score, and a periodontitis summary estimate).
Microbial richness, estimated by number of genera per sample, was positively correlated with BoP score (P = 0.015), but negatively correlated with tooth decay and DMFT score (P = 0.008 and 0.022 respectively). Regarding β-diversity, as estimated by the UniFrac distance matrix for pairwise differences among samples, at least one of the first three principal components of the UniFrac distance matrix was correlated with the number of missing teeth, tooth decay, DMFT, BoP, or periodontitis. Of the examined genera, Parvimonas was positively associated with BoP and periodontitis. Veillonellaceae [G-1] was associated with a high DMFT score, and Filifactor and Peptostreptococcus were associated with a low DMFT score.
Our results suggest distinct relationships between UDT microbiota and dental and periodontal health. Poor dental health was associated with less microbial diversity, whereas poor periodontal health was associated with more diversity and the presence of potentially pathogenic species.
Microbiota; Oral health; Dental caries; Periodontitis; Bleeding on probe; Attachment loss
The genetic regulation of the human epigenome is not fully appreciated. Here we describe the effects of genetic variants on the DNA methylome in human lung based on methylation-quantitative trait loci (meQTL) analyses. We report 34,304 cis- and 585 trans-meQTLs, a genetic-epigenetic interaction of surprising magnitude, including a regulatory hotspot. These findings are replicated in both breast and kidney tissues and show distinct patterns: cis-meQTLs mostly localize to CpG sites outside of genes, promoters, and CpG islands (CGIs), while trans-meQTLs are over-represented in promoter CGIs. meQTL SNPs are enriched in CTCF binding sites, DNaseI hypersensitivity regions and histone marks. Importantly, 4 of the 5 established lung cancer risk loci in European ancestry are cis-meQTLs and, in aggregate, cis-meQTLs are enriched for lung cancer risk in a genome-wide analysis of 11,587 subjects. Thus, inherited genetic variation may affect lung carcinogenesis by regulating the human methylome.
Approaches exploiting extremes of the trait distribution may reveal novel loci for common traits, but it is unknown whether such loci are generalizable to the general population. In a genome-wide search for loci associated with upper vs. lower 5th percentiles of body mass index, height and waist-hip ratio, as well as clinical classes of obesity including up to 263,407 European individuals, we identified four new loci (IGFBP4, H6PD, RSRC1, PPP2R2A) influencing height detected in the tails and seven new loci (HNF4G, RPTOR, GNAT2, MRPS33P4, ADCY9, HS6ST3, ZZZ3) for clinical classes of obesity. Further, we show that there is large overlap in terms of genetic structure and distribution of variants between traits based on extremes and the general population and little etiologic heterogeneity between obesity subgroups.
Copy number variations (CNVs) constitute a major source of genetic variations in human populations and have been reported to be associated with complex diseases. Methods have been developed for detecting CNVs and testing CNV associations in genome-wide association studies (GWAS) based on SNP arrays. Commonly used two-step testing procedures work well only for long CNVs while direct CNV association testing methods work only for recurrent CNVs. Assuming that short CNVs disrupting any part of a given genomic region increase disease risk, we developed a variable threshold exact test (VTET) for testing disease associations of CNVs randomly distributed in the genome using intensity data from SNP arrays. By extensive simulations, we found that VTET outperformed two-step testing procedures based on existing CNV calling algorithms for short CNVs and that the performance of VTET was robust to the length of the genomic region. In addition, VTET had a comparable performance with CNVtools for testing the association of recurrent CNVs. Thus, we expect VTET to be useful for testing disease associations of both recurrent and randomly distributed CNVs using existing GWAS data. We applied VTET to a lung cancer GWAS and identified a genome-wide significant region on chromosome 18q22.3 for lung squamous cell carcinoma.
copy number variation; variable threshold exact test; genome-wide association study; interval-based association test; lung cancer CNV analysis
Multiple sources of evidence suggest that genetic factors influence variation in clinical features of schizophrenia. The authors present the first genome-wide association study (GWAS) of dimensional symptom scores among individuals with schizophrenia.
Based on the Lifetime Dimensions of Psychosis Scale ratings of 2,454 case subjects of European ancestry from the Molecular Genetics of Schizophrenia (MGS) sample, three symptom factors (positive, negative/disorganized, and mood) were identified with exploratory factor analysis. Quantitative scores for each factor from a confirmatory factor analysis were analyzed for association with 696,491 single-nucleotide polymorphisms (SNPs) using linear regression, with correction for age, sex, clinical site, and ancestry. Polygenic score analysis was carried out to determine whether case and comparison subjects in 16 Psychiatric GWAS Consortium (PGC) schizophrenia samples (excluding MGS samples) differed in scores computed by weighting their genotypes by MGS association test results for each symptom factor.
No genome-wide significant associations were observed between SNPs and factor scores. Most of the SNPs producing the strongest evidence for association were in or near genes involved in neurodevelopment, neuroprotection, or neurotransmission, including genes playing a role in Mendelian CNS diseases, but no statistically significant effect was observed for any defined gene pathway. Finally, polygenic scores based on MGS GWAS results for the negative/disorganized factor were significantly different between case and comparison subjects in the PGC data set; for MGS subjects, negative/disorganized factor scores were correlated with polygenic scores generated using case-control GWAS results from the other PGC samples.
The polygenic signal that has been observed in cross-sample analyses of schizophrenia GWAS data sets could be in part related to genetic effects on negative and disorganized symptoms (i.e., core features of chronic schizophrenia).
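The polygenic score analysis described above weights each subject's genotypes by association results from a discovery sample. A minimal sketch of that weighted sum follows. The function name `polygenic_score`, the three SNPs, their weights, and the subject dosages are hypothetical, and real analyses typically add steps (SNP selection by p-value threshold, linkage-disequilibrium pruning) that are not shown here.

```python
def polygenic_score(dosages, weights):
    """Weighted sum of allele dosages (0, 1 or 2 copies per SNP) for one subject."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


# Hypothetical per-SNP weights, e.g. regression coefficients from a discovery GWAS.
snp_weights = [0.10, -0.05, 0.20]
# Hypothetical allele dosages for two subjects in an independent target sample.
subject_a = [2, 0, 1]
subject_b = [0, 2, 0]

print(polygenic_score(subject_a, snp_weights))
print(polygenic_score(subject_b, snp_weights))
```

Scores computed this way for case and comparison subjects in the target sample can then be compared, which is the kind of test the abstract reports for the negative/disorganized factor.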
Pulmonary inflammation may contribute to lung cancer etiology. We conducted a broad evaluation of the association of single nucleotide polymorphisms (SNPs) in innate immunity and inflammation pathways with lung cancer risk, and conducted comparisons with a lung cancer genome wide association study (GWAS).
We included 378 lung cancer cases and 450 controls from the Prostate, Lung, Colorectal and Ovarian (PLCO) Cancer Screening Trial. An Illumina GoldenGate oligonucleotide pool assay was used to genotype 1,429 SNPs. Odds ratios (ORs) and 95% confidence intervals (CIs) were estimated for each SNP, and p-values for trend were calculated. For statistically significant SNPs (p-trend<0.05), we replicated our results with genotyped or imputed SNPs in the GWAS, and adjusted p-values for multiple testing.
In our PLCO analysis, we observed a significant association between 81 SNPs located in 44 genes and lung cancer (p-trend<0.05). Of these 81 SNPs, there was evidence for confirmation in the GWAS for 10 SNPs. However, after adjusting for multiple comparisons, the only SNP that remained significantly associated with lung cancer in the replication phase was rs4648127 (NFKB1; multiple testing adjusted p-trend=0.02). The CT/TT genotype of NFKB1 was associated with reduced odds of lung cancer in the PLCO study (OR=0.56; 95% CI 0.37–0.86) and the GWAS (OR=0.79; 95% CI 0.69–0.90).
We found a significant association between a variant in the NFKB1 gene and lung cancer risk. Our findings add to evidence implicating inflammation and immunity in lung cancer etiology.
lung cancer; genetics; inflammation; immunity; epidemiology
Large genomic copy number variations (CNVs) have been implicated as strong risk factors for schizophrenia. However, the rarity of these events has created challenges for the identification of further pathogenic loci, and extremely large samples are required to provide convincing replication.
To detect novel CNVs increasing susceptibility to schizophrenia, utilizing two ethnically homogeneous discovery cohorts and replication in large samples.
Genetic association study of microarray data.
DNA samples were collected at nine sites from different countries.
Two discovery cohorts comprised: a) 790 cases (schizophrenia and schizoaffective disorder) and 1347 controls of Ashkenazi Jewish descent; and b) 662 trios (offspring affected with schizophrenia or schizoaffective disorder) from Bulgaria. Replication datasets consisted of 12,398 cases and 17,945 controls.
Main outcome measure
Statistically increased rate of specific CNVs in cases versus controls.
One novel locus was implicated: a deletion at distal 16p11.2, which does not overlap the proximal 16p11.2 locus previously reported in schizophrenia and autism. Deletions at this locus were found in 13 out of 13,850 cases (0.094%) and in 3 out of 19,954 controls (0.015%), Fisher Exact p = 0.0014; OR = 6.25 (95%CI = 1.78 – 21.93).
Deletions at distal 16p11.2 have been previously implicated in developmental delay and obesity. The region contains nine genes, several of which are implicated in neurological diseases, regulation of body weight, and glucose homeostasis. A telomeric extension of the deletion, observed in about half the cases but no controls, potentially implicates an additional eight genes. Our findings add a new locus to the list of CNVs that increase risk to develop schizophrenia.
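The odds ratio reported above for distal 16p11.2 deletions can be recomputed from the stated counts (13 of 13,850 cases, 3 of 19,954 controls). The sketch below uses a Woolf (log-scale) confidence interval. The abstract does not say how the interval was obtained, so that method and the function name `odds_ratio_with_ci` are assumptions. The Fisher exact p-value is not recomputed here.

```python
import math


def odds_ratio_with_ci(case_carriers, case_total, control_carriers, control_total, z=1.96):
    """Odds ratio from a 2x2 table with a Woolf (log-scale) confidence interval."""
    a = case_carriers                      # cases with the deletion
    b = case_total - case_carriers         # cases without the deletion
    c = control_carriers                   # controls with the deletion
    d = control_total - control_carriers   # controls without the deletion
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


or_estimate, ci_lower, ci_upper = odds_ratio_with_ci(13, 13850, 3, 19954)
print(f"OR = {or_estimate:.2f}, 95% CI = {ci_lower:.2f} to {ci_upper:.2f}")
```

Under these assumptions the result agrees with the reported OR of 6.25 and interval of 1.78 to 21.93 to two decimal places. The width of the interval reflects how few carriers were observed.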
We conducted a pilot study of reproducibility and associations of microbial diversity and composition in fecal microbial DNA.
Methods and results
Participants (25 men, 26 women, ages 17–65 years) provided questionnaire data and multiple samples of one stool collected with two Polymedco and two Sarstedt devices pre-loaded with RNAlater. 16S rRNA genes in each fecal DNA aliquot were amplified, sequenced (Roche/454 Life Sciences), and assigned to taxa. Devices were compared for ease of use and reproducibility [intraclass correlation coefficient (ICC)] between duplicate aliquots on diversity and taxonomic assignment. Associations were tested by linear regression. Both collection devices were easy to use. Both alpha diversity (Shannon index) and beta diversity (UniFrac) were higher between than within duplicates (P≤10⁻⁸) and did not differ significantly by device (P≥0.62). Reproducibility was good (ICC ≥0.77) for alpha diversity and taxonomic assignment to the most abundant phyla, Firmicutes and Bacteroidetes (71.5% and 25.0% of sequences, respectively), but reproducibility was low (ICC≤0.48) for less abundant taxa. Alpha diversity was lower with non-antibiotic prescription medication (P=0.02), younger age (P=0.03) and marginally with higher body mass index (P=0.08).
With sampling from various parts of a stool, both devices provided good reproducibility on overall microbial diversity and classification for the major phyla, but not for minor phyla. Implementation of these methods should provide insights on how broad microbial parameters, but not necessarily rare microbes, affect risk for various conditions.
Microbiome; alpha diversity; beta diversity; bacterial phylogenetics; medications; body mass index
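The alpha diversity measure named above, the Shannon index, can be computed from per-taxon sequence counts in a single sample. The following is a minimal sketch using the natural logarithm. The function name `shannon_index` and the example counts are hypothetical, and the study's own pipeline may have applied rarefaction or a different log base.

```python
import math


def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln p_i) over taxa with non-zero counts."""
    total = sum(counts)
    if total == 0:
        raise ValueError("sample has no sequences")
    # Relative abundance of each observed taxon; absent taxa contribute nothing.
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in proportions)


# Hypothetical sequence counts per taxon for two samples.
even_sample = [25, 25, 25, 25]      # four equally abundant taxa
dominated_sample = [97, 1, 1, 1]    # one taxon dominates

print(shannon_index(even_sample))
print(shannon_index(dominated_sample))
```

The evenly distributed sample gives the maximum value for four taxa, ln 4, while the dominated sample scores much lower despite containing the same number of taxa.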
The intestinal microbial community has major effects on human health, but optimal research methods are unsettled. To facilitate epidemiologic and clinical research, we sought to optimize conditions and to assess reproducibility of selected core functions of the distal gut microbiota, β-glucuronidase and β-glucosidase bioactivities.
Methods and results
A colorimetric kinetic method was optimized and used to quantify activities of β-glucuronidase and β-glucosidase in human feces. Enzyme detection was optimal with neutral pH, snap freezing in liquid nitrogen, and rapid thawing to 37°C before protein extraction. Enzymatic stability was assessed by delayed freezing for 2–48 hours to mimic field settings. Activities decayed approximately 20% within 2 hours and 40% within 4 hours at room temperature. To formally assess reproducibility, 51 volunteers (25 male; mean age 39) used two devices to self-collect and rapidly chill four replicates of a stool. Devices were compared for mean enzymatic activities and intraclass correlation coefficients (ICC) in paired replicates of the self-collected specimens. Reproducibility was excellent with both devices for β-glucuronidase (ICC 0.92). The larger collection device had significantly higher reproducibility for β-glucosidase (ICC 0.92 vs. 0.76, P<0.0001) and higher mean activities for both enzymes (P<0.0001).
Optimal measurement of these core activities of the microbiota required a sufficient quantity of rapidly chilled or frozen specimens collected in PBS at pH 7.0. Application of these methods to clinical and epidemiologic research could provide insights on how the intestinal microbiota affects human health.
β-glucuronidase activity; β-glucosidase activity; feces; reproducibility
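Reproducibility between paired replicates is summarized above by the intraclass correlation coefficient. The sketch below computes a one-way random-effects ICC from replicate measurements per subject. The abstract does not specify which ICC form was used, so this form, the function name `icc_oneway`, and the example values are assumptions.

```python
def icc_oneway(groups):
    """One-way random-effects ICC for n subjects, each with k replicate values."""
    n = len(groups)
    k = len(groups[0])
    if n < 2 or k < 2:
        raise ValueError("need at least two subjects and two replicates")
    if any(len(g) != k for g in groups):
        raise ValueError("every subject needs the same number of replicates")
    grand_mean = sum(sum(g) for g in groups) / (n * k)
    subject_means = [sum(g) / k for g in groups]
    # Between-subject mean square (n - 1 degrees of freedom).
    ms_between = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    # Within-subject mean square (n * (k - 1) degrees of freedom).
    ms_within = sum(
        (x - m) ** 2 for g, m in zip(groups, subject_means) for x in g
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical enzyme activities: three subjects, two replicates each.
replicates = [[10.0, 11.0], [20.0, 19.0], [30.0, 31.0]]
print(icc_oneway(replicates))
```

Values near 1 indicate that most of the variance lies between subjects rather than between replicates of the same specimen, which is the sense in which the abstract describes reproducibility as excellent.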
Recent studies have shown an association between cigarettes per day (CPD) and a nonsynonymous single-nucleotide polymorphism in CHRNA5, rs16969968.
To determine whether the association between rs16969968 and smoking is modified by age at onset of regular smoking.
Available genetic studies containing measures of CPD and the genotype of rs16969968 or its proxy.
Uniform statistical analysis scripts were run locally. Starting with 94 050 ever-smokers from 43 studies, we extracted the heavy smokers (CPD >20) and light smokers (CPD ≤10) with age-at-onset information, reducing the sample size to 33 348. Each study was stratified into early-onset smokers (age at onset ≤16 years) and late-onset smokers (age at onset >16 years), and a logistic regression of heavy vs light smoking with the rs16969968 genotype was computed for each stratum. Meta-analysis was performed within each age-at-onset stratum.
Individuals with 1 risk allele at rs16969968 who were early-onset smokers were significantly more likely to be heavy smokers in adulthood (odds ratio [OR]=1.45; 95% CI, 1.36–1.55; n=13 843) than were carriers of the risk allele who were late-onset smokers (OR = 1.27; 95% CI, 1.21–1.33, n = 19 505) (P = .01).
These results highlight an increased genetic vulnerability to smoking in early-onset smokers.
Given the anthropometric differences between men and women and previous evidence of sex-difference in genetic effects, we conducted a genome-wide search for sexually dimorphic associations with height, weight, body mass index, waist circumference, hip circumference, and waist-to-hip-ratio (133,723 individuals) and took forward 348 SNPs into follow-up (additional 137,052 individuals) in a total of 94 studies. Seven loci displayed significant sex-difference (FDR<5%), including four previously established (near GRB14/COBLL1, LYPLAL1/SLC30A10, VEGFA, ADAMTS9) and three novel anthropometric trait loci (near MAP3K1, HSD17B4, PPARG), all of which were genome-wide significant in women (P<5×10⁻⁸), but not in men. Sex-differences were apparent only for waist phenotypes, not for height, weight, BMI, or hip circumference. Moreover, we found no evidence for genetic effects with opposite directions in men versus women. The PPARG locus is of specific interest due to its role in diabetes genetics and therapy. Our results demonstrate the value of sex-specific GWAS to unravel the sexually dimorphic genetic underpinning of complex traits.
Men and women differ substantially regarding height, weight, and body fat. Interestingly, previous work detecting genetic effects for waist-to-hip ratio, to assess body fat distribution, has found that many of these showed sex-differences. However, systematic searches for sex-differences in genetic effects have not yet been conducted. Therefore, we undertook a genome-wide search for sexually dimorphic genetic effects for anthropometric traits including 133,723 individuals in a large meta-analysis and followed up promising variants in a further 137,052 individuals, including a total of 94 studies. We identified seven loci with significant sex-difference including four previously established (near GRB14/COBLL1, LYPLAL1/SLC30A10, VEGFA, ADAMTS9) and three novel anthropometric trait loci (near MAP3K1, HSD17B4, PPARG), all of which were significant in women, but not in men. Of interest is that sex-difference was only observed for waist phenotypes, but not for height or body-mass-index. We found no evidence for sex-differences with opposite effect direction for men and women. The PPARG locus is of specific interest due to its link to diabetes genetics and therapy. Our findings demonstrate the importance of investigating sex differences, which may lead to a better understanding of disease mechanisms with a potential relevance to treatment options.
Recent meta-analyses of European ancestry subjects show strong evidence for association between smoking quantity and multiple genetic variants on chromosome 15q25. This meta-analysis extends the examination of association between distinct genes in the CHRNA5-CHRNA3-CHRNB4 region and smoking quantity to Asian and African American populations to confirm and refine specific reported associations.
Association results for a dichotomized cigarettes smoked per day (CPD) phenotype in 27 datasets (European ancestry (N=14,786), Asian (N=6,889), and African American (N=10,912) for a total of 32,587 smokers) were meta-analyzed by population and results were compared across all three populations.
We demonstrate association between smoking quantity and markers in the chromosome 15q25 region across all three populations, and narrow the region of association. Of the variants tested, only rs16969968 is associated with smoking (p < 0.01) in each of these three populations (OR=1.33, 95% CI=1.25–1.42, p=1.1×10⁻¹⁷ in meta-analysis across all population samples). Additional variants displayed a consistent signal in both European ancestry and Asian datasets, but not in African Americans.
The observed consistent association of rs16969968 with heavy smoking across multiple populations, combined with its known biological significance, suggests rs16969968 is most likely a functional variant that alters risk for heavy smoking. We interpret additional association results that differ across populations as providing evidence for additional functional variants, but we are unable to further localize the source of this association. Using the cross-population study paradigm provides valuable insights to narrow regions of interest and inform future biological experiments.
smoking; genetics; meta-analysis; cross-population
p53 is critical in regulating the differentiation of ES and induced pluripotent stem (iPS) cells. Here, we report a whole-genome study of p53-mediated DNA damage signaling in mouse ES cells. Systems analyses reveal that binding of p53 at promoter region significantly correlates with gene activation but not with repression. Unexpectedly, we identify a regulatory mode for p53-mediated repression through interfering with distal enhancer activity. Importantly, many ES cell-enriched core transcription factors are p53-repressed genes. Further analyses demonstrate that p53-repressed genes are functionally associated with ES/iPS cell status while p53-activated genes are linked to differentiation. p53-activated genes and -repressed genes also display distinguishable features of expression levels and epigenetic markers. Upon DNA damage, p53 regulates the self-renewal and pluripotency of ES cells. Together, these results support a model that, in response to DNA damage, p53 affects the status of ES cells through activating differentiation-associated genes and repressing ES cell-enriched genes.
embryonic stem cells; p53; genomics; epigenetics; transcription
The extent to which RNA stability differs between individuals and its contribution to the interindividual expression variation remain unknown. We conducted a genome-wide analysis of RNA stability in seven human HapMap lymphoblastoid cell lines (LCLs) and analyzed the effect of DNA sequence variation on RNA half-life differences. Twenty-six percent of the expressed genes exhibited RNA half-life differences between LCLs at a false discovery rate (FDR) < 0.05, which accounted for ~ 37% of the gene expression differences between individuals. Nonsense polymorphisms were associated with reduced RNA half-lives. In genes presenting interindividual RNA half-life differences, higher coding GC3 contents (G and C percentages at the third-codon positions) were correlated with increased RNA half-life. Consistently, G and C alleles of single nucleotide polymorphisms (SNPs) in protein coding sequences were associated with enhanced RNA stability. These results suggest widespread interindividual differences in RNA stability related to DNA sequence and composition variation.
High systemic estrogen levels contribute to breast cancer risk for postmenopausal women, whereas low levels contribute to osteoporosis risk. Except for obesity, determinants of non-ovarian systemic estrogen levels are undefined. We sought to identify members and functions of the intestinal microbial community associated with estrogen levels via enterohepatic recirculation.
Fifty-one epidemiologists at the National Institutes of Health, including 25 men, 7 postmenopausal women, and 19 premenopausal women, provided urine and aliquots of feces, using methods proven to yield accurate and reproducible results. Estradiol, estrone, 13 estrogen metabolites (EM), and their sum (total estrogens) were quantified in urine and feces by liquid chromatography/tandem mass spectrometry. In feces, β-glucuronidase and β-glucosidase activities were determined by real-time kinetics, and microbiome diversity and taxonomy were estimated by pyrosequencing 16S rRNA amplicons. Pearson correlations were computed for each log_e estrogen level, log_e enzymatic activity level, and microbiome alpha diversity estimate. For the 55 taxa with mean relative abundance of at least 0.1%, ordinal levels were created [zero, low (below median of detected sequences), high] and compared to log_e estrogens, β-glucuronidase and β-glucosidase enzymatic activity levels by linear regression. Significance was based on two-sided tests with α=0.05.
In men and postmenopausal women, levels of total urinary estrogens (as well as most individual EM) were very strongly and directly associated with all measures of fecal microbiome richness and alpha diversity (R≥0.50, P≤0.003). These non-ovarian systemic estrogens also were strongly and significantly associated with fecal Clostridia taxa, including non-Clostridiales and three genera in the Ruminococcaceae family (R=0.57−0.70, P=0.03−0.002). Estrone, but not other EM, in urine correlated significantly with functional activity of fecal β-glucuronidase (R=0.36, P=0.04). In contrast, fecal β-glucuronidase correlated inversely with fecal total estrogens, both conjugated and deconjugated (R≤-0.47, P≤0.01). Premenopausal female estrogen levels, which were collected across menstrual cycles and thus highly variable, were completely unrelated to fecal microbiome and enzyme parameters (P≥0.6).
Intestinal microbial richness and functions, including but not limited to β-glucuronidase, influence levels of non-ovarian estrogens via enterohepatic circulation. Thus, the gut microbial community likely affects the risk for estrogen-related conditions in older adults. Understanding how Clostridia taxa relate to systemic estrogens may identify targets for interventions.
Microbiome; Feces; Enterohepatic circulation; β-glucuronidase; β-glucosidase; Postmenopausal estrogens; Fecal estrogens; Estrogen metabolites
Tobacco-induced lung cancer is characterized by a deregulated inflammatory microenvironment. Variants in multiple genes in inflammation pathways may contribute to risk of lung cancer.
We therefore conducted a three-stage comprehensive pathway analysis (discovery, replication and meta-analysis) of inflammation gene variants in ever-smoking lung cancer cases and controls. A discovery set (1096 cases; 727 controls) and an independent, non-overlapping internal replication set (1154 cases; 1137 controls) were derived from an ongoing case-control study. For discovery, we used an iSelect BeadChip to interrogate a comprehensive panel of 11737 inflammation pathway SNPs and selected nominally significant (p<0.05) SNPs for internal replication.
In former smokers, 6 SNPs achieved statistical significance (p<0.05) in the internal replication dataset with concordant risk estimates; in current smokers, 5 SNPs were concordant and replicated. Replicated hits were further tested in a subsequent meta-analysis using external data derived from two published GWAS and a case-control study. Two of these variants (a BCL2L14 SNP in former smokers and a SNP in IL2RB in current smokers) were further validated. In risk score analyses, there was a 26% increase in risk with each additional adverse allele when we combined the genotyped SNP and the most significant imputed SNP in IL2RB in current smokers, and a similar 36% increase in risk for former smokers associated with genotyped and imputed BCL2L14 SNPs.
Before they can be applied for risk prediction efforts, these SNPs should be subject to further external replication and more extensive fine mapping studies.
Inflammation SNPs; lung cancer; smokers
Meta-analysis of genome-wide association studies involves testing single nucleotide polymorphisms (SNPs) using summary statistics that are weighted sums of site-specific score or Wald statistics. This approach avoids having to pool individual-level data. We describe the weights that maximize the power of the summary statistics. For small effect-sizes, any choice of weights yields summary Wald and score statistics with the same power, and the optimal weights are proportional to the square roots of the sites' Fisher information for the SNP's regression coefficient. When SNP effect size is constant across sites, the optimal summary Wald statistic is the well-known inverse-variance-weighted combination of estimated regression coefficients, divided by its standard deviation. We give simple approximations to the optimal weights for various phenotypes, and show that weights proportional to the square roots of study sizes are suboptimal for data from case-control studies with varying case-control ratios, for quantitative trait data when the trait variance differs across sites, for count data when the site-specific mean counts differ, and for survival data with different proportions of failing subjects. Simulations suggest that weights that accommodate inter-site variation in imputation error give little power gain compared to those obtained ignoring imputation uncertainties. We note advantages to combining site-specific score statistics, and we show how they can be used to assess effect-size heterogeneity across sites. The utility of the summary score statistic is illustrated by application to a meta-analysis of schizophrenia data in which only site-specific p-values and directions of association are available.
combining GWAS; effect-size heterogeneity; meta-analysis; noncentrality parameter; score statistics; Wald statistics
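The weighting schemes discussed in the meta-analysis abstract above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names are hypothetical, and the case-control weight assumes that, near the null and with similar genotype variance across sites, a site's Fisher information is roughly proportional to (cases × controls) / (cases + controls). The first function is the standard inverse-variance-weighted Wald combination; the second combines site-specific p-values and directions of association into a weighted Z statistic.

```python
import math
from statistics import NormalDist


def inverse_variance_meta(betas, std_errors):
    """Fixed-effect summary Wald statistic: inverse-variance-weighted
    combination of site-specific coefficients, divided by its standard error."""
    weights = [1.0 / se ** 2 for se in std_errors]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))  # two-sided p-value
    return beta, se, z, p


def case_control_weight(n_cases, n_controls):
    """Assumed approximation: square root of a quantity proportional to the
    site's Fisher information under the null for a case-control design."""
    return math.sqrt(n_cases * n_controls / (n_cases + n_controls))


def weighted_z_from_pvalues(pvalues, directions, weights):
    """Combine two-sided p-values and effect directions (+1 or -1) into a
    single weighted Z statistic."""
    nd = NormalDist()
    z_scores = [d * nd.inv_cdf(1.0 - p / 2.0) for p, d in zip(pvalues, directions)]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = math.sqrt(sum(w * w for w in weights))
    return numerator / denominator


if __name__ == "__main__":
    # Toy inputs, invented for illustration only.
    print(inverse_variance_meta([0.10, 0.20], [0.05, 0.10]))
    site_weights = [case_control_weight(1000, 1000), case_control_weight(200, 1800)]
    print(weighted_z_from_pvalues([0.01, 0.20], [1, 1], site_weights))
```

In the toy example the two sites have the same total size (2,000 subjects), yet the balanced site receives the larger weight, which reflects the abstract's point that weights based only on study size are suboptimal when case-control ratios vary.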
While lung cancer is largely caused by tobacco smoking, inherited genetic factors play a role in its etiology. Genome-wide association studies (GWAS) in Europeans have robustly demonstrated only three polymorphic variations influencing lung cancer risk. Tumor heterogeneity may have hampered the detection of association signals when all lung cancer subtypes were analyzed together. In a GWAS of 5,355 European smoking lung cancer cases and 4,344 smoking controls, we conducted a pathway-based analysis in lung cancer histologic subtypes with 19,082 SNPs mapping to 917 genes in the HuGE-defined “inflammation” pathway. We identified a susceptibility locus for squamous cell lung carcinoma (SQ) at 12p13.33 (RAD52, rs6489769), and replicated the association in three independent samples totaling 3,359 SQ cases and 9,100 controls (odds ratio=1.20, Pcombined=2.3×10⁻⁸).
The combination of pathway-based approaches and information on disease specific subtypes can improve the identification of cancer susceptibility loci in heterogeneous diseases.
Lung cancer; histology; squamous cell carcinoma; pathway analysis; RAD52
Genome-wide association studies (GWAS) of body mass index (BMI) using large samples have yielded approximately a dozen robustly associated variants and implicated additional loci. Individually these variants have small effects, and in aggregate they explain a small proportion of the variance. As a result, replication attempts have limited power to achieve genome-wide significance, even with several thousand subjects. Since there is strong prior evidence for genetic influence on BMI for specific variants, alternative approaches to replication can be applied. Instead of testing individual loci sequentially, a genetic risk sum score (GRSS) summarizing the total number of risk alleles can be tested. In the current study, a GRSS comprising 56 top variants catalogued from two large meta-analyses was tested for association with BMI in the Molecular Genetics of Schizophrenia controls (2,653 European-Americans, 973 African-Americans). After accounting for covariates known to influence BMI (ancestry, sex, age), the GRSS was highly associated with BMI (p value = 3.19E−06), although it explained only a limited amount of the variance (0.66%). However, area under the receiver operating characteristic curve (AUC) estimates indicated that the GRSS and covariates significantly predicted overweight and obesity classification, with maximum discriminative ability for predicting class III obesity (AUC = 0.697). The relative contributions of the individual loci to the GRSS were examined post hoc; the results were not due to a few highly significant variants, but rather were the result of numerous variants of small effect. This study provides evidence of the utility of a GRSS as an alternative approach to replication of common polygenic variation in complex traits.
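The idea of a genetic risk sum score in the BMI abstract above can be sketched in Python as a simple count of risk alleles per individual. This is an illustration, not the study's pipeline. The function names and toy data are hypothetical, the score is unweighted, and the AUC is computed from the score alone with the pairwise (Mann-Whitney) definition, whereas the study's AUC estimates also included covariates.

```python
def genetic_risk_sum_score(risk_allele_counts):
    """Sum the number of risk alleles (0, 1 or 2 per variant) for one person."""
    for count in risk_allele_counts:
        if count not in (0, 1, 2):
            raise ValueError(f"risk allele count must be 0, 1 or 2, got {count!r}")
    return sum(risk_allele_counts)


def auc(case_scores, control_scores):
    """Area under the ROC curve as the probability that a randomly chosen
    case scores higher than a randomly chosen control (ties count one half)."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


if __name__ == "__main__":
    # Toy genotypes at four variants, invented for illustration only.
    obese = [[2, 1, 1, 2], [1, 1, 2, 1]]
    non_obese = [[0, 1, 1, 0], [1, 0, 1, 1]]
    obese_scores = [genetic_risk_sum_score(g) for g in obese]
    non_obese_scores = [genetic_risk_sum_score(g) for g in non_obese]
    print(obese_scores, non_obese_scores, auc(obese_scores, non_obese_scores))
```

An unweighted sum treats every risk allele as contributing equally, which suits a score built from many variants of small effect. A weighted score would instead multiply each count by an effect estimate.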
Few microbial functions have been compared to a comprehensive survey of the human fecal microbiome. We evaluated determinants of fecal microbial β-glucuronidase and β-glucosidase activities, focusing especially on associations with microbial alpha and beta diversity and taxonomy. We enrolled 51 healthy volunteers (26 female, mean age 39) who provided questionnaire data and multiple aliquots of a stool, from which proteins were extracted to quantify β-glucuronidase and β-glucosidase activities, and DNA was extracted to amplify and pyrosequence 16S rRNA gene sequences to classify and quantify microbiome diversity and taxonomy. Fecal β-glucuronidase was elevated with weight loss of at least 5 lb. (P = 0.03), whereas β-glucosidase was marginally reduced in the four vegetarians (P = 0.06). Both enzymes were correlated directly with microbiome richness and alpha diversity measures, directly with the abundance of four Firmicutes Clostridia genera, and inversely with the abundance of two other genera (Firmicutes Lactobacillales Streptococcus and Bacteroidetes Rikenellaceae Alistipes) (all P = 0.05–0.0001). Beta diversity reflected the taxonomic associations. These observations suggest that these enzymatic functions are performed by particular taxa and that diversity indices may serve as surrogates of bacterial functions. Independent validation and deeper understanding of these associations are needed, particularly to characterize functions and pathways that may be amenable to manipulation.