Schizophrenia is a common complex disorder with polygenic inheritance. Here we show that by using an approach that compares the individual loads of rare variants in 1,042 schizophrenia cases and 961 controls, schizophrenia cases carry an increased burden of deleterious mutations. At a genome-wide level, our results implicate non-synonymous, splice site as well as stop-altering single-nucleotide variations occurring at minor allele frequency of ≥0.01% in the population. In an independent replication sample of 5,585 schizophrenia cases and 8,103 controls of European ancestry we confirm an enrichment in cases of the alleles identified in our study. In addition, the genes implicated by the increased burden of rare coding variants highlight the involvement of neurodevelopment in the aetiology of schizophrenia.
Schizophrenia is a complex disorder with high heritability but poorly understood genetics. Here Olde Loohuis et al. compare schizophrenia patients to unaffected individuals and identify an increased individual burden of rare deleterious mutations in patients.
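The comparison of individual rare-variant loads between cases and controls can be illustrated with a one-sided permutation test. This is a minimal sketch only: the function name, the toy counts, and the choice of a mean-difference statistic are assumptions made for illustration and are not taken from the study, which may have used a different test.

```python
import random
from statistics import mean


def burden_permutation_test(case_loads, control_loads, n_perm=10_000, seed=0):
    """One-sided permutation test: is the mean rare-variant load higher in cases?

    Hypothetical helper for illustration; not the method used in the study.
    """
    rng = random.Random(seed)
    observed = mean(case_loads) - mean(control_loads)
    pooled = list(case_loads) + list(control_loads)
    n_cases = len(case_loads)
    hits = 0
    for _ in range(n_perm):
        # Shuffle case/control labels and recompute the difference in means.
        rng.shuffle(pooled)
        diff_perm = mean(pooled[:n_cases]) - mean(pooled[n_cases:])
        if diff_perm >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value above zero.
    return observed, (hits + 1) / (n_perm + 1)


# Toy per-individual counts of rare deleterious alleles (made-up numbers).
cases = [5, 7, 6, 8, 7, 9, 6, 7]
controls = [4, 5, 6, 5, 4, 6, 5, 5]
diff, p_value = burden_permutation_test(cases, controls, n_perm=2000)
print(f"mean load difference = {diff:.3f}, permutation p = {p_value:.4f}")
```

A permutation test is used here because it makes no distributional assumption about the per-individual loads; it does not adjust for covariates such as ancestry or sequencing depth.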
We report here the first genome-wide high-resolution polymorphism resource for non-human primate (NHP) association and linkage studies, constructed for the Caribbean-origin vervet monkey, or African green monkey (Chlorocebus aethiops sabaeus), one of the most widely used NHPs in biomedical research. We generated this resource by whole genome sequencing (WGS) of monkeys from the Vervet Research Colony (VRC), an NIH-supported research resource for which extensive phenotypic data are available.
We identified genome-wide single nucleotide polymorphisms (SNPs) by WGS of 721 members of an extended pedigree from the VRC. From high-depth WGS data we identified more than 4 million polymorphic unequivocal segregating sites; by pruning these SNPs based on heterozygosity, quality control filters, and the degree of linkage disequilibrium (LD) between SNPs, we constructed genome-wide panels suitable for genetic association (about 500,000 SNPs) and linkage analysis (about 150,000 SNPs). To further enhance the utility of these resources for linkage analysis, we used a further pruned subset of the linkage panel to generate multipoint identity by descent matrices.
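The LD-based pruning step can be sketched as a greedy filter on pairwise r². This is an illustration under stated assumptions: the SNP names, toy genotypes, and the 0.5 threshold are hypothetical, no physical-distance window is applied, and the actual pipeline and thresholds used for the VRC panels are not described here.

```python
from statistics import mean


def r_squared(g1, g2):
    """Squared Pearson correlation between two genotype vectors (0/1/2 allele counts)."""
    m1, m2 = mean(g1), mean(g2)
    cov = sum((a - m1) * (b - m2) for a, b in zip(g1, g2))
    var1 = sum((a - m1) ** 2 for a in g1)
    var2 = sum((b - m2) ** 2 for b in g2)
    if var1 == 0 or var2 == 0:
        return 0.0  # a monomorphic SNP carries no LD information
    return cov * cov / (var1 * var2)


def ld_prune(genotypes, max_r2=0.5):
    """Greedy pruning: keep a SNP only if its r^2 with every kept SNP is <= max_r2."""
    kept = []
    for name, g in genotypes.items():
        if all(r_squared(g, genotypes[k]) <= max_r2 for k in kept):
            kept.append(name)
    return kept


# Toy genotypes for six individuals (made-up data).
genotypes = {
    "snp1": [0, 1, 2, 1, 0, 2],
    "snp2": [0, 1, 2, 1, 0, 2],  # identical to snp1, so it is pruned
    "snp3": [1, 0, 1, 2, 1, 1],
    "snp4": [1, 1, 0, 2, 1, 1],
}
panel = ld_prune(genotypes, max_r2=0.5)
print(panel)
```

A stricter threshold yields a sparser panel, which is the general idea behind deriving a smaller linkage panel from a denser association panel.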
The genetic and phenotypic resources now available for the VRC and other Caribbean-origin vervets enable their use for genetic investigation of traits relevant to human diseases.
Electronic supplementary material
The online version of this article (doi:10.1186/s12915-015-0152-2) contains supplementary material, which is available to authorized users.
Vervet; Non-human primate; Whole genome sequencing; SNP; Linkage; Association
Genetic factors contribute to risk for bipolar disorder (BP), yet its
pathogenesis remains poorly understood. A focus on measuring multi-system
quantitative traits that may be components of BP psychopathology may enable
genetic dissection of this complex disorder, and investigation of extended
pedigrees from genetically isolated populations may facilitate the detection
of specific genetic variants that impact on BP as well as its component phenotypes.
To identify quantitative neurocognitive, temperament-related, and
neuroanatomic phenotypes that appear heritable and associated with severe
bipolar disorder (BP-I), and therefore suitable for genetic linkage and
association studies aimed at identifying variants contributing to BP-I.
Multi-generational pedigree study in two closely related, genetically
isolated populations: the Central Valley of Costa Rica (CVCR) and Antioquia, Colombia (ANT).
738 individuals, all from CVCR and ANT pedigrees, of whom 181 are
affected with BP-I.
MAIN OUTCOME MEASURE
Familial aggregation (heritability) and association with BP-I of 169
quantitative neurocognitive, temperament, magnetic resonance imaging (MRI)
and diffusion tensor imaging (DTI) phenotypes.
Seventy-five percent (126) of the phenotypes investigated were
significantly heritable, and 31% (53) were associated with BP-I.
About 1/4 of the phenotypes, including measures from each phenotype domain,
were both heritable and associated with BP-I. Neuroimaging phenotypes,
particularly cortical thickness in prefrontal and temporal regions, and
volume and microstructural integrity of the corpus callosum, represented the
most promising candidate traits for genetic mapping related to BP based on
strong heritability and association with disease. Analyses of phenotypic and
genetic covariation identified substantial correlations among the traits, at
least some of which share a common underlying genetic architecture.
CONCLUSIONS AND RELEVANCE
This is the most extensive investigation of BP-relevant component
phenotypes to date. Our results identify brain and behavioral quantitative
traits that appear to be genetically influenced and show a pattern of
BP-I-association within families that is consistent with expectations from
case-control studies. Together these phenotypes provide a basis for
identifying loci contributing to BP-I risk and for genetic dissection of the disorder.
Schizophrenia is a highly heritable disorder. Genetic risk is conferred by a large number of alleles, including common alleles of small effect that might be detected by genome-wide association studies. Here, we report a multi-stage schizophrenia genome-wide association study of up to 36,989 cases and 113,075 controls. We identify 128 independent associations spanning 108 conservatively defined loci that meet genome-wide significance, 83 of which have not been previously reported. Associations were enriched among genes expressed in brain providing biological plausibility for the findings. Many findings have the potential to provide entirely novel insights into aetiology, but associations at DRD2 and multiple genes involved in glutamatergic neurotransmission highlight molecules of known and potential therapeutic relevance to schizophrenia, and are consistent with leading pathophysiological hypotheses. Independent of genes expressed in brain, associations were enriched among genes expressed in tissues that play important roles in immunity, providing support for the hypothesized link between the immune system and schizophrenia.
To assess the utility of online patient self-report outcomes in a rare disease, we attempted to observe the effects of corticosteroids in delaying age at full-time wheelchair use in Duchenne muscular dystrophy (DMD) using data from 1,057 males from DuchenneConnect, an online registry. Data collected were compared to prior natural history data in regard to age at diagnosis, mutation spectrum, and age at loss of ambulation. Because registrants reported differences in steroid and other medication usage, as well as age and ambulation status, we could explore these data for correlations with age at loss of ambulation. Using multivariate analysis, current steroid usage was the most significant and largest independent predictor of improved wheelchair-free survival. Thus, these online self-report data were sufficient to retrospectively observe that current steroid use by patients with DMD is associated with a delay in loss of ambulation. Comparing commonly used steroid drugs, deflazacort prolonged ambulation longer than prednisone (median 14 years and 13 years, respectively). Further, use of Vitamin D and Coenzyme Q10, insurance status, and age at diagnosis after 4 years were also significant, but smaller, independent predictors of longer wheelchair-free survival. Nine other common supplements were also individually tested but had lower study power. This study demonstrates the utility of DuchenneConnect data to observe therapeutic differences, and highlights needs for improvement in quality and quantity of patient-report data, which may allow exploration of drug/therapeutic practice combinations impractical to study in clinical trial settings. Further, with the low barrier to participation, we anticipate substantial growth in the dataset in the coming years.
As the availability of cost-effective high-throughput sequencing technology increases, genetic research is beginning to focus on identifying the contributions of rare variants (RVs) to complex traits. Using RVs to detect associated genes requires statistical approaches that mitigate the lack of power with the analysis of single RVs. Here we report the development and application of an approach that aggregates and evaluates the transmissions of RVs in parent-child trios. An initial score that incorporates the distortion in transmission of the observed RVs from the parents to their offspring is calculated for each variant. The scores are analyzed using a support vector machine that handles these data by mapping the transmission distortion of the multiple RVs into a one-dimensional score in a nonlinear fashion when parent-child trios with affected and nonaffected children are contrasted. We refer to this approach as Trio-SVM. A total of 275 trios were available in the Genetic Analysis Workshop 18 data for analysis. Because of their nonindependence and the extended linkage disequilibrium (LD) within pedigrees, Trio-SVM was vulnerable to type I errors in detecting association. Using the GAW18 data with simulated trait values, Trio-SVM has an appropriate type I error, but it lacks power with a sample of 267 trios. Larger samples of 500 to 1000 trios, derived from combining the simulated data, provided sufficient power. Two chromosome 3 candidate genes were tested in the real GAW18 data with Trio-SVM, and they showed marginal associations with hypertension.
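The idea of scoring transmission distortion per variant can be illustrated with the classic transmission disequilibrium test (TDT) counts. This is a stand-in sketch, not the Trio-SVM score: the abstract does not specify how the initial score is computed, and the function names and toy trios below are hypothetical.

```python
def transmissions(father, mother, child):
    """Return (transmitted, untransmitted) minor-allele counts from heterozygous parents.

    Genotypes are minor-allele counts (0, 1 or 2).
    """
    n_het = (father == 1) + (mother == 1)
    n_hom_minor = (father == 2) + (mother == 2)
    # Each homozygous-minor parent necessarily transmits one minor allele;
    # whatever remains in the child came from heterozygous parents.
    t = child - n_hom_minor
    if not 0 <= t <= n_het:
        raise ValueError("Mendelian inconsistency in trio genotypes")
    return t, n_het - t


def tdt_statistic(trios):
    """Sum transmissions over trios and return (T, U, chi-square with 1 df)."""
    t_total = u_total = 0
    for father, mother, child in trios:
        t, u = transmissions(father, mother, child)
        t_total += t
        u_total += u
    if t_total + u_total == 0:
        return t_total, u_total, 0.0  # no informative (heterozygous) parents
    return t_total, u_total, (t_total - u_total) ** 2 / (t_total + u_total)


# Toy trios for one variant: (father, mother, affected child), made-up data.
trios = [(1, 0, 1), (0, 1, 1), (1, 0, 1), (1, 1, 2), (0, 1, 0), (0, 0, 0)]
T, U, chi2 = tdt_statistic(trios)
print(T, U, round(chi2, 3))
```

For a single rare variant the counts T and U are usually tiny, which is the lack of power that motivates aggregating many variants into one score.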
Genetic Analysis Workshop 18 provided a platform for developing and evaluating statistical methods to analyze whole-genome sequence data from a pedigree-based sample. In this article we present an overview of the data sets and the contributions that analyzed these data. The family data, donated by the Type 2 Diabetes Genetic Exploration by Next-Generation Sequencing in Ethnic Samples Consortium, included sequence-level genotypes based on sequencing and imputation, genome-wide association genotypes from prior genotyping arrays, and phenotypes from longitudinal assessments. The contributions from individual research groups were extensively discussed before, during, and after the workshop in theme-based discussion groups before being submitted for publication.
The Mexican population and others with Amerindian heritage exhibit a substantial predisposition to dyslipidemias and coronary heart disease. Yet, these populations remain underinvestigated by genomic studies, and to date, no genome-wide association (GWA) studies have been reported for lipids in these rapidly expanding populations.
Methods and Findings
We performed a two-stage GWA study for hypertriglyceridemia and low high-density lipoprotein cholesterol (HDL-C) in Mexicans (n = 4,361) and identified a novel Mexican-specific genome-wide significant locus for serum triglycerides (TGs) near the Niemann-Pick type C1 protein (NPC1) gene (P = 2.43×10⁻⁸). Furthermore, three European loci for TGs (APOA5, GCKR, and LPL) and four loci for HDL-C (ABCA1, CETP, LIPC and LOC55908) reached genome-wide significance in Mexicans. We utilized cross-ethnic mapping to narrow three European TG GWA loci, APOA5, MLXIPL, and CILP2, that were wide and contained multiple candidate variants in the European scan. At the APOA5 locus, this reduced the most likely susceptibility variants to one, rs964184. Importantly, our functional analysis demonstrated a direct link between rs964184 and postprandial serum apoAV protein levels, supporting rs964184 as the causative variant underlying the European and Mexican GWA signal. Overall, 52 of the 100 reported associations from European lipid GWA meta-analysis generalized to Mexicans. However, in 82 of the 100 European GWA loci, a variant other than the European lead/best-proxy variant had the strongest regional evidence of association in Mexicans.
This first Mexican GWA study of lipids identified a novel GWA locus for high TG levels; utilized the inter-population heterogeneity to significantly restrict three previously known European GWA signals; and surveyed whether the European lipid GWA SNPs extend to the Mexican population.
Given prior evidence for the contribution of rare copy number variations (CNVs) to autism spectrum disorders (ASD), we studied these events in 4,457 individuals from 1,174 simplex families, composed of parents, a proband and, in most kindreds, an unaffected sibling. We find significant association of ASD with de novo duplications of 7q11.23, where the reciprocal deletion causes Williams-Beuren syndrome, featuring a highly social personality. We identify rare recurrent de novo CNVs at five additional regions including two novel ASD loci, 16p13.2 (including the genes USP7 and C16orf72) and Cadherin13, and implement a rigorous new approach to evaluating the statistical significance of these observations. Overall, we find large de novo CNVs carry substantial risk (OR = 3.55; CI = 2.16–7.46; p = 6.9 × 10⁻⁶); estimate the presence of 130–234 distinct ASD-related CNV intervals across the genome; and, based on data from multiple studies, present compelling evidence for the association of rare de novo events at 7q11.23, 15q11.2-13.1, 16p11.2, and Neurexin1.
Autism spectrum disorders (ASDs) are male-biased and genetically heterogeneous. While sequencing of sporadic cases has identified de novo risk variants, the heritable genetic contribution and mechanisms driving the male bias are less understood. Here, we aimed to identify familial and sex-differential risk loci in the largest available, uniformly ascertained, densely genotyped sample of multiplex ASD families from the Autism Genetics Resource Exchange (AGRE), and to compare results with earlier findings from AGRE.
From a total sample of 1,008 multiplex families, we performed genome-wide, non-parametric linkage analysis in a discovery sample of 847 families, and separately on subsets of families with only male, affected children (male-only, MO) or with at least one female, affected child (female-containing, FC). Loci showing evidence for suggestive linkage (logarithm of odds ≥2.2) in this discovery sample, or in previous AGRE samples, were re-evaluated in an extension study utilizing all 1,008 available families. For regions with genome-wide significant linkage signal in the discovery stage, those families not included in the corresponding discovery sample were then evaluated for independent replication of linkage. Association testing of common single nucleotide polymorphisms (SNPs) was also performed within suggestive linkage regions.
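To give a sense of scale for the suggestive-linkage threshold, a LOD score can be converted to an approximate pointwise p-value. The sketch below assumes the standard large-sample relation in which 2·ln(10)·LOD follows a chi-square distribution with 1 degree of freedom and the test is one-sided; whether this approximation holds for the specific non-parametric statistic used in the study is an assumption, and the function name is hypothetical.

```python
import math


def lod_to_pvalue(lod):
    """Approximate one-sided pointwise p-value for a LOD score (1 df).

    Assumes 2 * ln(10) * LOD is asymptotically chi-square with 1 degree of freedom.
    """
    chi2 = 2.0 * math.log(10.0) * lod
    # The upper tail of a 1-df chi-square at x equals erfc(sqrt(x / 2));
    # halving gives the one-sided p-value.
    return 0.5 * math.erfc(math.sqrt(chi2 / 2.0))


p_suggestive = lod_to_pvalue(2.2)
print(f"LOD 2.2 corresponds to a pointwise p of about {p_suggestive:.1e}")
```

Under these assumptions a LOD of 2.2 corresponds to a pointwise p-value on the order of 7×10⁻⁴, which is far less stringent than a genome-wide significant signal.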
We observed an independent replication of previously observed linkage at chromosome 20p13 (P < 0.01), while loci at 6q27 and 8q13.2 showed suggestive linkage in our extended sample. Suggestive sex-differential linkage was observed at 1p31.3 (MO), 8p21.2 (FC), and 8p12 (FC) in our discovery sample, and the MO signal at 1p31.3 was supported in our expanded sample. No sex-differential signals met replication criteria, and no common SNPs were significantly associated with ASD within any identified linkage regions.
With few exceptions, analyses of subsets of families from the AGRE cohort identify different risk loci, consistent with extreme locus heterogeneity in ASD. Large samples appear to yield more consistent results, and sex-stratified analyses facilitate the identification of sex-differential risk loci, suggesting that linkage analyses in large cohorts are useful for identifying heritable risk loci. Additional work, such as targeted re-sequencing, is needed to identify the specific variants within these loci that are responsible for increasing ASD risk.
Male brain; Sex differences; Intermediate phenotype; Linkage analysis; Association; AGRE
The Enhancing NeuroImaging Genetics through Meta-Analysis (ENIGMA) Consortium is a collaborative network of researchers working together on a range of large-scale studies that integrate data from 70 institutions worldwide. Organized into Working Groups that tackle questions in neuroscience, genetics, and medicine, ENIGMA studies have analyzed neuroimaging data from over 12,826 subjects. In addition, data from 12,171 individuals were provided by the CHARGE consortium for replication of findings, in a total of 24,997 subjects. By meta-analyzing results from many sites, ENIGMA has detected factors that affect the brain that no individual site could detect on its own, and that require larger numbers of subjects than any individual neuroimaging study has currently collected. ENIGMA’s first project was a genome-wide association study identifying common variants in the genome associated with hippocampal volume or intracranial volume. Continuing work is exploring genetic associations with subcortical volumes (ENIGMA2) and white matter microstructure (ENIGMA-DTI). Working groups also focus on understanding how schizophrenia, bipolar illness, major depression and attention deficit/hyperactivity disorder (ADHD) affect the brain. We review the current progress of the ENIGMA Consortium, along with challenges and unexpected discoveries made on the way.
Genetics; MRI; GWAS; Consortium; Meta-analysis; Multi-site
Nonhuman primates (NHP) provide crucial biomedical model systems intermediate between rodents and humans. The vervet monkey (also called the African green monkey) is a widely used NHP model that has unique value for genetic and genomic investigations of traits relevant to human diseases. This article describes the phylogeny and population history of the vervet monkey and summarizes the use of both captive and wild vervet monkeys in biomedical research. It also discusses the effort of an international collaboration to develop the vervet monkey as the most comprehensively phenotypically and genomically characterized NHP, a process that will enable the scientific community to employ this model for systems biology investigations.
African green monkey; genetics; genomics; phenomics; simian immunodeficiency virus [SIV]; systems biology; transcriptomics; vervet
Immunoregulatory cytokine interleukin-10 (IL-10) is elevated in sera from patients with systemic lupus erythematosus (SLE) correlating with disease activity. The established association of IL10 with SLE and other autoimmune diseases led us to fine map causal variant(s) and to explore underlying mechanisms. We assessed 19 tag SNPs, covering the IL10 gene cluster including IL19, IL20 and IL24, for association with SLE in 15,533 case and control subjects from four ancestries. The previously reported IL10 variant, rs3024505 located 1 kb downstream of IL10, exhibited the strongest association signal and was confirmed for association with SLE in European American (EA) (P = 2.7×10⁻⁸, OR = 1.30), but not in non-EA ancestries. SNP imputation conducted in the EA dataset identified three additional SLE-associated SNPs tagged by rs3024505 (rs3122605, rs3024493 and rs3024495, located 9.2 kb upstream of IL10 and in introns 3 and 4 of IL10, respectively), and SLE-risk alleles of these SNPs were dose-dependently associated with elevated levels of IL10 mRNA in PBMCs and circulating IL-10 protein in SLE patients and controls. Using nuclear extracts of peripheral blood cells from SLE patients for electrophoretic mobility shift assays, we identified specific binding of transcription factor Elk-1 to oligodeoxynucleotides containing the risk (G) allele of rs3122605, suggesting rs3122605 as the most likely causal variant regulating IL10 expression. Elk-1 is known to be activated by phosphorylation and nuclear localization to induce transcription. Of interest, phosphorylated Elk-1 (p-Elk-1) detected only in nuclear extracts of SLE PBMCs appeared to increase with disease activity. Co-expression levels of p-Elk-1 and IL-10 were elevated in SLE T cells, B cells and monocytes, associated with increased disease activity in SLE B cells, and were best downregulated by an ERK inhibitor.
Taken together, our data suggest that preferential binding of activated Elk-1 to the IL10 rs3122605-G allele upregulates IL10 expression and confers increased risk for SLE in European Americans.
Systemic lupus erythematosus (SLE), a debilitating autoimmune disease characterized by the production of pathogenic autoantibodies, has a strong genetic basis. Variants of the IL10 gene, which encodes cytokine interleukin-10 (IL-10) with a known function of promoting B cell hyperactivity and autoantibody production, are associated with SLE and other autoimmune diseases, and serum IL-10 levels are elevated in SLE patients correlating with increased disease activity. In this study, to discover SLE-predisposing causal variant(s), we assessed variants within the genomic region containing IL10 and its gene family members IL19, IL20 and IL24 for association with SLE in case and control subjects from diverse ancestries. We identified SLE-associated SNP rs3122605 located 9.2 kb upstream of IL10 as the most likely causal variant in subjects of European ancestry. The SLE-risk allele of rs3122605 was dose-dependently associated with elevated IL10 expression at both mRNA and protein levels in peripheral blood samples from SLE patients and controls, which could be explained, at least in part, by its preferential binding to Elk-1, a transcription factor activated in B cells during active disease of SLE patients. Elk-1-mediated IL-10 overexpression could be downregulated by inhibiting activation of mitogen-activated protein kinases, suggesting a potential therapeutic target for SLE.
Non-human primates provide genetic model systems biologically intermediate between humans and other mammalian model organisms. Populations of Caribbean vervet monkeys (Chlorocebus aethiops sabaeus) are genetically homogeneous and large enough to permit well-powered genetic mapping studies of quantitative traits relevant to human health, including expression quantitative trait loci (eQTL). Previous transcriptome-wide investigation in an extended vervet pedigree identified 29 heritable transcripts for which levels of expression in peripheral blood correlate strongly with expression levels in the brain. Quantitative trait linkage analysis using 261 microsatellite markers identified significant (n = 8) and suggestive (n = 4) linkages for 12 of these transcripts, including both cis- and trans-eQTL. Seven transcripts, located on different chromosomes, showed maximum linkage to markers in a single region of vervet chromosome 9; this observation suggests the possibility of a master trans-regulator locus in this region. For one cis-eQTL (at B3GALTL, beta-1,3-glucosyltransferase), we conducted follow-up single nucleotide polymorphism genotyping and fine-scale association analysis in a sample of unrelated Caribbean vervets, localizing this eQTL to a region of <200 kb. These results suggest the value of pedigree and population samples of the Caribbean vervet for linkage and association mapping studies of quantitative traits. The imminent whole genome sequencing of many of these vervet samples will enhance the power of such investigations by providing a comprehensive catalog of genetic variation.
The identification of autism susceptibility genes has been hampered by phenotypic heterogeneity of autism, among other factors. However, the use of endophenotypes has shown preliminary success in reducing heterogeneity and identifying potential autism-related susceptibility regions. To further explore the utility of using language related endophenotypes, we performed linkage analysis on multiplex autism families stratified according to delayed expressive speech and also assessed the extent to which parental phenotype information would aid in identifying regions of linkage. A whole genome scan using a multipoint nonparametric linkage approach was performed in 133 families, stratifying the sample by phrase speech delay and word delay. None of the regions reached suggested genome-wide or replication significance thresholds. However, several loci on chromosomes 1, 2, 4, 6, 7, 8, 9, 10, 12, 15, and 19 yielded nominally higher linkage signals in the delayed groups. The results did not support reported linkage findings for loci on chromosomes 7 or 13 that were a result of stratification based on the language delay endophenotype. In addition, inclusion of information on parental history of language delay did not appreciably affect the linkage results. The nominal increase in NPL scores across several regions using language delay endophenotypes for stratification suggests that this strategy may be useful in attenuating heterogeneity. However, the inconsistencies in regions identified across studies highlight the importance of increasing sample sizes to provide adequate power to test replications in independent samples.
Autism; linkage; endophenotypes; language; AGRE
Identifying genetic variants influencing human brain structures may reveal new biological mechanisms underlying cognition and neuropsychiatric illness. The volume of the hippocampus is a biomarker of incipient Alzheimer’s disease [1,2] and is reduced in schizophrenia [3], major depression [4] and mesial temporal lobe epilepsy [5]. Whereas many brain imaging phenotypes are highly heritable [6,7], identifying and replicating genetic influences has been difficult, as small effects and the high costs of magnetic resonance imaging (MRI) have led to underpowered studies. Here we report genome-wide association meta-analyses and replication for mean bilateral hippocampal, total brain and intracranial volumes from a large multinational consortium. The intergenic variant rs7294919 was associated with hippocampal volume (12q24.22; N = 21,151; P = 6.70 × 10⁻¹⁶) and the expression levels of the positional candidate gene TESC in brain tissue. Additionally, rs10784502, located within HMGA2, was associated with intracranial volume (12q14.3; N = 15,782; P = 1.12 × 10⁻¹²). We also identified a suggestive association with total brain volume at rs10494373 within DDR2 (1q23.3; N = 6,500; P = 5.81 × 10⁻⁷).
Autism spectrum disorder (ASD) is a common, highly heritable neuro-developmental condition characterized by marked genetic heterogeneity [1–3]. Thus, a fundamental question is whether autism represents an etiologically heterogeneous disorder in which the myriad genetic or environmental risk factors perturb common underlying molecular pathways in the brain [4]. Here, we demonstrate consistent differences in transcriptome organization between autistic and normal brain by gene co-expression network analysis. Remarkably, regional patterns of gene expression that typically distinguish frontal and temporal cortex are significantly attenuated in the ASD brain, suggesting abnormalities in cortical patterning. We further identify discrete modules of co-expressed genes associated with autism: a neuronal module enriched for known autism susceptibility genes, including the neuronal specific splicing factor A2BP1/FOX1, and a module enriched for immune genes and glial markers. Using high-throughput RNA-sequencing we demonstrate dysregulated splicing of A2BP1-dependent alternative exons in ASD brain. Moreover, using a published autism GWAS dataset, we show that the neuronal module is enriched for genetically associated variants, providing independent support for the causal involvement of these genes in autism. In contrast, the immune-glial module showed no enrichment for autism GWAS signals, indicating a non-genetic etiology for this process. Collectively, our results provide strong evidence for convergent molecular abnormalities in ASD, and implicate transcriptional and splicing dysregulation as underlying mechanisms of neuronal dysfunction in this disorder.
The Xq28 region containing IRAK1 and MECP2 has been identified as a risk locus for systemic lupus erythematosus (SLE) in previous genetic association studies. However, due to the strong linkage disequilibrium between IRAK1 and MECP2, it remains unclear which gene is affected by the underlying causal variant(s) conferring risk of SLE.
We fine-mapped ≥136 SNPs in a ~227kb region on Xq28, containing IRAK1, MECP2 and 7 adjacent genes (L1CAM, AVPR2, ARHGAP4, NAA10, RENBP, HCFC1 and TMEM187), for association with SLE in 15,783 case-control subjects derived from 4 different ancestral groups.
Multiple SNPs showed strong association with SLE in European Americans, Asians and Hispanics at P < 5×10⁻⁸ with consistent association in subjects with African ancestry. Of these, 6 SNPs located in the TMEM187-IRAK1-MECP2 region captured the underlying causal variant(s) residing in a common risk haplotype shared by all 4 ancestral groups. Among them, rs1059702 best explained the Xq28 association signals in conditional testing and exhibited the strongest P value in trans-ancestral meta-analysis (Pmeta = 1.3×10⁻²⁷, OR = 1.43), and thus was considered to be the most-likely causal variant. The risk allele of rs1059702 results in the amino acid substitution S196F in IRAK1 and had previously been shown to increase NF-κB activity in vitro. We also found that the homozygous risk genotype of rs1059702 was associated with lower mRNA levels of MECP2, but not IRAK1, in SLE patients (P = 0.0012) and healthy controls (P = 0.0064).
These data suggest contributions of both IRAK1 and MECP2 to SLE susceptibility.
Systemic Lupus Erythematosus; Gene Polymorphism; Xq28; IRAK1; MECP2
We previously reported that the G allele of rs3853839 at the 3′ untranslated region (UTR) of Toll-like receptor 7 (TLR7) was associated with elevated transcript expression and increased risk for systemic lupus erythematosus (SLE) in 9,274 Eastern Asians [P = 6.5×10⁻¹⁰, odds ratio (OR) (95%CI) = 1.27 (1.17–1.36)]. Here, we conducted trans-ancestral fine-mapping in 13,339 subjects including European Americans, African Americans, and Amerindian/Hispanics and confirmed rs3853839 as the only variant within the TLR7-TLR8 region exhibiting consistent and independent association with SLE (Pmeta = 7.5×10⁻¹¹, OR = 1.24 [1.18–1.34]). The risk G allele was associated with significantly increased levels of TLR7 mRNA and protein in peripheral blood mononuclear cells (PBMCs) and elevated luciferase activity of a reporter gene in transfected cells. TLR7 3′UTR sequence bearing the non-risk C allele of rs3853839 matches a predicted binding site of microRNA-3148 (miR-3148), suggesting that this microRNA may regulate TLR7 expression. Indeed, miR-3148 levels were inversely correlated with TLR7 transcript levels in PBMCs from SLE patients and controls (R² = 0.255, P = 0.001). Overexpression of miR-3148 in HEK-293 cells led to a significant dose-dependent decrease in luciferase activity for a construct driven by the TLR7 3′UTR segment bearing the C allele (P = 0.0003). Compared with the G-allele construct, the C-allele construct showed greater than two-fold reduction of luciferase activity in the presence of miR-3148. Reduced modulation by miR-3148 conferred slower degradation of the risk G-allele containing TLR7 transcripts, resulting in elevated levels of gene products. These data establish rs3853839 of TLR7 as a shared risk variant of SLE in 22,613 subjects of Asian, EA, AA, and Amerindian/Hispanic ancestries (Pmeta = 2.0×10⁻¹⁹, OR = 1.25 [1.20–1.32]), which confers allelic effect on transcript turnover via differential binding to the epigenetic factor miR-3148.
Systemic lupus erythematosus (SLE) is a debilitating autoimmune disease driven in part by excessive innate immune activation involving toll-like receptors (TLRs, particularly TLR7/8/9) and type I interferon (IFN) signaling pathways. TLR7 responds to RNA-containing nuclear antigens and activates the IFN-α pathway, playing a pivotal role in the development of SLE. While a genomic duplication of Tlr7 promotes lupus-like disease in the Y-linked autoimmune accelerator (Yaa) murine model, the lack of common copy number variations at TLR7 in humans led us to identify a functional single nucleotide polymorphism (SNP), rs3853839 at the 3′ UTR of the TLR7 gene, associated with SLE susceptibility in Eastern Asians. In this study, we fine-mapped the TLR7-TLR8 region and confirmed that rs3853839 exhibits the strongest association with SLE in European Americans, African Americans, and Amerindian/Hispanics. Individuals carrying the risk G allele of rs3853839 exhibited increased TLR7 expression at both the mRNA and protein levels and decreased transcript degradation. MicroRNA-3148 (miR-3148) preferentially downregulated the expression of transcripts containing the non-risk allele (C), suggesting a likely mechanism for increased TLR7 levels in risk-allele carriers. This trans-ancestral mapping provides evidence for the global association with SLE risk at rs3853839, which resides in a microRNA–gene regulatory site affecting TLR7 expression.
Synaptic homeostasis is a biological process whose disruption might predispose children to autism spectrum disorders (ASD). Calcium channel genes (CCG) contribute to modulating neuronal function, and evidence implicating CCG in ASD has been accumulating. To explore this hypothesis, we conducted a targeted association analysis of CCG using existing genome-wide association study (GWAS) data and imputation methods in a combined sample of parent/affected child trios from two ASD family collections.
A total of 2,176 single-nucleotide polymorphisms (SNPs) (703 genotyped and 1,473 imputed) covering the genes that encode the α1 subunit proteins of 10 calcium channels were tested for association with ASD in a combined sample of 2,781 parent/affected child trios from 543 multiplex Caucasian ASD families from the Autism Genetics Resource Exchange (AGRE) and 1,651 multiplex and simplex Caucasian ASD families from the Autism Genome Project (AGP). SNP imputation using IMPUTE2 and a combined reference panel from HapMap3 and the 1000 Genomes Project increased coverage density of the CCG. Family-based association was tested using the FBAT software, which controls for population stratification and accounts for the non-independence of siblings within multiplex families. The level of significance for association was set at 2.3×10−5, providing a Bonferroni correction for this targeted 10-gene panel.
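The significance level quoted above is consistent with a Bonferroni correction over the number of SNPs tested. The following is a minimal sketch of that arithmetic, assuming a family-wise alpha of 0.05 (the abstract does not state the alpha used); the function name is illustrative only.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance level that keeps the family-wise error rate at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# 703 genotyped + 1,473 imputed SNPs, as reported in the text.
n_snps = 703 + 1473

# ASSUMPTION: a family-wise alpha of 0.05; this value is not given in the source.
threshold = bonferroni_threshold(0.05, n_snps)
print(f"{n_snps} SNPs -> per-SNP threshold {threshold:.1e}")
```

Under that assumed alpha, the per-SNP threshold rounds to the value reported in the abstract.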
Four SNPs in three CCGs were associated with ASD. One, rs10848653, is located in CACNA1C, a gene in which rare de novo mutations are responsible for Timothy syndrome, a Mendelian disorder that features ASD. Two others, rs198538 and rs198545, located in CACNA1G, and a fourth, rs5750860, located in CACNA1I, are in CCGs that encode T-type calcium channels, genes with previous ASD associations.
These associations support a role for common CCG SNPs in ASD.
Autism spectrum disorders; Calcium channel genes; Common variants; Imputed SNPs; Association studies
A high serum triglyceride (TG) level is an established risk factor for coronary heart disease (CHD). Fat is stored in the form of TGs in human adipose tissue. We hypothesized that gene co-expression networks in human adipose tissue may be correlated with serum TG levels and help reveal novel genes involved in TG regulation.
Gene co-expression networks were constructed from two Finnish and one Mexican study sample using the blockwiseModules R function in Weighted Gene Co-expression Network Analysis (WGCNA). Overlap between TG-associated networks from each of the three study samples was calculated using Fisher’s exact test. Gene ontology was used to determine known pathways enriched in each TG-associated network.
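To illustrate how the overlap between two gene modules can be assessed with Fisher's exact test, the sketch below computes a one-sided p-value from the hypergeometric distribution using only the Python standard library. It is not the authors' R workflow: the one-sided alternative, the function name, and all counts in the example are assumptions chosen for illustration.

```python
from math import comb


def overlap_p_value(universe: int, size_a: int, size_b: int, shared: int) -> float:
    """One-sided Fisher's exact test for gene-set overlap.

    Returns the probability of seeing at least `shared` common genes when
    sets of size `size_a` and `size_b` are drawn at random from `universe`
    genes (upper tail of the hypergeometric distribution).
    """
    if min(universe, size_a, size_b, shared) < 0:
        raise ValueError("counts must be non-negative")
    if max(size_a, size_b) > universe or shared > min(size_a, size_b):
        raise ValueError("inconsistent set sizes")
    # Sum exact integer counts first, then divide once to limit rounding error.
    favourable = sum(
        comb(size_a, k) * comb(universe - size_a, size_b - k)
        for k in range(shared, min(size_a, size_b) + 1)
    )
    return favourable / comb(universe, size_b)


# HYPOTHETICAL counts: 1,000 expressed genes, modules of 50 and 40 genes,
# 10 genes in common. None of these numbers come from the study.
p = overlap_p_value(universe=1000, size_a=50, size_b=40, shared=10)
print(f"overlap p-value: {p:.3g}")
```

The gene universe should be the set of genes that could have entered either module (for example, all genes passing expression filters), since the p-value is sensitive to that choice.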
We measured gene expression in adipose samples from two Finnish and one Mexican study sample. In each study sample, we observed a gene co-expression network that was significantly associated with serum TG levels. The TG modules observed in Finns and Mexicans significantly overlapped and shared 34 genes. Seven of the 34 genes (ARHGAP30, CCR1, CXCL16, FERMT3, HCST, RNASET2, SELPG) were identified as the key hub genes of all three TG modules. Furthermore, two of the 34 genes (ARHGAP9, LST1) reside in previous TG GWAS regions, suggesting them as the regional candidates underlying the GWAS signals.
This study presents a novel adipose gene co-expression network with 34 genes significantly correlated with serum TG across populations.
Mexicans; Finns; RNA sequencing; Triglycerides; Adipose tissue; Weighted gene co-expression network analysis
While it is apparent that rare variation can play an important role in the genetic architecture of autism spectrum disorders (ASDs), the contribution of common variation to the risk of developing ASD is less clear. To produce a more comprehensive picture, we report Stage 2 of the Autism Genome Project genome-wide association study, adding 1301 ASD families and bringing the total to 2705 families analysed (Stages 1 and 2). In addition to evaluating the association of individual single nucleotide polymorphisms (SNPs), we also sought evidence that common variants, en masse, might affect the risk. Despite genotyping over a million SNPs covering the genome, no single SNP shows significant association with ASD or selected phenotypes at a genome-wide level. The SNP that achieves the smallest P-value from secondary analyses is rs1718101. It falls in CNTNAP2, a gene previously implicated in susceptibility for ASD. This SNP also shows modest association with age of word/phrase acquisition in ASD subjects, of interest because features of language development are also associated with other variation in CNTNAP2. In contrast, allele scores derived from the transmission of common alleles to Stage 1 cases significantly predict case status in the independent Stage 2 sample. Despite being significant, the variance explained by these allele scores was small (Vm< 1%). Based on results from individual SNPs and their en masse effect on risk, as inferred from the allele score results, it is reasonable to conclude that common variants affect the risk for ASD but their individual effects are modest.
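The en masse analysis rests on an allele score, which collapses many small per-SNP effects into one number per individual. The sketch below shows one common form of such a score, assuming weights equal to the log odds ratio of each risk allele; the abstract derives its scores from allele transmission in Stage 1 and does not specify a weighting, so the weighting, function name, and inputs here are all hypothetical.

```python
from math import log


def allele_score(risk_allele_counts, odds_ratios):
    """Weighted allele score: sum over SNPs of log(OR) times risk-allele count.

    risk_allele_counts: per-SNP counts (0, 1 or 2) for one individual.
    odds_ratios: per-SNP effect estimates from a discovery sample (ASSUMED weights).
    """
    if len(risk_allele_counts) != len(odds_ratios):
        raise ValueError("one odds ratio is required per SNP")
    return sum(count * log(oratio) for count, oratio in zip(risk_allele_counts, odds_ratios))


# HYPOTHETICAL individual genotyped at three SNPs with illustrative odds ratios.
score = allele_score([0, 1, 2], [1.05, 1.10, 0.95])
print(f"allele score: {score:.4f}")
```

In a two-stage design, the weights would be estimated in the first sample and the resulting scores tested for association with case status in the independent second sample.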
The molecular mechanisms underlying the changes in the nigrostriatal pathway in Parkinson’s disease (PD) are not completely understood. Here, we use mass spectrometry and microarrays to study the proteomic and transcriptomic changes in the striatum of two mouse models of PD, induced by the distinct neurotoxins 1-methyl-4-phenyl-1,2,3,6-tetrahydropyridine (MPTP) and methamphetamine (METH). Proteomic analyses resulted in the identification and relative quantification of 912 proteins with two or more unique peptides and 86 proteins with significant abundance changes following neurotoxin treatment. Similarly, microarray analyses revealed 181 genes with significant changes in mRNA following neurotoxin treatment. The combined protein and gene list provides a clearer picture of the potential mechanisms underlying neurodegeneration observed in PD. Functional analysis of this combined list revealed a number of significant categories, including mitochondrial dysfunction, oxidative stress response, and apoptosis. These results constitute one of the largest descriptive data sets integrating protein and transcript changes for these neurotoxin models, which have many similar end point phenotypes but distinct mechanisms.
Parkinson’s disease; transcriptomics; proteomics; codon usage; miRNA; mouse model
This study was undertaken to determine whether there is familial aggregation of Hyperemesis Gravidarum, which would make it a disease amenable to genetic study.
Cases (women with severe nausea and vomiting in a singleton pregnancy that was treated with intravenous hydration) and unaffected friend controls completed a survey regarding family history.
Sisters of women with Hyperemesis Gravidarum have a significantly increased risk of having Hyperemesis Gravidarum themselves (OR=17.3, p=0.005). Cases have a significantly increased risk of having a mother with severe nausea and vomiting; 33% of cases reported an affected mother compared to 7.7% of controls (p<0.0001). Cases reported a similar frequency of affected second-degree maternal and paternal relatives (18% maternal lineage, 23% paternal lineage).
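For readers unfamiliar with how an odds ratio such as the one above is obtained from case-control survey counts, the sketch below computes an odds ratio and an approximate 95% confidence interval (Woolf's log method) from a 2x2 table. The abstract does not report the underlying counts or the interval method, so the counts and the choice of method are purely illustrative.

```python
from math import exp, log, sqrt


def odds_ratio(case_exposed, case_unexposed, control_exposed, control_unexposed, z=1.96):
    """Odds ratio with an approximate confidence interval (Woolf's log method).

    "Exposed" here means reporting an affected relative. All four cells must
    be positive; z=1.96 gives an approximate 95% interval.
    """
    cells = (case_exposed, case_unexposed, control_exposed, control_unexposed)
    if min(cells) <= 0:
        raise ValueError("all cells must be positive for this estimator")
    estimate = (case_exposed * control_unexposed) / (case_unexposed * control_exposed)
    std_err = sqrt(sum(1.0 / cell for cell in cells))
    lower = exp(log(estimate) - z * std_err)
    upper = exp(log(estimate) + z * std_err)
    return estimate, lower, upper


# HYPOTHETICAL counts, not taken from the study.
estimate, lower, upper = odds_ratio(10, 20, 5, 40)
print(f"OR = {estimate:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

With small cell counts, an exact method is usually preferred over this normal approximation.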
There is familial aggregation of Hyperemesis Gravidarum. This study provides strong evidence for a genetic component to Hyperemesis Gravidarum. Identification of the predisposing gene(s) may determine the cause of this poorly understood disease of pregnancy.
Familial Aggregation; Genetic; Hyperemesis Gravidarum; Nausea; Pregnancy
We summarize the work done by the contributors to Group 13 at Genetic Analysis Workshop 17 (GAW17) and provide a synthesis of their data analyses. The Group 13 contributors used a variety of approaches to test associations of both rare variants and common single-nucleotide polymorphisms (SNPs) with the GAW17 simulated traits, implementing analytic methods that incorporate multiallelic genotypes and haplotypes. In addition to using a wide variety of statistical methods and approaches, the contributors exhibited a remarkable amount of flexibility and creativity in coding the variants and their genes and in evaluating their proposed approaches and methods. We describe and contrast their methods along three dimensions: (1) selection and coding of genetic entities for analysis, (2) method of analysis, and (3) evaluation of the results. The contributors consistently presented a strong rationale for using multiallelic analytic approaches. They indicated that power was likely to be increased by capturing the signals of multiple markers within genetic entities defined by sliding windows, haplotypes, genes, functional pathways, and the entire set of SNPs and rare variants taken in aggregate. Despite this variability, the methods were fairly consistent in their ability to identify two associated genes for each simulated trait. The first gene was selected for the largest number of causal alleles and the second for a high-frequency causal SNP. The presumed model of inheritance and choice of genetic entities are likely to have a strong effect on the outcomes of the analyses.
rare variants; sequence data; multiallelic data; Bayesian regression; penalized regression; tree-based clustering; pathway analysis; haplotypes