Measuring allelic RNA expression ratios is a powerful approach for detecting cis-acting regulatory variants, RNA editing, loss of heterozygosity in cancer, copy number variation, and allele-specific epigenetic gene silencing. Whole transcriptome RNA sequencing (RNA-Seq) has emerged as a genome-wide tool for identifying allelic expression imbalance (AEI), but numerous factors bias allelic RNA ratio measurements. Here, we compare RNA-Seq allelic ratios measured in nine different human brain regions with a highly sensitive and accurate SNaPshot measure of allelic RNA ratios, identifying factors affecting reliable allelic ratio measurement. Accounting for these factors, we subsequently surveyed the variability of RNA editing across brain regions and across individuals.
We find that RNA-Seq allelic ratios from standard alignment methods correlate poorly with SNaPshot, but applying alternative alignment strategies and correcting for observed biases significantly improves correlations. Deploying these methods on a transcriptome-wide basis in nine brain regions from a single individual, we identified genes with AEI across all regions (SLC1A3, NHP2L1) and many others with region-specific AEI. In dorsolateral prefrontal cortex (DLPFC) tissues from 14 individuals, we found evidence for frequent regulatory variants affecting RNA expression in tens to hundreds of genes, depending on stringency for assigning AEI. Further, we find that the extent and variability of RNA editing are similar across brain regions and across individuals.
These results identify critical factors affecting allelic ratios measured by RNA-Seq and provide a foundation for using this technology to screen allelic RNA expression on a transcriptome-wide basis. Using this technology as a screening tool reveals tens to hundreds of genes harboring frequent functional variants affecting RNA expression in the human brain. With respect to RNA editing, the similarities within and between individuals lead us to conclude that this post-transcriptional process is under heavy regulatory influence to maintain an optimal degree of editing for normal biological function.
RNA-Seq; Whole transcriptome; Allele expression; mRNA expression; Functional genetics; Regulatory polymorphism; eQTL; Read alignment; Next generation sequencing; Bioinformatics
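As a minimal sketch of how an allelic ratio at a single heterozygous SNP might be screened for imbalance, the following computes the reference-allele fraction from read counts and a two-sided exact binomial p-value against an expected ratio. The function names, the read counts in the usage lines, and the `expected` parameter (which could be moved away from 0.5 to reflect a known alignment bias) are illustrative assumptions, not the pipeline used in the study.

```python
from math import exp, lgamma, log


def allelic_ratio(ref_reads: int, alt_reads: int) -> float:
    """Fraction of reads at a heterozygous SNP that carry the reference allele."""
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("no reads cover this SNP")
    return ref_reads / total


def binomial_two_sided_p(ref_reads: int, alt_reads: int, expected: float = 0.5) -> float:
    """Two-sided exact binomial p-value for the observed reference read count.

    `expected` is the reference-allele fraction under no imbalance; 0.5 is an
    assumption and ignores mapping bias toward the reference allele.
    """
    if not 0.0 < expected < 1.0:
        raise ValueError("expected must lie strictly between 0 and 1")
    n = ref_reads + alt_reads

    def pmf(k: int) -> float:
        # Binomial probability computed in log space to stay stable at high depth.
        log_p = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
                 + k * log(expected) + (n - k) * log(1.0 - expected))
        return exp(log_p)

    observed = pmf(ref_reads)
    # Sum the probability of every outcome at least as unlikely as the observed one.
    total = sum(p for p in (pmf(k) for k in range(n + 1)) if p <= observed * (1 + 1e-7))
    return min(1.0, total)


# Hypothetical read counts, for illustration only.
print(allelic_ratio(30, 10))
print(binomial_two_sided_p(30, 10))
```

A per-SNP test like this does not by itself address the biases discussed above (alignment strategy, sequencing depth), so it is best read as a first-pass filter rather than a complete AEI call.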
The dopamine receptor D2 (encoded by DRD2) is implicated in susceptibility to mental disorders and cocaine abuse, but mechanisms responsible for this relationship remain uncertain. DRD2 mRNA exists in two main splice isoforms with distinct functions: D2 long (D2L) and D2 short (D2S, lacking exon 6), expressed mainly postsynaptically and presynaptically, respectively. Two intronic single-nucleotide polymorphisms (SNPs), rs2283265 (intron 5) and rs1076560 (intron 6), in high linkage disequilibrium (LD) with each other, have been reported to alter D2S/D2L splicing and several behavioral traits in human subjects, such as memory processing. To assess the role of DRD2 variants in cocaine abuse, we measured levels of D2S and D2L mRNA in human brain autopsy tissues (prefrontal cortex and putamen) obtained from cocaine abusers and controls, and genotyped a panel of DRD2 SNPs (119 abusers and 95 controls). Robust effects of rs2283265 and rs1076560 on reducing formation of D2S relative to D2L were confirmed. The minor alleles of rs2283265/rs1076560 were considerably more frequent in Caucasians (18%) than in African Americans (7%). Also, in Caucasians, rs2283265/rs1076560 minor alleles were significantly overrepresented in cocaine abusers compared with controls (rs2283265: 25% vs 9%, respectively; p=0.001; OR=3.4 (1.7–7.1)). Several SNPs previously implicated in diverse clinical association studies are in high LD with rs2283265/rs1076560 and could have served as surrogate markers. Our results confirm the role of rs2283265/rs1076560 in D2 alternative splicing and support a strong role in susceptibility to cocaine abuse.
alternative splicing; cocaine; dopamine; DRD2; D2S; human; addiction and substance abuse; neurogenetics; psychostimulants
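The allelic odds ratio reported above for rs2283265 in Caucasians can be approximately reproduced from the two minor allele frequencies alone (25% in abusers, 9% in controls). The sketch below assumes a simple allelic (2 × 2) comparison; the function name is illustrative, and because the published frequencies are rounded, the result is only expected to match the reported value to about one decimal place.

```python
def odds_ratio_from_frequencies(freq_cases: float, freq_controls: float) -> float:
    """Allelic odds ratio from the minor allele frequency in cases and in controls."""
    for freq in (freq_cases, freq_controls):
        if not 0.0 < freq < 1.0:
            raise ValueError("allele frequencies must lie strictly between 0 and 1")
    odds_cases = freq_cases / (1.0 - freq_cases)          # minor vs major allele odds in cases
    odds_controls = freq_controls / (1.0 - freq_controls)  # same odds in controls
    return odds_cases / odds_controls


# Minor allele frequencies quoted in the abstract (rounded values).
print(round(odds_ratio_from_frequencies(0.25, 0.09), 1))
```

With these inputs the ratio is (0.25/0.75)/(0.09/0.91), which rounds to 3.4, in line with the reported OR. The confidence interval cannot be recovered from frequencies alone because it depends on the allele counts.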
CHRNA5, encoding the nicotinic α5 subunit, is implicated in multiple disorders, including nicotine addiction and lung cancer. Previous studies demonstrate significant associations between promoter polymorphisms and CHRNA5 mRNA expression, but the responsible sequence variants remain uncertain. To search for cis-regulatory variants, we measured allele-specific mRNA expression of CHRNA5 in human prefrontal cortex autopsy tissues and scanned the CHRNA5 locus for regulatory variants. A cluster of six frequent single nucleotide polymorphisms (rs1979905, rs1979906, rs1979907, rs880395, rs905740, and rs7164030), in complete linkage disequilibrium, fully account for a >2.5-fold allelic expression difference and a fourfold increase in overall CHRNA5 mRNA expression. This proposed enhancer region resides more than 13 kilobases upstream of the CHRNA5 transcription start site. The same upstream variants failed to affect CHRNA5 mRNA expression in peripheral blood lymphocytes, indicating tissue-specific gene regulation. Other promoter polymorphisms were also correlated with overall CHRNA5 mRNA expression in the brain, but were inconsistent with allelic mRNA expression ratios, a robust and proximate measure of cis-regulatory variants. The enhancer region and the nonsynonymous polymorphism rs16969968 generate three main haplotypes that alter the risk of developing nicotine dependence. Ethnic differences in linkage disequilibrium across the CHRNA5 locus require consideration of the upstream enhancer variants when testing clinical associations.
Nicotinic receptor; alpha5 subunit; gene expression; nicotine dependence; lung cancer; enhancer
Several studies suggest that prenatal stress is a possible risk factor in the development of autism spectrum disorders. However, many children exposed to stress prenatally are born healthy and develop typically, suggesting that other factors must contribute to autism. Genes that contribute to stress reactivity may, therefore, exacerbate prenatal stress-mediated behavioral changes in the adult offspring. One candidate gene linked to increased stress reactivity encodes the serotonin transporter. Specifically, an insertion/deletion (long/short allele) polymorphism upstream of the serotonin transporter gene correlates with differential expression and function of the serotonin transporter and a heightened response to stressors. Heterozygous serotonin transporter knockout mice show reductions in serotonin transporter expression similar to the human short polymorphism. In this study, the roles of prenatal stress and maternal serotonin transporter genotype were assessed in mice to determine whether their combined effect produces reductions in social behavior in the adult offspring. Pregnant serotonin transporter heterozygous knockout and wild-type dams were either placed in a control condition or subjected to chronic variable stress. The adult offspring were subsequently assessed for social interaction and anxiety using a 3-chamber social approach task, ultrasonic vocalization detection, an elevated-plus maze and an open field task. Results indicated that prenatal stress and reduced serotonin transporter expression of the dam may have the combined effect of producing changes in social interaction and social interest in the offspring consistent with those observed in autism spectrum disorder. These data indicate a possible combined effect of maternal serotonin transporter genotype and prenatal stress contributing to the production of autistic-like behaviors in offspring.
animal model; autism; knockout mouse; stress; prenatal stress; serotonin transporter
Interactions between presynaptic and postsynaptic cellular adhesion molecules (CAMs) drive synapse maturation during development. These trans-synaptic interactions are regulated by alternative splicing of CAM RNAs, which ultimately determines neurotransmitter phenotype. The diverse assortment of RNAs produced by alternative splicing generates countless protein isoforms necessary for guiding specialized cell-to-cell connectivity. Failure to generate the appropriate synaptic adhesion proteins is associated with disrupted glutamatergic and gamma-aminobutyric acid signaling, resulting in loss of activity-dependent neuronal plasticity, and risk for developmental disorders, including autism. While the majority of genetic mutations currently linked to autism are rare variants that change the protein-coding sequence of synaptic candidate genes, regulatory polymorphisms affecting constitutive and alternative splicing have emerged as risk factors in numerous other diseases, accounting for an estimated 40–60% of general disease risk. Here, we review the relationship between aberrant RNA splicing of synapse-related genes and autism spectrum disorders.
autism spectrum disorder; synaptogenesis; alternative RNA splicing; cellular adhesion molecules; neurexin; neuroligin; gene expression; neural development
The prevalence of obesity in children and adults in the United States has increased dramatically over the past decade. Besides environmental factors, genetic factors are known to play an important role in the pathogenesis of obesity. A number of genetic determinants of adult BMI have already been established through genome-wide association studies. In this study, we examined 25 single nucleotide polymorphisms (SNPs) corresponding to thirteen previously reported genomic loci in 6,078 children with measures of BMI. Fifteen of these SNPs yielded at least nominally significant association with BMI, representing nine different loci: INSIG2, FTO, MC4R, TMEM18, GNPDA2, NEGR1, BDNF, KCTD15 and 1q25. The other loci, namely MTCH2, SH2B1, 12q13 and 3q27, revealed no evidence for association. For the 15 associated variants, the genotype score explained 1.12% of the total variation in BMI z-score. We conclude that among thirteen loci that have been reported to associate with adult BMI, at least nine also contribute to the determination of BMI in childhood, as demonstrated by their associations in our pediatric cohort.
Recently, a modest but consistently replicated association was demonstrated between obesity and the single nucleotide polymorphism (SNP) rs17782313, 3’ of the MC4R locus, as a consequence of a meta-analysis of genome-wide association (GWA) studies of the disease in Caucasian populations. We investigated the association in the context of the childhood form of the disease utilizing data from our ongoing GWA study in a cohort of 728 European American (EA) obese children (BMI ≥ 95th percentile) and 3,960 EA controls (BMI < 95th percentile), as well as 1,008 African American (AA) obese children and 2,715 AA controls. rs571312, rs10871777 and rs476828 (perfect surrogates for rs17782313) yielded odds ratios in the EA cohort of 1.142 (P = 0.045), 1.137 (P = 0.054) and 1.145 (P = 0.042), respectively; however, there was no significant association with these SNPs in the AA cohort. When investigating all thirty SNPs present on the Illumina BeadChip at this locus, again there was no evidence for association in AA cases when correcting for the number of tests employed. As such, variants 3’ to the MC4R locus present on the genotyping platform utilized confer a similar magnitude of risk of obesity in Caucasian children as in their adult Caucasian counterparts, but this observation did not extend to African Americans.
OBJECTIVE—Two recent genome-wide association (GWA) studies have revealed novel loci for type 1 diabetes, a common multifactorial disease with a strong genetic component. To fully utilize the GWA data that we had obtained by genotyping 563 type 1 diabetes probands and 1,146 control subjects, as well as 483 case subject–parent trios, using the Illumina HumanHap550 BeadChip, we designed a full stage 2 study to capture other possible association signals.
RESEARCH DESIGN AND METHODS—From our existing datasets, we selected 982 markers with P < 0.05 in both GWA cohorts. Genotyping these in an independent set of 636 nuclear families with 974 affected offspring revealed 75 markers that also had P < 0.05 in this third cohort. Among these, six single nucleotide polymorphisms in five novel loci also had P < 0.05 in the Wellcome Trust Case-Control Consortium dataset and were further tested in 1,303 type 1 diabetes probands from the Diabetes Control and Complications Trial/Epidemiology of Diabetes Interventions and Complications (DCCT/EDIC) plus 1,673 control subjects.
RESULTS—Two markers (rs9976767 and rs3757247) remained significant after adjusting for the number of tests in this last cohort; they reside in UBASH3A (OR 1.16; combined P = 2.33 × 10⁻⁸) and BACH2 (OR 1.13; combined P = 1.25 × 10⁻⁶).
CONCLUSIONS—Evaluation of a large number of statistical GWA candidates in several independent cohorts has revealed additional loci that are associated with type 1 diabetes. The two genes at these respective loci, UBASH3A and BACH2, are both biologically relevant to autoimmunity.
Recently, an association was demonstrated between the single nucleotide polymorphism (SNP) rs10516487, within the B-cell gene BANK1, and systemic lupus erythematosus (SLE) as a consequence of a genome-wide association study of this disease in European and Argentinean populations. In a bid for replication, we examined the effects of the R61H non-synonymous variant with respect to SLE in our genotyped American cohorts of European and African ancestry. Utilizing data from our ongoing genome-wide association study in our cohort of 178 Caucasian SLE cases and 1808 Caucasian population-based controls plus 148 African American (AA) SLE cases and 1894 AA population-based controls, we investigated the association of the previously described non-synonymous SNP at the BANK1 locus with the disease in the two ethnicities separately. The minor allele frequency (MAF) of rs10516487 was 22.6% in the Caucasian cases and 31.2% in Caucasian controls, yielding a protective odds ratio (OR) of 0.64 (95% CI 0.49–0.85; one-sided p = 7.07 × 10⁻⁴, Fisher’s exact test). Furthermore, the MAF of rs10516487 was 18.7% in the AA cases and 23.3% in AA controls, yielding a protective OR of 0.75 (95% CI 0.55–1.034; one-sided p = 0.039). The OR of the BANK1 variant in our study cohorts is highly comparable with that reported previously in a South American/European SLE case-control cohort (OR = 0.72). As such, R61H in the BANK1 gene confers a similar magnitude of SLE protection, not only in European Americans, but also in African Americans.
systemic lupus erythematosus; African Americans; European Americans; BANK1 gene
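To illustrate how an odds ratio and its confidence interval follow from allele counts, the sketch below uses the Woolf (log-odds) normal approximation. This is a swapped-in, simpler technique than the Fisher's exact test named in the abstract, so it will not reproduce the published interval exactly. The allele counts are not reported in the abstract: they are approximations reconstructed here from the stated sample sizes and MAFs, assuming every individual was successfully genotyped (two alleles each). All function names are illustrative.

```python
from math import exp, log, sqrt


def allele_counts(n_individuals: int, maf: float) -> tuple[int, int]:
    """Approximate (minor, major) allele counts, assuming complete genotyping."""
    total = 2 * n_individuals
    minor = round(total * maf)
    return minor, total - minor


def odds_ratio_with_ci(case_minor: int, case_major: int,
                       control_minor: int, control_major: int,
                       z: float = 1.96) -> tuple[float, float, float]:
    """Allelic odds ratio with a Woolf (log-scale normal approximation) interval."""
    cells = (case_minor, case_major, control_minor, control_major)
    if any(c <= 0 for c in cells):
        raise ValueError("all four cell counts must be positive")
    estimate = (case_minor * control_major) / (case_major * control_minor)
    # Standard error of log(OR): square root of the summed reciprocal cell counts.
    se = sqrt(sum(1.0 / c for c in cells))
    lower = exp(log(estimate) - z * se)
    upper = exp(log(estimate) + z * se)
    return estimate, lower, upper


# Reconstructed, approximate counts for the Caucasian cohort (an assumption, see above).
case_minor, case_major = allele_counts(178, 0.226)
control_minor, control_major = allele_counts(1808, 0.312)
print(odds_ratio_with_ci(case_minor, case_major, control_minor, control_major))
```

Under these assumptions the point estimate comes out at about 0.64 with an interval that excludes 1, consistent in direction and magnitude with the protective effect described above.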
Inflammatory bowel disease (IBD) is a common inflammatory disorder with complex etiology that involves both genetic and environmental triggers, including but not limited to defects in bacterial clearance, defective mucosal barrier and persistent dysregulation of the immune response to commensal intestinal bacteria. IBD is characterized by two distinct phenotypes: Crohn’s disease (CD) and ulcerative colitis (UC). Previously reported GWA studies have identified genetic variation accounting for a small portion of the overall genetic susceptibility to CD and an even smaller contribution to UC pathogenesis. We hypothesized that stratification of IBD by age of onset might identify additional genes associated with IBD. To that end, we carried out a GWA analysis in a cohort of 1,011 individuals with pediatric-onset IBD and 4,250 matched controls. We identified and replicated significantly associated, previously unreported loci on chromosomes 20q13 (rs2315008[T] and rs4809330[A]; P = 6.30 × 10⁻⁸ and 6.95 × 10⁻⁸, respectively; odds ratio (OR) = 0.74 for both) and 21q22 (rs2836878[A]; P = 6.01 × 10⁻⁸; OR = 0.73), located close to the TNFRSF6B and PSMG1 genes, respectively.
Recently, an association was demonstrated between the single nucleotide polymorphism (SNP) rs9939609, within the FTO locus, and obesity as a consequence of a genome-wide association (GWA) study of type 2 diabetes in adults. We examined the effects of two perfect surrogates for this SNP plus 11 other SNPs at this locus with respect to our childhood obesity cohort, consisting of both Caucasians and African Americans (AA). Utilizing data from our ongoing GWA study in our cohort of 418 Caucasian obese children (BMI ≥ 95th percentile), 2,270 Caucasian controls (BMI < 95th percentile), 578 AA obese children and 1,424 AA controls, we investigated the association of the previously reported variation at the FTO locus with the childhood form of this disease in both ethnicities. The minor allele frequencies (MAFs) of rs8050136 and rs3751812 (perfect surrogates for rs9939609, i.e., both r² = 1) in the Caucasian cases were 0.448 and 0.443, respectively, while they were 0.391 and 0.386 in Caucasian controls, yielding for both an odds ratio (OR) of 1.27 (95% CI 1.08–1.47; P = 0.0022). Furthermore, the MAFs of rs8050136 and rs3751812 in the AA cases were 0.449 and 0.115, respectively, while they were 0.436 and 0.090 in AA controls, yielding ORs of 1.05 (95% CI 0.91–1.21; P = 0.49) and 1.31 (95% CI 1.050–1.643; P = 0.017), respectively. Investigating all 13 SNPs present on the Illumina HumanHap550 BeadChip in this region of linkage disequilibrium, rs3751812 was the only SNP conferring significant risk in AA. We have therefore replicated and refined the association in an AA cohort and distilled a tag-SNP, rs3751812, which captures the ancestral origin of the actual mutation. As such, variants in the FTO gene confer a similar magnitude of risk of obesity to children as to their adult counterparts and appear to have a global impact.