The gene encoding the dopamine transporter (DAT) has been implicated in CNS disorders, but the responsible polymorphisms remain uncertain. To search for regulatory polymorphisms, we measured allelic DAT mRNA expression in substantia nigra of human autopsy brain tissues, using two marker SNPs (rs6347 in exon 9 and rs27072 in the 3′-UTR). Allelic mRNA expression imbalance (AEI), an indicator of cis-acting regulatory polymorphisms, was observed in all tissues heterozygous for either of the two marker SNPs. SNP scanning of the DAT locus with AEI ratios as the phenotype, followed by in vitro molecular genetics studies, demonstrated that rs27072 C>T affects mRNA expression and translation. Expression of the minor T allele was dynamically regulated in transfected cell cultures, possibly involving microRNA interactions. Both rs6347 and rs3836790 (intron8 5/6 VNTR) also seemed to affect DAT expression, but not the commonly tested 9/10 VNTR in the 3′UTR (rs28363170). All four polymorphisms (rs6347, intron8 5/6 VNTR, rs27072 and 3′UTR 9/10 VNTR) were genotyped in clinical cohorts, representing schizophrenia, bipolar disorder, depression, and controls. Only rs27072 was significantly associated with bipolar disorder (OR=2.1, p=0.03). This result was replicated in a second bipolar/control population (OR=1.65, p=0.01), supporting a critical role for DAT regulation in bipolar disorder.
dopamine transporter; bipolar disorder; allelic expression imbalance; SLC6A3; rs27072; Dopamine; Depression; Unipolar/Bipolar; Pharmacogenetics/Pharmacogenomics; Neurogenetics
The use of pharmacogenomic biomarkers can enhance treatment outcomes. Regulatory polymorphisms are promising biomarkers that have proven difficult to uncover. They come in two flavors: those that affect transcription (regulatory single-nucleotide polymorphisms (rSNPs)) and those that affect RNA functions such as splicing, turnover, and translation (termed structural RNA SNPs (srSNPs)). This review focuses on the role of srSNPs in drug metabolism, transport, and response. An understanding of the nature and diversity of srSNPs and rSNPs enables clinical scientists to evaluate genetic biomarkers.
Polymorphisms in the gene encoding SORL1, involved in cellular trafficking of APP, have been implicated in late-onset Alzheimer’s disease, by a mechanism thought to affect mRNA expression. To search for regulatory polymorphisms, we have measured allele-specific mRNA expression of SORL1 in human autopsy tissues from the prefrontal cortex of 26 Alzheimer’s patients, and 51 controls, using two synonymous marker SNPs (rs3824968 in exon 34 (11 heterozygous AD subjects and 16 controls), and rs12364988 in exon 6 (8 heterozygous AD subjects)). Significant allelic expression imbalance (AEI), indicative of the presence of cis-acting regulatory factors, was detected in a single control subject, while allelic ratios were near unity for all other subjects. We genotyped 7 SNPs in two haplotype blocks that had previously been implicated in Alzheimer’s disease. Since each of these SNPs was heterozygous in several subjects lacking AEI, this study fails to support a regulatory role for SORL1 polymorphisms in mRNA expression.
Alzheimer’s disease; SORL1; Allelic expression imbalance
Genetic variation in C-type lectins influences infectious disease susceptibility but remains poorly understood. We employed allelic mRNA expression imbalance (AEI) technology for SP-A1, SP-A2, SP-D, DC-SIGN, MRC1, and Dectin-1, expressed in human macrophages and/or lung tissues. Frequent AEI, an indicator of regulatory polymorphisms, was observed in SP-A2, SP-D, and DC-SIGN. AEI was measured for SP-A2 in 38 lung tissues using four marker SNPs and was confirmed by next generation sequencing of one lung RNA sample. Genomic DNA at the SP-A2 locus was sequenced by Ion Torrent technology in 16 samples. Correlation analysis of genotypes with AEI identified a haplotype block and, specifically, the intronic SNP rs1650232 (30% MAF), the only variant consistently associated with an approximately two-fold change in mRNA allelic expression. Previously shown to alter a NAGNAG splice acceptor site with likely effects on SP-A2 expression, rs1650232 generates an alternative splice variant with three additional bases at the start of exon 3. Validated as a regulatory variant, rs1650232 is in partial LD with known SP-A2 marker SNPs previously associated with risk for respiratory diseases including tuberculosis. Applying functional DNA variants in clinical association studies, rather than marker SNPs, will advance our understanding of genetic susceptibility to infectious diseases.
C-type lectin; marker SNP; regulatory variant; SP-A2; allelic expression imbalance
Genome-wide association studies (GWAS) have identified numerous loci associated with various complex traits for which the underlying susceptibility gene(s) remain unknown. In a GWAS for high-density lipoprotein-cholesterol (HDL-C) level, one strongly associated locus contains at least two biologically compelling candidates, methylmalonic aciduria cblB type (MMAB) and mevalonate kinase (MVK). To detect evidence of cis-acting regulation at this locus, we measured relative allelic expression of transcribed SNPs in five genes using human hepatocyte samples heterozygous for the transcribed SNP. If an HDL-C-associated SNP allele differentially regulates mRNA level in cis, samples heterozygous both for a transcribed SNP and an HDL-C-associated SNP should display allelic expression imbalance (AEI) of the transcribed SNP. We designed statistical tests to detect AEI in a comprehensive set of linkage disequilibrium (LD) scenarios between the transcribed SNP and an HDL-C-associated SNP (rs7298565) in phase unknown samples. We observed significant AEI of 22% in MMAB (P = 1.4 × 10⁻¹³, transcribed SNP rs11067231), and the allele associated with lower HDL-C level was associated with greater MMAB transcript level. The same rs7298565 allele was also associated with higher MMAB mRNA level (P = 0.0081) and higher MMAB protein level (P = 0.0020). In contrast, MVK, UBE3B, KCTD10 and ACACB did not show significant AEI (P ≥ 0.05). These data suggest MMAB is the most likely gene influencing HDL-C levels at this locus and demonstrate that measuring AEI at loci containing more than one candidate gene can prioritize genes for functional studies.
Genetic variants of ACE are suspected risk factors in cardiovascular disease, but the alleles responsible for the variations remain unidentified. To search for regulatory polymorphisms, allelic angiotensin I–converting enzyme (ACE) mRNA expression was measured in 65 heart tissues, followed by genotype scanning of the ACE locus. Marked allelic expression imbalance (AEI) detected in five African-American subjects was associated with single-nucleotide polymorphisms (SNPs) (rs7213516, rs7214530, and rs4290) residing in conserved regions 2−3 kb upstream of ACE. Moreover, each of the SNPs affected transcription in reporter gene assays. SNPs rs4290 and rs7213516 were tested for associations with adverse cardiovascular outcomes in hypertensive patients with coronary disease (International Verapamil SR Trandolapril Study Genetic Substudy (INVEST-GENES), n = 1,032). Both SNPs were associated with adverse cardiovascular outcomes, largely attributable to nonfatal myocardial infarction in African Americans, showing an odds ratio of 6.16 (2.43−15.60) (P < 0.0001) for rs7213516. The high allele frequency in African Americans (16%) compared to Hispanics (4%) and Caucasians (<1%) suggests that these alleles contribute to variation between populations in cardiovascular risk and treatment outcomes.
Cis-acting genetic variations can affect the amount and structure of mRNA/protein. Genomic surveys indicate that polymorphisms affecting transcription and mRNA processing, including splicing and turnover, may account for the main share of genetic factors in human phenotypic variability; however, most of these polymorphisms remain to be discovered. We use allelic expression imbalance (AEI) as a quantitative phenotype in the search for functional cis-acting polymorphisms in many genes, including ABCB1 (multidrug resistance 1 gene, MDR1, Pgp). Previous studies have shown that ABCB1 activity correlates with a synonymous polymorphism, C3435T; however, the functional polymorphism and molecular mechanisms underlying this clinical association remained unknown. Analysis of allele-specific expression in liver autopsy samples and in vitro expression experiments showed that C3435T represents a main functional polymorphism, accounting for 1.5- to 2-fold changes in mRNA levels. The mechanism appears to involve increased mRNA turnover, probably as a result of different folding structures calculated for mRNA with the Mfold program. Other examples of the successful application of AEI analysis for studying functional polymorphism include 5-HTT (serotonin transporter, SLC6A4) and OPRM1 (μ opioid receptor). AEI is therefore a powerful approach for detecting cis-acting polymorphisms affecting gene expression and mRNA processing.
ABCB1; allele-specific expression; mRNA stability; cis-acting polymorphism
Common genetic variants that regulate gene expression are widely suspected to contribute to the etiology and phenotypic variability of complex diseases. Although high-throughput, microarray-based assays have been developed to measure differences in mRNA expression among independent samples, these assays often lack the sensitivity to detect rare mRNAs and the reproducibility to quantify small changes in mRNA expression. By contrast, PCR-based allelic expression imbalance (AEI) assays, which use a "marker" single nucleotide polymorphism (mSNP) in the mRNA to distinguish expression from pairs of genetic alleles in individual samples, have high sensitivity and accuracy, allowing differences in mRNA expression greater than 1.2-fold to be quantified with high reproducibility. In this paper, we describe the use of an efficient PCR/next-generation DNA sequencing-based assay to analyze allele-specific differences in mRNA expression for candidate neuropsychiatric disorder genes in human brain.
Using our assay, we successfully analyzed AEI for 70 candidate neuropsychiatric disorder genes in 52 independent human brain samples. Among these genes, 62/70 (89%) showed AEI ratios greater than 1 ± 0.2 in at least one sample and 8/70 (11%) showed no AEI. Arranging log2AEI ratios in increasing order from negative-to-positive values revealed highly reproducible distributions of log2AEI ratios that are distinct for each gene/marker SNP combination. Mathematical modeling suggests that these log2AEI distributions can provide important clues concerning the number, location and contributions of cis-acting regulatory variants to mRNA expression.
We have developed a highly sensitive and reproducible method for quantifying AEI of mRNA expressed in human brain. Importantly, this assay allowed quantification of differential mRNA expression for many candidate disease genes entirely missed in previously published microarray-based studies of mRNA expression in human brain. Given the ability of next-generation sequencing technology to generate large numbers of independent sequencing reads, our method should be suitable for analyzing 100 to 200 candidate genes in 100 samples in a single experiment. We believe that this is the appropriate scale for investigating variation in mRNA expression for defined sets of candidate disorder genes, allowing, for example, comprehensive coverage of genes that function within biological pathways implicated in specific disorders. The combination of AEI measurements and mathematical modeling described in this study can assist in identifying SNPs that correlate with mRNA expression. Alleles of these SNPs (individually or as sets) that accurately predict high or low mRNA expression should be useful as markers in genetic association studies aimed at linking candidate genes to specific neuropsychiatric disorders.
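The log2AEI ordering and the 1 ± 0.2 cutoff described above can be illustrated with a minimal Python sketch. It assumes the per-sample AEI ratios have already been measured and normalized; the gene labels, the example values, and the function names are hypothetical and are not taken from the study.

```python
import math

def log2_profile(aei_ratios):
    """Return log2-transformed AEI ratios sorted from negative to positive."""
    return sorted(math.log2(ratio) for ratio in aei_ratios)

def outside_band(ratio, tolerance=0.2):
    """True when an AEI ratio lies outside 1 +/- tolerance."""
    return abs(ratio - 1.0) > tolerance

def genes_with_aei(ratios_by_gene, tolerance=0.2):
    """Genes for which at least one sample falls outside the tolerance band."""
    return {
        gene
        for gene, ratios in ratios_by_gene.items()
        if any(outside_band(r, tolerance) for r in ratios)
    }

# Hypothetical AEI ratios (one value per heterozygous sample).
example = {
    "GENE_A": [0.5, 1.05, 2.0],
    "GENE_B": [0.95, 1.1],
}

for gene, ratios in example.items():
    print(gene, log2_profile(ratios))
print("genes with AEI:", sorted(genes_with_aei(example)))
```

Plotting the sorted log2 values per gene/marker SNP combination would give the kind of distribution the text refers to; the mathematical modeling of those distributions is not reproduced here.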
Cis-acting polymorphisms that affect gene expression are now known to be frequent, although the extent and mechanisms by which such variation affects the human phenotype are, as yet, only poorly understood. Key signatures of cis-acting variation are differences in gene expression that are tightly associated with regulatory SNPs or expression Quantitative Trait Loci (eQTL) and an imbalance of allelic expression (AEI) in heterozygous samples. Such cis-acting sequence differences appear often to have been under selection within and between populations and are also thought to be important in speciation. Here we describe the example of lactase persistence. In medical research, variants that affect regulation in cis have been implicated in both monogenic and polygenic disorders, and in the metabolism of drugs. In this review we suggest that by further understanding common regulatory variations and how they interact with other genetic and environmental variables it will be possible to gain insight into important mechanisms behind complex disease, with the potential to lead to new methods of diagnosis and treatments.
Cis-acting polymorphism; Gene expression; Regulation; Phenotypic variability; Allelic expression; Soft selective sweeps
Comparing the relative expression levels of two SNP alleles of a gene in the same sample is an effective approach for identifying cis-acting regulatory SNPs (rSNPs). In the current study, we established a process for systematic screening for cis-acting rSNPs using experimental detection of allelic imbalance (AI) as an initial approach. We selected 160 expressed candidate genes that are involved in cancer and anticancer drug resistance for analysis of AI in a panel of cell lines that represent different types of cancers and have been well characterized for their response patterns against anticancer drugs. Of these genes, 60 contained heterozygous SNPs in their coding regions (cSNPs), and 41 of the genes displayed imbalanced expression of the two cSNP alleles. Genes that displayed AI were subjected to bioinformatics-assisted identification of rSNPs that alter the strength of transcription factor binding. rSNPs in 15 genes were subjected to electrophoretic mobility shift assay, and in eight of these genes (APC, BCL2, CCND2, MLH1, PARP1, SLIT2, YES1, XRCC1) we identified differential protein binding from a nuclear extract between the SNP alleles. The screening process allowed us to zoom in from 160 candidate genes to eight genes that may contain functional rSNPs in their promoter regions.
Motivation: Determining the functional impact of non-coding disease-associated single nucleotide polymorphisms (SNPs) identified by genome-wide association studies (GWAS) is challenging. Many of these SNPs are likely to be regulatory SNPs (rSNPs): variations which affect the ability of a transcription factor (TF) to bind to DNA. However, experimental procedures for identifying rSNPs are expensive and labour intensive. Therefore, in silico methods are required for rSNP prediction. By scoring two alleles with a TF position weight matrix (PWM), it can be determined which SNPs are likely rSNPs. However, predictions in this manner are noisy and no method exists that determines the statistical significance of a nucleotide variation on a PWM score.
Results: We have designed an algorithm for in silico rSNP detection called is-rSNP. We employ novel convolution methods to determine the complete distributions of PWM scores and ratios between allele scores, facilitating assignment of statistical significance to rSNP effects. We have tested our method on 41 experimentally verified rSNPs, correctly predicting the disrupted TF in 28 cases. We also analysed 146 disease-associated SNPs with no known functional impact in an attempt to identify candidate rSNPs. Of the 11 significantly predicted disrupted TFs, 9 had previous evidence of being associated with the disease in the literature. These results demonstrate that is-rSNP is suitable for high-throughput screening of SNPs for potential regulatory function. This is a useful and important tool in the interpretation of GWAS.
Availability: is-rSNP software is available for use at: www.genomics.csse.unimelb.edu.au/is-rSNP
Supplementary information: Supplementary data are available at Bioinformatics online.
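The basic idea of scoring two alleles with a PWM can be sketched in Python as follows. This is only an illustration under stated assumptions: the count matrix is a toy example, the sequences and function names are hypothetical, only the forward strand is scanned, and no statistical significance is computed, so it does not reproduce the convolution-based method of is-rSNP.

```python
import math

BASES = "ACGT"

def log_odds_pwm(counts, pseudocount=1.0, background=0.25):
    """Convert per-position base counts into a log2-odds PWM."""
    pwm = []
    for column in counts:
        total = sum(column[b] for b in BASES) + 4 * pseudocount
        pwm.append({
            b: math.log2(((column[b] + pseudocount) / total) / background)
            for b in BASES
        })
    return pwm

def score(pwm, site):
    """Sum of log-odds values for a site of the same width as the PWM."""
    return sum(column[base] for column, base in zip(pwm, site))

def best_allele_scores(pwm, flank5, ref, alt, flank3):
    """Best PWM score over all windows overlapping the SNP, per allele."""
    width = len(pwm)
    snp = len(flank5)
    results = {}
    for allele in (ref, alt):
        seq = flank5 + allele + flank3
        first = max(0, snp - width + 1)
        last = min(snp, len(seq) - width)
        if last < first:
            raise ValueError("flanking sequence is too short for this PWM")
        results[allele] = max(
            score(pwm, seq[s:s + width]) for s in range(first, last + 1)
        )
    return results

# Toy count matrix favouring the motif CACGTG (hypothetical counts).
toy_counts = [{b: (17 if b == m else 1) for b in BASES} for m in "CACGTG"]
toy_pwm = log_odds_pwm(toy_counts)

# Hypothetical SNP (G>A) inside the motif, with short flanks.
scores = best_allele_scores(toy_pwm, "TTCAC", "G", "A", "TGAA")
print(scores, "difference:", scores["G"] - scores["A"])
```

A large score difference between alleles only flags a candidate; judging whether that difference is unusual requires the score distributions that the abstract describes.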
Acetyl Coenzyme A carboxylase β (ACACB) is the rate-limiting enzyme in fatty acid oxidation, and continuous fatty acid oxidation in Acacb knock-out mice increases insulin sensitivity. Systematic human studies have not been performed to evaluate whether ACACB variants regulate gene expression and insulin sensitivity in skeletal muscle and adipose tissues. We sought to determine whether ACACB transcribed variants were associated with ACACB gene expression and insulin sensitivity in non-diabetic African American (AA) and European American (EA) adults.
ACACB transcribed single nucleotide polymorphisms (SNPs) were genotyped in 105 EAs and 46 AAs whose body mass index (BMI), lipid profiles and ACACB gene expression in subcutaneous adipose and skeletal muscle had been measured. Allelic expression imbalance (AEI) was assessed in lymphoblast cell lines from heterozygous subjects in an additional EA sample (n = 95). Selected SNPs were further examined for association with insulin sensitivity in a cohort of 417 EAs and 153 AAs.
ACACB transcribed SNP rs2075260 (A/G) was associated with adipose ACACB messenger RNA expression in EAs and AAs (p = 3.8×10⁻⁵, dominant model in meta-analysis, Stouffer method), with the (A) allele representing lower gene expression in adipose and higher insulin sensitivity in EAs (p = 0.04). In EAs, adipose ACACB expression was negatively associated with age and sex-adjusted BMI (r = −0.35, p = 0.0002).
Common variants within the ACACB locus appear to regulate adipose gene expression in humans. Body fat (represented by BMI) may further regulate adipose ACACB gene expression in the EA population.
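For readers unfamiliar with the Stouffer method mentioned above, the following Python sketch combines per-cohort p-values into a single Z-score. The input p-values are hypothetical (the per-cohort values are not given in the abstract), the p-values are assumed to be one-sided and in the same direction, and weighting by the square root of cohort size is a common convention assumed here rather than the authors' documented choice.

```python
import math
from statistics import NormalDist

def stouffer(p_values, weights=None):
    """Combine one-sided p-values with Stouffer's weighted Z-score method.

    Returns the combined Z-score and its one-sided p-value.
    """
    if weights is None:
        weights = [1.0] * len(p_values)
    if len(weights) != len(p_values) or not p_values:
        raise ValueError("need one weight per p-value")
    norm = NormalDist()
    z_scores = [norm.inv_cdf(1.0 - p) for p in p_values]
    combined = sum(w * z for w, z in zip(weights, z_scores))
    combined /= math.sqrt(sum(w * w for w in weights))
    return combined, 1.0 - norm.cdf(combined)

# Hypothetical per-cohort p-values; cohort sizes follow the text (105 EAs, 46 AAs).
cohort_p = [0.001, 0.01]
cohort_weights = [math.sqrt(105), math.sqrt(46)]
z, p = stouffer(cohort_p, cohort_weights)
print(f"combined Z = {z:.3f}, combined p = {p:.2e}")
```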
The single nucleotide polymorphism (SNP) rs2615977 is associated with osteoarthritis (OA) and is located in intron 31 of COL11A1, a strong candidate gene for this degenerative musculoskeletal disease. Furthermore, the common non-synonymous COL11A1 SNP rs1676486 is associated with another degenerative musculoskeletal disease, lumbar disc herniation (LDH). rs1676486 is a C-T transition mediating its effect on LDH susceptibility by modulating COL11A1 expression. The risk T-allele of rs1676486 leads to reduced expression of the COL11A1 transcript, a phenomenon known as allelic expression imbalance (AEI). We were keen therefore to assess whether the effect that rs1676486 has on COL11A1 expression in LDH is also observed in OA and whether the rs2615977 association with OA also marked AEI.
Using RNA from OA cartilage, we assessed whether either SNP correlated with COL11A1 AEI by 1) measuring COL11A1 expression and stratifying the data by genotype at each SNP; and 2) quantifying the mRNA transcribed from each allele of the two SNPs. We also assessed whether rs1676486 was associated with OA susceptibility using a case–control cohort of over 18,000 individuals.
We observed significant AEI at rs1676486 (p < 0.0001) with the T-allele correlating with reduced COL11A1 expression. This corresponded with observations in LDH but the SNP was not associated with OA. We did not observe AEI at rs2615977.
COL11A1 is subject to AEI in OA cartilage. AEI at rs1676486 is a risk factor for LDH, but not for OA. These two diseases therefore share a common functional phenotype, namely AEI of COL11A1, but this appears to be a disease risk only in LDH. Other functional effects on COL11A1 presumably account for the OA susceptibility that maps to this gene.
Osteoarthritis; Lumbar disc herniation; Genetics; Susceptibility; COL11A1; Allelic expression
Measuring allelic RNA expression ratios is a powerful approach for detecting cis-acting regulatory variants, RNA editing, loss of heterozygosity in cancer, copy number variation, and allele-specific epigenetic gene silencing. Whole transcriptome RNA sequencing (RNA-Seq) has emerged as a genome-wide tool for identifying allelic expression imbalance (AEI), but numerous factors bias allelic RNA ratio measurements. Here, we compare RNA-Seq allelic ratios measured in nine different human brain regions with a highly sensitive and accurate SNaPshot measure of allelic RNA ratios, identifying factors affecting reliable allelic ratio measurement. Accounting for these factors, we subsequently surveyed the variability of RNA editing across brain regions and across individuals.
We find that RNA-Seq allelic ratios from standard alignment methods correlate poorly with SNaPshot, but applying alternative alignment strategies and correcting for observed biases significantly improves correlations. Deploying these methods on a transcriptome-wide basis in nine brain regions from a single individual, we identified genes with AEI across all regions (SLC1A3, NHP2L1) and many others with region-specific AEI. In dorsolateral prefrontal cortex (DLPFC) tissues from 14 individuals, we found evidence for frequent regulatory variants affecting RNA expression in tens to hundreds of genes, depending on stringency for assigning AEI. Further, we find that the extent and variability of RNA editing is similar across brain regions and across individuals.
These results identify critical factors affecting allelic ratios measured by RNA-Seq and provide a foundation for using this technology to screen allelic RNA expression on a transcriptome-wide basis. Using this technology as a screening tool reveals tens to hundreds of genes harboring frequent functional variants affecting RNA expression in the human brain. With respect to RNA editing, the similarities within and between individuals lead us to conclude that this post-transcriptional process is under heavy regulatory influence to maintain an optimal degree of editing for normal biological function.
RNA-Seq; Whole transcriptome; Allele expression; mRNA expression; Functional genetics; Regulatory polymorphism; eQTL; Read alignment; Next generation sequencing; Bioinformatics
CC chemokine ligand 2 (CCL2) is the most potent monocyte chemoattractant, and inter-individual differences in its expression level have been associated with genetic variants mapping to the cis-regulatory regions of the gene. An A to G polymorphism in the CCL2 enhancer region at position –2578 (rs1024611; A>G) was found in most studies to be associated with higher serum CCL2 levels and increased susceptibility to a variety of diseases such as HIV-1 associated neurological disorders, tuberculosis, and atherosclerosis. However, the precise mechanism by which rs1024611 influences CCL2 expression is not known. To address this knowledge gap, we tested the hypothesis that the rs1024611G polymorphism is associated with allelic expression imbalance (AEI) of CCL2. We used haplotype analysis and identified a transcribed SNP in the 3′UTR (rs13900; C>T) that can serve as a proxy for rs1024611, and demonstrated that the rs1024611G allele displayed perfect linkage disequilibrium with the rs13900T allele. Allele-specific transcript quantification in lipopolysaccharide-treated PBMCs obtained from heterozygous donors showed that the rs13900T allele was expressed at higher levels than the rs13900C allele in all the donors examined, suggesting that CCL2 is subject to AEI and that the allele containing rs1024611G is preferentially transcribed. We also found that AEI of CCL2 is a stable trait and could be detected in newly synthesized RNA. In contrast to these in vivo findings, in vitro assays with haplotype-specific reporter constructs indicated that the haplotype bearing rs1024611G had a lower or similar transcriptional activity when compared to the haplotype containing rs1024611A. This discordance between the in vivo and in vitro expression studies suggests that the CCL2 regulatory region polymorphisms may be functioning in a complex and context-dependent manner.
In summary, our studies provide strong functional evidence and a rational explanation for the phenotypic effects of the CCL2 rs1024611G allele.
Single nucleotide polymorphisms (SNPs) in putative microRNA binding sites (miRSNPs) modulate cancer susceptibility via affecting miRNA binding. Here, we sought to investigate the association between miRSNPs and cervical cancer risk.
We first genotyped 41 miRSNPs of 37 cancer-related genes in 338 patients and 334 controls (Study 1), and replicated the significant associations in 502 patients and 600 controls (Study 2). We tested the effects of miRSNPs on microRNA-mRNA interaction by luciferase reporter assay.
Five SNPs displayed notable association with cervical cancer risk in Study 1. Only IL-16 rs1131445 maintained a significant association with cervical cancer (CT/CC vs. TT, adjusted OR = 1.51, P = 0.001) in Study 2. This association was more evident in the combined data of two studies (adjusted OR = 1.49, P = 0.00007). We also found that miR-135b mimics interacted with IL-16 3′-UTR to reduce gene expression and that the rs1131445 T to C substitution within the putative binding site impaired the interaction of miR-135b with IL-16 3′-UTR. An ELISA indicated that the serum IL-16 of patients with cervical cancer was elevated (vs. controls, P = 0.001) and correlated with the rs1131445 genotype. Patients who carried the rs1131445 C allele had higher serum IL-16 than non-carriers (P<0.001).
These results support our hypothesis that miRSNPs constitute a susceptibility factor for cervical cancer. rs1131445 affects IL-16 expression by interfering with the suppressive function of miR-135b, and this variant is significantly associated with cervical cancer risk.
The computational analysis of regulatory SNPs (rSNPs) is an essential step in the elucidation of the structure and function of regulatory networks at the cellular level. In this work we focus in particular on SNPs that potentially affect a Transcription Factor Binding Site (TFBS) to a significant extent, possibly resulting in changes to gene expression patterns or alternative splicing. The application described here is based on the MAPPER platform, a previously developed web-based system for the computational detection of TFBSs in DNA sequences.
rSNP-MAPPER is a computational tool that analyzes SNPs lying within predicted TFBSs and determines whether the allele substitution results in a significant change in the TFBS predictive score. The application's simple and intuitive interface supports several usage modes. For example, the user may search for potential rSNPs in the promoters of one or more genes, specified as a list of identifiers or chosen among the members of a pathway. Alternatively, the user may specify a set of SNPs to be analyzed by uploading a list of SNP identifiers or providing the coordinates of a genomic region. Finally, the user can provide two alternative sequences (wildtype and mutant), and the system will determine the location of variants to be analyzed by comparing them.
In this paper we outline the architecture of rSNP-MAPPER, describing its intuitive and powerful user interface in detail. We then present several examples of the use of rSNP-MAPPER to reproduce and confirm experimental studies aimed at identifying regulatory SNPs in human genes, which show that rSNP-MAPPER is able to detect and characterize rSNPs with high accuracy. Results are richly annotated and can be displayed online or downloaded in a number of different formats.
rSNP-MAPPER is optimized for large scale work, allowing for the efficient annotation of thousands of SNPs, and is designed to assist in the genome-wide investigation of transcriptional regulatory networks, prioritizing potential rSNPs for subsequent experimental validation. rSNP-MAPPER is freely available at http://genome.ufl.edu/mapper/.
Besides their use in mRNA expression profiling, oligonucleotide microarrays have also been applied to single-nucleotide polymorphism (SNP) and loss of heterozygosity (LOH) or allelic imbalance studies. In this report, we evaluate the reliability of using whole genome amplified DNA for analysis with an oligonucleotide microarray containing 11 560 SNPs to detect allelic imbalance and chromosomal copy number abnormalities. Whole genome SNP analyses were performed with DNA extracted from osteosarcoma tissues and patient-matched blood. SNP calls were then generated by Affymetrix® GeneChip® DNA Analysis Software. In two osteosarcoma cases, using unamplified DNA, we identified 793 and 1070 SNP loci with allelic imbalance, respectively. In a parallel experiment with amplified DNA, 78% and 83% of these SNP loci with allelic imbalance were detected, respectively. The average false-positive rate was 13.8%. Furthermore, using the Affymetrix® GeneChip® Chromosome Copy Number Tool to analyze the SNP array data, we were able to detect identical chromosomal regions with gain or loss in both amplified and unamplified DNA at cytoband resolution.
A common single-nucleotide polymorphism (SNP) in the promoter of the Connexin-40 (Cx40) gene GJA5 was suggested to affect Cx40 promoter activity and the risk of atrial fibrillation (AF), but the role of other common Cx40 polymorphisms is unknown.
Methods and Results
Eight SNPs within the Cx40 gene region were tested for association with Cx40 levels measured in atrial tissue from 61 individuals. The previously described Cx40 promoter SNP (rs35594137, −44G→A) was not associated with Cx40 mRNA levels. However, a common SNP (rs10465885) located in the TATA box of an alternative Cx40 promoter was strongly associated with Cx40 mRNA expression (P<0.0001) and displayed strong and consistent allelic expression imbalance in human atrial tissue. A promoter-luciferase assay in cultured murine cardiomyocytes demonstrated reduced activity of the promoter containing the minor allele of this SNP (P<0.0001). Both rs35594137 and rs10465885 were tested for association with early onset lone AF (≤60 years of age) in 384 cases and 3010 population controls. rs10465885 was associated with the AF phenotype (OR=1.18, P=0.046). This result was confirmed in a meta-analysis including two additional early onset lone AF case-control cohorts (OR=1.16, P=0.022). rs35594137 was not associated with the lone AF phenotype in any of the cohorts studied or in a combined analysis.
A previously described Cx40 promoter SNP was not found to influence Cx40 expression or risk of AF. We describe an alternate promoter polymorphism that directly affects levels of Cx40 mRNA in vivo and is associated with early onset lone AF.
atrial fibrillation; ion channels; genetics; allelic expression imbalance
Each of the human genes or transcriptional units is likely to contain single nucleotide polymorphisms that may give rise to sequence variation between individuals and tissues on the level of RNA. Based on recent studies, differential expression of the two alleles of heterozygous coding single nucleotide polymorphisms (SNPs) may be frequent for human genes. Methods with high accuracy to be used in a high throughput setting are needed for systematic surveys of expressed sequence variation. In this study we evaluated two formats of multiplexed, microarray based minisequencing for quantitative detection of imbalanced expression of SNP alleles. We used a panel of ten SNPs located in five genes known to be expressed in two endothelial cell lines as our model system.
The accuracy and sensitivity of quantitative detection of allelic imbalance was assessed for each SNP by constructing regression lines using a dilution series of mixed samples from individuals of different genotype. Accurate quantification of SNP alleles by both assay formats was evidenced by R² values > 0.95 for the majority of the regression lines. According to a two sample t-test, we were able to distinguish 1–9% of a minority SNP allele from a homozygous genotype, with larger variation between SNPs than between assay formats. Six of the SNPs, heterozygous in either of the two cell lines, were genotyped in RNA extracted from the endothelial cells. The coefficient of variation between the fluorescent signals from five parallel reactions was similar for cDNA and genomic DNA. The fluorescence signal intensity ratios measured in the cDNA samples were compared to those in genomic DNA to determine the relative expression levels of the two alleles of each SNP. Four of the six SNPs tested displayed a higher than 1.4-fold difference in allelic ratios between cDNA and genomic DNA. The results were verified by allele-specific oligonucleotide hybridisation and minisequencing in a microtiter plate format.
We conclude that microarray based minisequencing is an accurate and accessible tool for multiplexed screening for imbalanced allelic expression in multiple samples and tissues in parallel.
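The comparison of cDNA and genomic DNA allelic ratios described above can be illustrated with a minimal sketch. It assumes each parallel reaction yields one fluorescence signal per allele, averages the per-reaction ratios, and normalizes the cDNA ratio to the genomic DNA ratio. The function names, the averaging step, and the example signal values are illustrative assumptions, not the authors' pipeline; only the 1.4-fold threshold comes from the text.

```python
from statistics import mean


def mean_allelic_ratio(signal_pairs):
    """Average allele1/allele2 signal ratio over parallel reactions.

    signal_pairs: list of (allele1_signal, allele2_signal) tuples
    (hypothetical input layout, one tuple per reaction).
    """
    return mean(a1 / a2 for a1, a2 in signal_pairs)


def fold_difference(cdna_pairs, gdna_pairs):
    """Fold difference between the cDNA and genomic DNA allelic ratios.

    The genomic DNA of a heterozygote serves as the 1:1 reference, so
    dividing by its ratio corrects for allele-specific signal bias.
    The result is direction-independent (always >= 1).
    """
    normalized = mean_allelic_ratio(cdna_pairs) / mean_allelic_ratio(gdna_pairs)
    return max(normalized, 1.0 / normalized)


def is_imbalanced(cdna_pairs, gdna_pairs, threshold=1.4):
    """Flag a SNP whose fold difference exceeds the threshold (1.4 in the text)."""
    return fold_difference(cdna_pairs, gdna_pairs) > threshold


if __name__ == "__main__":
    # Made-up signals for illustration only.
    gdna = [(100.0, 100.0), (98.0, 102.0)]
    cdna = [(300.0, 100.0), (290.0, 105.0)]
    print(fold_difference(cdna, gdna), is_imbalanced(cdna, gdna))
```

Normalizing to genomic DNA rather than assuming a raw 1:1 signal ratio reflects the fact that the two alleles may not produce equal fluorescence even when present in equal amounts.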
Essential hypertension is a complex disorder, caused by the interplay between many genetic variants, gene-gene interactions, and environmental factors. Given that the renin-angiotensin system (RAS) plays an important role in blood pressure (BP) control, cardiovascular regulation, and cardiovascular remodeling, special attention has been devoted to the investigation of single-nucleotide polymorphisms (SNP) harbored in RAS genes that may be associated with hypertension and cardiovascular disease. MicroRNAs (miRNAs) are a family of small, ∼21-nucleotide long, and nonprotein-coding RNAs that recognize target mRNAs through partial complementary elements in the 3′-untranslated region (3′-UTR) of mRNAs and inhibit gene expression by targeting mRNAs for translational repression or destabilization. Since miRNA SNPs (miRSNPs) can create, destroy, or modify miRNA binding sites, this review focuses on the hypothesis that transcribed target SNPs harbored in RAS mRNAs, that alter miRNA gene regulation and consequently protein expression, may contribute to cardiovascular disease susceptibility.
Disease-associated SNPs detected in large-scale association studies are frequently located in non-coding genomic regions, suggesting that they may be involved in transcriptional regulation. Here we describe a new strategy for detecting regulatory SNPs (rSNPs) by combining computational and experimental approaches. Whole-genome ChIP-chip data for USF1 were analyzed using a novel motif-finding algorithm called BCRANK. A total of 1754 binding sites were identified, and 140 candidate rSNPs were found in the predicted sites. To validate their regulatory function, seven SNPs found to be heterozygous in at least one of four human cell samples were investigated by ChIP and sequence analysis (haploChIP). In four of five cases where the SNP was predicted to affect binding, USF1 was preferentially bound to the allele containing the consensus motif. Allelic differences in binding for other proteins and histone marks further reinforced the SNPs' regulatory potential. Moreover, for one of these SNPs, H3K36me3 and POLR2A levels at neighboring heterozygous SNPs indicated effects on transcription. Our strategy, which is entirely based on in vivo data for both the prediction and validation steps, can identify individual binding sites at base pair resolution and predict rSNPs. Overall, this approach can help to pinpoint the causative SNPs in complex disorders where the associated haplotypes are located in regulatory regions. Availability: BCRANK is available from Bioconductor (http://www.bioconductor.org/).
Next-Generation Sequencing (NGS) technologies are yielding ever-higher volumes of human genome sequence data. Given this large amount of data, it has become both a possibility and a priority to determine how disease-causing single nucleotide polymorphisms (SNPs) detected within gene regulatory regions (rSNPs) exert their effects on gene expression. Recently, several studies have explored whether disease-causing polymorphisms have attributes that can distinguish them from those that are neutral, attaining moderate success at discriminating between functional and putatively neutral regulatory SNPs. Here, we have extended this work by assessing the utility of both SNP-based features (those associated only with the polymorphism site and the surrounding DNA) and Gene-based features (those derived from the associated gene in whose regulatory region the SNP lies) in the identification of functional regulatory polymorphisms involved in either monogenic or complex disease. Gene-based features were found to be capable of both augmenting and enhancing the utility of SNP-based features in the prediction of known regulatory mutations. Adopting this approach, we achieved an AUC of 0.903 for predicting regulatory SNPs. Finally, our tool predicted 225 new regulatory SNPs with a high degree of confidence, with 105 of the 225 falling into linkage disequilibrium blocks of reported disease-associated GWAS SNPs.
Regulatory mutations; Machine learning; Monogenic disease; Complex disease; Single Nucleotide Polymorphisms; SNP
Allelic expression imbalance (AEI) is an important genetic factor that can cause heritable differences in phenotypic traits. Studying AEI can be useful in searching for factors that modulate gene expression and can help to understand the molecular mechanisms underlying phenotypic changes. Although AEI has been recognized in many species and many genes are known to show it, this phenomenon has not been studied on a larger scale in cattle. Using the pyrosequencing method, we analyzed a set of 29 bovine genes in order to find those that have preferential allelic expression. The study was conducted in three tissues: liver, pituitary and kidney. Of the genes studied, three showed allelic expression imbalance: LEP (leptin), IGF2 (insulin-like growth factor 2), and CCL2 (chemokine C–C motif ligand 2).
Electronic supplementary material
The online version of this article (doi:10.1007/s11033-012-2161-3) contains supplementary material, which is available to authorized users.
Allelic expression imbalance; Transcription regulation; Holstein–Friesian cattle; Gene expression
The most common form of genetic variation, single nucleotide polymorphisms or SNPs, can affect the way an individual responds to the environment and modify disease risk. Although most of the millions of SNPs have little or no effect on gene regulation and protein activity, there are many circumstances where base changes can have deleterious effects. Non-synonymous SNPs that result in amino acid changes in proteins have been studied because of their obvious impact on protein activity. It is well known that SNPs within regulatory regions of the genome can result in dysregulation of gene transcription. However, the impact of SNPs located in putative regulatory regions, or rSNPs, is harder to predict for two primary reasons. First, the mechanistic roles of non-coding genomic sequence remain poorly defined. Second, experimental validation of the functional consequences of rSNPs is often slow and laborious. In this review, we summarize traditional and novel methodologies for candidate rSNP selection, in particular in silico techniques that aid in this selection. Additionally, we discuss molecular biological techniques that assess the impact of rSNPs on binding of regulatory machinery, as well as functional consequences on transcription. Standard techniques such as EMSA and luciferase reporter constructs are still widely used to assess effects of rSNPs on binding and gene transcription; however, these protocols are often bottlenecks in the discovery process. Therefore, we highlight novel and developing high-throughput protocols that promise to shorten the process of rSNP validation. Given the large amount of genomic information generated from a multitude of re-sequencing and genome-wide SNP array efforts, future focus should be on developing validation techniques that will allow greater understanding of the impact these polymorphisms have on human health and disease.
polymorphism; SNPs; gene regulation; functional genomics; microsphere assay