In Parkinson disease, the second most common neurodegenerative disorder in humans, increased alpha-synuclein (SNCA) levels are pathogenic, as evidenced by gene copy number mutations and increased alpha-synuclein levels detected in some familial and sporadic PD cases, respectively. Gene expression can be regulated at the post-transcriptional level by elements in the 3′ untranslated region (3′UTR) of mRNAs. The goal of this study was to determine whether the 3′UTR of human SNCA can affect gene expression. Comparative sequence analysis revealed very high conservation across the entire 3′UTR of human SNCA over millions of years, suggesting the presence of multiple functionally important domains. EST and RT-PCR analyses showed that four different polyadenylation events occur in the 3′UTR of human SNCA. Finally, using luciferase assays, we examined the effect of the minor allele of five naturally occurring single nucleotide polymorphisms (SNPs) in the 3′UTR of SNCA on gene expression. The minor allele of SNP rs17016074 increased luciferase expression by 32% in a transient transfection assay in SHSY5Y neuroblastoma cells. Understanding the role of the 3′UTR of human SNCA and identifying functionally important naturally occurring SNPs using reporter assays can complement disease association studies in humans, uncovering potential susceptibility or protective polymorphisms in Parkinson disease. Our findings demonstrate that the 3′UTR of human SNCA, as a whole, and rs17016074, in particular, are loci of potential clinical importance for Parkinson disease.
Parkinson disease (PD) is the second most common neurodegenerative disorder in humans. Approximately 90% of all PD cases are sporadic, with the remaining ~10% being familial. In the vast majority of PD cases, the exact etiology of the disease is not known. In addition to the point mutations that cause some cases of familial PD, two lines of evidence suggest that, in the absence of point mutations, increased levels of wild-type alpha-synuclein are also pathogenic. First, in some sporadic PD cases, SNCA mRNA levels in the brain are increased [3, 9, 21, 25]. Second, eight familial PD cases carrying multiple copies of SNCA have been identified [2, 6, 11, 12, 19, 24]. Due to the absence of point mutations in any of the copies of SNCA in these patients, the cause of PD appears to be the mere increase in alpha-synuclein levels. Moreover, in support of a dosage effect, PD patients from families with two extra copies of SNCA have a more severe phenotype than PD patients with only one extra copy [6, 12, 22]. The pathogenicity of multiple SNCA gene copies and the apparent dosage effect of alpha-synuclein levels in both sporadic and familial PD highlight the clinical significance of the regulation of SNCA gene expression.
Regulation of gene expression can take place at the level of transcription, by controlling mRNA synthesis, as well as post-transcriptionally, by controlling mRNA stability and translation. Studies on the transcriptional regulation of SNCA have been primarily centered on the NACP repeat, which is believed to affect SNCA transcription. Numerous association studies have established a link between increased risk of PD and NACP alleles that increase SNCA expression.
Increased SNCA expression could also result from misregulation of its post-transcriptional control. Post-transcriptional control of gene expression can be mediated by several elements, many of which are located in the 3′UTR of mRNAs [1, 15, 26]. Despite a series of studies that report an association of polymorphisms at the 3′ end of SNCA with sporadic PD [16-18, 23, 27], the precise function (and mature sequence) of the 3′UTR of human SNCA mRNA is not known.
The goal of this study was to determine whether the 3′UTR of human SNCA can affect gene expression. To find evidence of important regulatory elements, we first examined the degree of sequence conservation of the 3′UTR of SNCA across multiple species. We then performed experiments that were designed to identify the sequence that constitutes the functional 3′UTR of the human SNCA message. Finally, using luciferase reporter assays, we identified a single nucleotide polymorphism in the 3′UTR of human SNCA that has a significant impact on gene expression.
NM_000345 is the human SNCA mRNA sequence used as a reference throughout this study. The 3′UTRs of SNCA orthologs (Fig. 1) were obtained from the following NCBI entries: L33860 (canary), NM_204673.1 (chicken), XM_001496904 (horse), XM_855879 (dog), NM_000345 (human), XM_1162591 (chimpanzee), XM_001095402 (rhesus monkey), NM_001034041.1 (cow), NM_001037145.1 (pig), NM_009221.2 (mouse), NM_019169.2 (rat), NM_001087154.1 (Xenopus laevis). Multiple sequence alignments were performed using ClustalW2 (http://www.ebi.ac.uk/Tools/clustalw2) with the gap extension penalty set to 0.05 and the gap distance penalty set to 1. Sequence similarities were visualized using the BOXSHADE program (http://www.ch.embnet.org/software/BOX_form.html). The last 337 nucleotides shown in Fig. 2 were obtained from contig NT_016354 (NCBI). Repeats in the 3′UTR of SNCA were identified using RepeatMasker (http://www.repeatmasker.org).
Global sequence alignments of chicken-human ortholog pairs were performed using a version of FASTA (GRASTA: ktup = 3, opt cutoff = 100) on sequences previously identified. Sequence pairs were used if the 3′UTRs of both orthologs were longer than 50 nucleotides after the trimming of terminal adenosine residues; the final data set consisted of 426 ortholog pairs. For local sequence alignments, chicken-human ortholog pairs from HomoloGene (NCBI) were identified by text matching of gene symbols. We obtained 2653 pairs of orthologs for which both species had unambiguously associated, experimentally verified RefSeq mRNA records and for which both of the paired sequences had trimmed 3′UTRs longer than 50 nucleotides. The sequences were subjected to pairwise BLAST alignment (bl2seq version 2.2.19) with all masking and low-complexity filtering options turned off. Only non-overlapping, plus-strand alignments with E-values < 1 × 10⁻⁶ were accepted.
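The inclusion rule for ortholog pairs described above (trim terminal adenosines, then require both 3′UTRs to be longer than 50 nucleotides) can be sketched as follows. This is a minimal illustration, not the pipeline used in the study; the function names `trim_terminal_adenosines` and `keep_pair` are hypothetical.

```python
def trim_terminal_adenosines(utr: str) -> str:
    """Remove the run of A residues at the 3' end of a 3'UTR sequence."""
    return utr.upper().rstrip("A")


def keep_pair(utr_a: str, utr_b: str, min_len: int = 50) -> bool:
    """Keep an ortholog pair only if BOTH trimmed 3'UTRs are longer than min_len.

    The threshold of 50 nt comes from the text; "longer than" is read here
    as a strict inequality (an assumption about the exact cutoff).
    """
    return all(
        len(trim_terminal_adenosines(utr)) > min_len for utr in (utr_a, utr_b)
    )
```

For example, a 3′UTR of 50 non-A nucleotides followed by a poly(A) tail would be excluded, because only 50 nucleotides remain after trimming.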
All expressed sequence tags (ESTs) were obtained from the Unigene cluster Hs271771 (NCBI). ESTs that did not align with fully-spliced SNCA mRNAs and did not contain a polyA tail were excluded.
RT-PCR analysis (Amersham Pharmacia, 27-9261-01) was performed on RNA extracted from the brain of [Tg(SNCAWT); Snca-/-] mice. These mice lack the murine Snca gene and contain a P1 artificial chromosome (PAC) that carries the entire human SNCA gene, including several kb of upstream and downstream genomic sequence. The mice were sacrificed following NIH guidelines (protocol number G-98-13). RNA was extracted from 14-day-old mouse brain using Trizol (Invitrogen, 15596-026). Oligo(dT)18-NotI was used for first-strand synthesis (Amersham Pharmacia 27-9261-01) and appropriate samples were treated with DNaseI (Gibco, 18068-015). PCR amplification was carried out using anchored oligo(dT)20 (Invitrogen 12577-011) or TTAAGGAACCAGTGCATACCAAAACACA (R1) and one of the following forward primers: CTACGAACCTGAAGCCTAAGAAAT (F1) or ATTTTATTTTTATCCCATCTCACT (F2).
The first 574 nucleotides of the 3′UTR of human SNCA were amplified using the primers TCTAGAGAAATATCTTTGCTCCCAGTTT and GGATCCTAAAGTGAGATGGGATAAAAATAAAAT from DNA extracted from human fibroblasts using standard protocols. The full-length 3′UTR was cloned into the pGL3-Promoter vector (Promega E1761), after XbaI/BamHI excision of the SV40 late polyA signal. The open reading frame (ORF) of luciferase of the pGL3-Promoter vector was replaced with the ORF of Luc2CP obtained from pGL4.16 (Promega E6711).
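The two amplification primers begin with the recognition sequences of the enzymes used for cloning: TCTAGA (XbaI) and GGATCC (BamHI). The short sketch below splits each primer into its restriction site and its gene-specific portion. The helper names `split_primer` and `reverse_complement` are hypothetical and serve only as an illustration.

```python
XBAI_SITE = "TCTAGA"   # XbaI recognition sequence
BAMHI_SITE = "GGATCC"  # BamHI recognition sequence

# Primer sequences as given in the text (5' to 3').
FORWARD_PRIMER = "TCTAGAGAAATATCTTTGCTCCCAGTTT"
REVERSE_PRIMER = "GGATCCTAAAGTGAGATGGGATAAAAATAAAAT"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def split_primer(primer: str, site: str) -> tuple[str, str]:
    """Split a primer into (restriction site, gene-specific portion)."""
    if not primer.startswith(site):
        raise ValueError(f"primer does not start with site {site}")
    return site, primer[len(site):]


if __name__ == "__main__":
    for primer, site in ((FORWARD_PRIMER, XBAI_SITE), (REVERSE_PRIMER, BAMHI_SITE)):
        _, specific = split_primer(primer, site)
        print(site, specific, reverse_complement(specific))
```

Read on the sense strand, the gene-specific portion of the reverse primer begins with the same 24 nucleotides as the F2 primer used for RT-PCR.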
All SNPs in the 3′UTR of human SNCA were obtained from dbSNP (NCBI). Haplotypes and linkage disequilibrium information across the 3′UTR of human SNCA were retrieved from the HapMap Genome Browser (www.hapmap.org) and the NIEHS Environmental Genome Project Web site (http://egp.gs.washington.edu/). The minor alleles of all SNPs were introduced into luciferase constructs using the QuikChange Multi Site-Directed Mutagenesis kit (Stratagene 200515). The mutagenesis primers were: ATCAGCAGTGATGGAAGTATCTGTACCT (for rs10024743), TATGAAATTTTACCATTTTGTGATGTG (for rs17016074), TCCCTTTCACTGAAGCGAATACATGGTAGCAGG (for rs34825021), TCCCTTTCACTGAAGTGAATACATAGTAGCAGG (for rs35716318) and TTATAAGATTTTTAGGTGTTTTTTAATGATACTGTC (for rs35733299).
SHSY5Y cells (ATCC CRL-2266) were transfected with 1 µg of luciferase construct and 100 ng of Renilla plasmid (Promega E2231) using FuGENE (Roche 11-814-443-001). Cell lysates were analyzed using the dual luciferase assay kit (Promega E1910) and the semi-automatic luminometer Lumat LB 9507 (EG&G Berthold).
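In a dual luciferase assay, the reporter signal is commonly divided by the co-transfected Renilla signal to correct for transfection efficiency, and allele effects are then expressed relative to the reference construct. The text does not spell out the exact calculation used, so the sketch below shows this usual normalization as an assumption; the function names and the readings in the usage example are hypothetical.

```python
from statistics import mean


def normalized_activity(firefly: float, renilla: float) -> float:
    """Reporter luminescence divided by the Renilla control luminescence."""
    if renilla <= 0:
        raise ValueError("Renilla reading must be positive")
    return firefly / renilla


def percent_change(minor_ratios: list[float], reference_ratios: list[float]) -> float:
    """Percent change of the mean minor-allele ratio relative to the reference."""
    reference_mean = mean(reference_ratios)
    return 100.0 * (mean(minor_ratios) - reference_mean) / reference_mean


if __name__ == "__main__":
    # Hypothetical luminometer readings (reporter, Renilla), illustration only.
    reference = [normalized_activity(f, r) for f, r in [(1000.0, 500.0), (1100.0, 550.0)]]
    minor = [normalized_activity(f, r) for f, r in [(1300.0, 500.0), (1500.0, 550.0)]]
    print(f"{percent_change(minor, reference):.1f}% change")
```

Averaging the normalized ratios across replicate wells before comparing alleles keeps a single well with poor transfection from dominating the result.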
To gain an understanding of the overall functional significance of the 3′UTR of SNCA, we examined its degree of conservation throughout evolution. Alignment of the 3′UTR of SNCA from twelve species revealed sequence blocks that have remained unchanged over millions of years (Fig. 1).
To quantify the degree of SNCA 3′UTR conservation seen across species, we subjected the 3′UTRs of 426 chicken-human ortholog pairs to global sequence alignment. We determined that the 3′UTR of SNCA was the 24th most highly conserved 3′UTR of the sequences analyzed, yielding a 610-nucleotide alignment with 71.6% identity. To extend this analysis, we performed local sequence alignments on a larger set of ortholog pairs to delineate regions of unusually strong 3′UTR conservation between chicken and human. About 25% of 2653 orthologous pairs yielded one or more local alignments of 80% identity or greater, indicating that conservation between chicken and human orthologs in limited regions of the 3′UTR is not a rare occurrence. However, the SNCA 3′UTR sequences displayed an atypically high level of conservation, with three significantly scoring local alignments: 88.7% identity (106 nucleotides), 86.2% identity (87 nucleotides), and 91.2% identity (34 nucleotides). We considered alignment quality (E-value) and relative coverage of length in evaluating the local alignment results, and SNCA ranked in the top 10% of chicken-human orthologs analyzed with respect to both criteria. These results suggest that human SNCA contains several functional elements that are distributed throughout its 3′UTR.
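Percent identity figures such as those above are usually computed as the share of alignment columns in which both sequences carry the same residue. The sketch below assumes that definition (identical columns divided by total alignment length, gaps included in the length); the alignment programs used in the study may define the denominator differently.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of alignment columns where both sequences have the same residue.

    Both inputs must be rows of the same alignment, so they have equal length.
    Columns containing a gap never count as identical.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for a, b in zip(aligned_a.upper(), aligned_b.upper()) if a == b and a != gap
    )
    return 100.0 * matches / len(aligned_a)


def identical_columns(alignment_length: int, identity_percent: float) -> int:
    """Approximate number of identical columns implied by a reported identity."""
    return round(alignment_length * identity_percent / 100.0)
```

Under this definition, the reported global alignment of 610 nucleotides at 71.6% identity corresponds to roughly 437 identical positions (`identical_columns(610, 71.6)`).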
To determine the pattern of alternative polyadenylation in the 3′UTR of human SNCA, we identified all putative polyA signals present in the 3′UTR of human SNCA (Fig. 2). Of the eight polyA signals predicted, three are detected in a short stretch of 25 nucleotides and are referred to herein as the “polyA triplet”. The eighth polyA signal is located downstream of a LINE element that is also present in the 3′UTR of some human SNCA transcripts (Fig. 2).
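A scan for putative polyA signals can be sketched as a search for known signal hexamers. The text does not state which motifs or which tool were used for the prediction; the example below assumes the canonical AATAAA hexamer and its common variant ATTAAA, and the function names are hypothetical. A second helper flags clusters such as three signals within 25 nucleotides.

```python
POLYA_HEXAMERS = ("AATAAA", "ATTAAA")  # assumed motif set, not taken from the text


def find_polya_signals(seq: str, hexamers=POLYA_HEXAMERS) -> list[tuple[int, str]]:
    """Return (1-based start position, hexamer) for every motif occurrence."""
    dna = seq.upper().replace("U", "T")  # accept RNA or DNA input
    hits = []
    for hexamer in hexamers:
        start = dna.find(hexamer)
        while start != -1:
            hits.append((start + 1, hexamer))
            start = dna.find(hexamer, start + 1)  # allow overlapping matches
    return sorted(hits)


def has_cluster(hits: list[tuple[int, str]], size: int = 3, window: int = 25) -> bool:
    """True if `size` consecutive signals fit entirely within `window` nucleotides."""
    positions = [pos for pos, _ in hits]
    hexamer_length = 6
    return any(
        positions[i + size - 1] + hexamer_length - positions[i] <= window
        for i in range(len(positions) - size + 1)
    )


if __name__ == "__main__":
    toy = "GGAATAAACCATTAAAGGAATAAACC"  # toy sequence, not SNCA
    signals = find_polya_signals(toy)
    print(signals, has_cluster(signals))
```

Motif matches are only candidates; whether a predicted signal is actually used has to be established from transcript evidence such as ESTs or RT-PCR.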
Alignment of all ESTs of human SNCA obtained from various tissues revealed that the eight putative polyA signals mediate four polyadenylation events (Fig. 3). EST analyses showed that the 3′UTRs of 95% of SNCA mRNAs are up to 574 nucleotides long (Fig. 3 and Table 1). This suggests that the first 574 nucleotides of the 3′UTR of SNCA are the most relevant for human physiology.
RT-PCR analyses using total mouse brain RNA confirmed the polyadenylation pattern shown in Fig. 3 (data not shown). RT-PCR analyses additionally confirmed a much higher abundance of SNCA mRNAs that are 574 nt long (data not shown).
Table 1 summarizes the polyA signal usage in the 3′UTR of human SNCA and includes information on predicted GU-rich elements. GU-rich elements located downstream of polyA signals are known as downstream sequence elements (DSE) and can affect polyadenylation efficiency [5, 7, 14].
Due to the clinical importance of increased alpha-synuclein expression in Parkinson disease, we retrieved all single nucleotide polymorphisms (SNPs) present in the first 574 nucleotides of the 3′UTR of human SNCA (Fig. 4a) and examined their effect on luciferase expression in neuroblastoma SHSY5Y cells. Of the five SNPs examined, only rs17016074 resulted in an increase in luciferase expression over numerous experiments performed under diverse conditions (Fig. 4b). The average increase in luciferase activity caused by the presence of the minor allele of rs17016074 is 32.8% (Fig. 4b). A single nucleotide change at the polymorphic locus rs17016074 can therefore affect reporter gene expression in neuroblastoma SHSY5Y cells.
If rs17016074 can affect expression of endogenous SNCA in humans, it could also affect susceptibility to Parkinson disease. Due to an apparent lack of recombination events, the low frequency of the minor alleles of most of these five SNPs and their differing frequency in various human populations (data available through NCBI, HapMap, and the NIEHS Environmental Genome Project Web sites), haplotypes containing the minor allele of rs17016074 with more than one minor allele of the four other SNPs are not found. We conclude, therefore, that combinations of minor alleles at the other four SNPs are unlikely to affect SNCA expression in vivo in humans.
In this study, we have shown that human SNCA mRNA molecules have any of four different 3′UTR sequences. We have also shown that the first 574 nucleotides of the 3′UTR of human SNCA are highly conserved and included in 95% of its mRNAs. Finally, we have identified a single nucleotide polymorphism, rs17016074, that can increase reporter gene expression in neuroblastoma SHSY5Y cells. The presence of a functional SNP in the 3′UTR of human SNCA qualifies the SNP and the entire SNCA 3′UTR as potential susceptibility loci for Parkinson disease.
Rs17016074 is located between the two most frequently used polyA sites in the 3′UTR of human SNCA (Fig. 4a). It lies immediately upstream of (but not in) a downstream sequence element, which could affect polyadenylation efficiency at the immediately upstream polyA signal (Table 1). The effect of rs17016074 on reporter gene expression suggests that a functionally important element is located between nucleotides 473 and 528 of the 3′UTR of human SNCA. The 3′UTR of human SNCA therefore contains elements that can indeed affect gene expression, a finding that is consistent with the biological importance suggested by the high conservation of the 3′UTR of SNCA over millions of years.
Knowing the 3′UTR sequence of SNCA mRNAs lays the framework for understanding the importance of genetic variation at the 3′ end of SNCA in Parkinson disease. Several studies have reported an association of the 3′ end of human SNCA with sporadic PD [16-18, 23, 27]. Until now, the molecular details underlying these associations were not clear, because the entire 3′UTR sequence of SNCA mRNAs in both cases and controls was not known. Of all the polymorphisms that are reported to be associated with sporadic PD in these studies, only rs356165 is actually present in the 3′UTR of SNCA mRNAs. Rs356165 is located in the LINE element of the 3′UTR of human SNCA (Fig. 2 and Fig. 3) and is therefore included in only 5% of SNCA mRNAs (Fig. 3). The minor allele of rs356165 could, therefore, have a direct effect on SNCA expression only through the small minority (about 5%) of SNCA mRNAs in which it is present. Alternatively, it is possible that the association between rs356165 and sporadic PD is driven by other SNPs in SNCA that are functionally important and in linkage disequilibrium with rs356165. The minor allele of rs17016074 is unlikely to be such a polymorphism, since none of the published association studies included PD patients of African descent, the population in which the minor allele of rs17016074 is detected. It is noteworthy that Parkinson disease is inadequately studied in people of African descent (African Americans and sub-Saharan Africans). Rs17016074 is therefore a novel polymorphism of potential clinical importance identified as a result of our studies.
In summary, we have determined the sequence of the 3′UTR present in SNCA mRNAs in humans and identified a single nucleotide polymorphism that is of potential clinical importance for Parkinson disease. Identifying polymorphisms that can increase alpha-synuclein expression may lead to an understanding of the etiology of sporadic Parkinson disease, which constitutes ~90% of all PD cases. As the total number of sporadic PD patients rises with increasing human lifespan, polymorphisms that stimulate alpha-synuclein expression become even more clinically significant.
We would like to thank Yien-Ming Kuo for help throughout this project and critical review of this manuscript and Lowell Umayam, Gabriel Renaud, and Anh-Dao Nguyen for bioinformatics support. This research was supported in part by the Intramural Research Program of the National Human Genome Research Institute, National Institutes of Health and by the Department of Medicine, UCSF School of Medicine.
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.