Recent genome-wide studies have revealed a remarkable correspondence between nucleosome positions and exon-intron boundaries, and several studies have implicated specific histone modifications in regulating alternative splicing. In addition, recent progress in cracking the ‘splicing code’ shows that sequence motifs carried on the nascent RNA molecule itself are sufficient to accurately predict tissue-specific alternative splicing patterns. Together, these studies shed light on the complex interplay between RNA sequence, DNA sequence, and chromatin properties in regulating splicing.
The fact that the majority of pre-mRNAs are spliced while they are still being transcribed has led to the proposal that local chromatin structure and histone modifications may play direct roles in regulating splicing. Indeed, several studies have shown that specific histone modifications can affect the association of splicing factors with chromatin and the efficiency of the splicing process [2-4]. A link between nucleosome positioning and exon-intron boundaries was first proposed in 1991, long before the functional link between splicing and transcription was established. Recent advances in the computational prediction and experimental verification of nucleosome positioning, based on underlying DNA sequence features [6-8], in combination with recently published high-resolution genome-wide maps of nucleosome positions [9,10] and histone modifications [11,12] in several organisms, have now enabled genome-wide comparisons of nucleosome positioning, histone modifications, and intron-exon architecture. These comparisons reveal surprising correlations between these features and shed light on the question of how chromatin features may help the splicing machinery to distinguish between exons and introns.
A distinct question is that of alternative splicing. Alternative splicing can generate many different transcripts from a single gene. Recent high-throughput transcriptome analyses have detected alternative splicing in approximately 95% of multi-exon human genes [13-15] and also extensively in mice, plants, flies [18,19], and yeast [20,21]. Alternative splicing is often regulated by trans-acting factors that are differentially expressed in different tissues or metabolic states and that bind specific sequence or structural motifs on the pre-mRNA [22-24]. It has long been a goal of the splicing field to crack the ‘splicing code’: to identify the pre-mRNA sequence features that can explain and predict not only the exact sites of constitutive splicing but also tissue-specific alternative splicing patterns.
Previous work has successfully identified pre-mRNA-encoded features that define the precise boundaries of certain classes of constitutively spliced exons [26-28] and additional splicing enhancers and silencers [28,29], some of which have been shown to correlate with exon inclusion levels in specific tissues [16,28]. However, it has been suggested that pre-mRNA-encoded information alone is not sufficient to explain the recognition of short exons in a desert of long introns or the modulation of this process in alternative splicing. The idea that local chromatin structure may provide an additional regulatory layer for splicing, particularly for alternative splicing, is currently gaining momentum, and both correlative evidence and functional evidence are emerging [30-35]. Additional insights into the role of RNA sequence come from the recent demonstration that by taking hundreds of RNA features into account, a ‘splicing code’ based on pre-mRNA features is able to qualitatively predict tissue-specific alternative splicing for thousands of vertebrate exons. Here, recent advances and current ideas on how chromatin and pre-mRNA sequence contribute to constitutive and alternative splicing are reviewed.
In 2009 and 2010, several groups [37-42] used bioinformatic approaches to uncover patterns from published genome-wide maps of nucleosome positioning in human T cells and in Caenorhabditis elegans, derived from deep sequencing of micrococcal nuclease-digested chromatin [9,10]. These recent bioinformatic analyses [37-42] now show that nucleosomes sit preferentially on exon sequences whereas introns are relatively depleted of nucleosomes (Figure 1). This remarkable arrangement was observed in several metazoan species and was found to be independent of transcriptional activity, suggesting that the arrangement is an inherent property of chromatin [37-39,41]. Strikingly, the average length observed for metazoan exons (140-150 base pairs) corresponds neatly with the length of DNA required to wrap a single nucleosome (147 base pairs) [39,41]. Further correlations to the splicing process were observed by several authors (e.g., exons flanked by weak splice sites or by long introns have a higher tendency to be bound by nucleosomes) [40,41], raising the idea that nucleosomes may help the splicing machinery by ‘marking’ exons that may otherwise be difficult to recognize (reviewed in [30-32]).
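The kind of comparison underlying these analyses can be illustrated with a minimal sketch: given per-base nucleosome read coverage and annotated exon and intron intervals, compute the mean occupancy of each class and their ratio. The coverage values, interval coordinates, and function name below are invented for illustration and are not taken from the cited data sets; real analyses would also need normalization and many more loci.

```python
from statistics import mean


def mean_occupancy(coverage, intervals):
    """Average per-base coverage over half-open (start, end) intervals."""
    values = [coverage[i] for start, end in intervals for i in range(start, end)]
    return mean(values) if values else 0.0


# Toy per-base nucleosome read coverage along a hypothetical 20-bp locus.
coverage = [1, 1, 1, 1, 1, 4, 5, 6, 5, 4, 1, 1, 1, 1, 1, 3, 5, 5, 3, 1]

# Hypothetical annotation: two exons separated and flanked by intronic sequence.
exons = [(5, 10), (15, 19)]
introns = [(0, 5), (10, 15), (19, 20)]

exon_mean = mean_occupancy(coverage, exons)
intron_mean = mean_occupancy(coverage, introns)

# A ratio above 1 indicates that exons carry more nucleosome signal than introns.
enrichment = exon_mean / intron_mean
print(f"exon mean = {exon_mean:.2f}, intron mean = {intron_mean:.2f}, "
      f"enrichment = {enrichment:.2f}")
```

In this toy locus the ratio is well above 1 by construction; whether such a ratio reflects positioning alone or also modification status is exactly the normalization question raised for the ChIP-based studies.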
By analysis of chromatin immunoprecipitation (ChIP) combined with high-throughput sequencing (ChIP-seq) and ChIP followed by microarray analysis (ChIP-chip) data sets [9,11,12], several studies have examined correlations between histone modifications and exon-intron boundaries and have reported conflicting results. Several authors conclude that specific histone modifications are enriched on exons compared with introns, suggesting an active marking mechanism [12,37,40,43], whereas others argue that these apparent enrichments are due mostly to nucleosome positioning, which is independent of modification status [38,39,41]. It has been suggested that these discrepancies may be due mainly to difficulties of normalization of ChIP data for nucleosome occupancy, compounded by the fact that occupancy studies and modification studies were performed with different techniques (micrococcal nuclease digestion versus ChIP). It is also possible that gene-specific differences exist but go undetected in global analyses.
To what extent can we understand the preference of nucleosomes for exonic sequences in terms of known pre-mRNA splicing signals? Figure 1 summarizes the RNA, DNA, and chromatin features at exon-intron boundaries. The preference of nucleosomes for exons appears to be a consequence not only of the higher guanine and cytosine content of exons [38,39,41] but also of a high density of sequences that repel nucleosomes exactly at the intron-exon boundary and a depletion of these sequences within the exon itself relative to the adjacent intron (Figure 1). Particularly interesting is the polypyrimidine tract (PPT), typically a long (10-20 nucleotides) pyrimidine-rich run, predominantly of uracil bases, at the 3′ intron end in the pre-mRNA, that is specifically recognized by spliceosome components (Figure 1). At the DNA level, the PPT corresponds to poly T, one of the strongest nucleosome-repelling sequences identified by Kaplan et al. Thus, it is clear that splicing signal sequences play a role both at the RNA level, for recognition by RNA-binding proteins of the splicing machinery, and at the DNA level, in determining the positions of nucleosomes. An additional role of splice site sequences in increasing RNA flexibility has been proposed. Is nucleosome positioning thus merely a by-product of the RNA sequence or does it have a causal role in splicing? Experimental evidence beyond correlation is currently lacking but several possible roles have been proposed (summarized in Figure 2).
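The two DNA-level properties discussed here, guanine and cytosine content and the presence of a poly T run at the 3′ intron end, are simple to compute. The following is a minimal sketch on invented toy sequences; the sequences and function names are assumptions made for illustration and do not come from any of the cited studies.

```python
def gc_content(seq):
    """Fraction of G and C bases in a DNA sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def longest_run(seq, base="T"):
    """Length of the longest uninterrupted run of `base` in the sequence."""
    best = current = 0
    for ch in seq.upper():
        current = current + 1 if ch == base else 0
        best = max(best, current)
    return best


# Toy 3' intron end: a 12-nt poly T tract (the DNA counterpart of the PPT)
# followed by a CAG ending in the conserved AG dinucleotide.
intron_3prime = "A" + "T" * 12 + "CAG"

# Toy GC-rich exon start.
exon = "GCCGGCATGGCC"

print("intron GC:", gc_content(intron_3prime))
print("exon GC:", round(gc_content(exon), 3))
print("longest T run in intron:", longest_run(intron_3prime))
```

A long T run combined with low GC content on the intronic side and higher GC content on the exonic side is the pattern that, according to the studies above, disfavors nucleosomes at the boundary and favors them on the exon.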
The models proposed so far fall into one of two non-mutually exclusive classes - the ‘recruitment’ models and the ‘kinetic coupling’ models - and are summarized in Figure 2. Recruitment models favor the idea that nucleosome positioning or nucleosome modifications on exons guide the splicing machinery to the right place (Figure 2), whereas kinetic coupling proposes that the presence of a nucleosome in the path of the polymerase decreases the speed of transcription and thus allows more time for splicing to occur (Figure 2, arrow 5). In favor of such ‘speed bump’ models, single-molecule in vitro experiments have shown that the speed of RNA polymerase II (Pol II) transcription is modulated by nucleosomal barriers. On the other hand, a recent in vivo study of Pol II transcription rates measured similar speeds on exonic and intronic sequences and showed that although splicing does indeed occur cotranscriptionally, it lags substantially behind the transcription process, being approximately twice as slow. This raises the question of how much the chromatin features ahead of the polymerase can influence splicing events that take place far behind it (Figure 2). However, this study was limited to a small number of genes, and it is not known whether they contain exons that are regulated in a chromatin or Pol II elongation-dependent manner (or both).
If nucleosome positions do affect splicing, then how could such a system permit the vast amount of flexibility observed in alternative splicing? Several authors have proposed that nucleosome remodelers, a relaxation of nucleosome positioning at alternatively spliced exons, or tissue-specific histone modifications may account for tissue-specific differences in nucleosome positioning or properties and thus facilitate alternative splicing [30-32,39]. In support of this idea, ectopic recruitment of heterochromatin modifications to the vicinity of an alternative exon was found to reduce Pol II speed and to affect the splicing of that exon. Furthermore, investigation of the fibroblast growth factor receptor 2 (FGFR2) locus revealed cell line-specific enrichment of histone H3 lysine 36 trimethylation (H3K36me3), which was shown to be required to promote the exclusion of one exon of the gene. An adaptor protein recruits the proteins required for specific exclusion of this exon to the sites of H3K36 methylation, demonstrating a direct mechanistic link between cell type-specific chromatin modification and exon exclusion. Interestingly, the H3K36me3 enrichments were broadly spread across the FGFR2 locus and were not limited to the excluded exon, suggesting that nucleosome positioning does not play the major role here. Rather, the H3K36me3 modification serves to recruit the necessary factors to the general vicinity of an alternatively spliced exon in the appropriate cell type, whereas sequence features of the pre-mRNA itself determine which exon is to be excluded.
Although it remains unclear to what extent alternative nucleosome positions play a role in alternative splicing, a recent study demonstrates that pre-mRNA sequence features are an essential component and perhaps contain sufficient information for many alternative splicing events. The authors examined the splicing patterns of over 3000 alternative exons in 27 mouse tissues and extracted over 1000 RNA sequence features, including known and novel motifs and structural features. From these, the authors compiled 200 features that were diagnostic and qualitatively predictive for tissue-specific alternative splicing. The predictive power of the code was tested by cross-validation and comparison with experimental data. Depending on the exon type, the code was able to correctly predict alternative splicing in central nervous system and muscle for 65-95% of test exons. Importantly, the code is combinatorial, and the authors conclude that large numbers of sequence features are required to ensure tissue-specific splicing. Although this study focuses on the idea that these are pre-mRNA sequence features, it is also possible that several of the newly identified features may work at the DNA level or at both RNA and DNA levels, as is the case for the PPT (Figure 1).
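To make the notion of region-specific RNA sequence features concrete, the following minimal sketch counts motif occurrences separately in the upstream intron, the exon, and the downstream intron, producing one feature per (region, motif) pair. It is only an illustration of feature extraction: the sequences are invented, the two motif strings are placeholders rather than the feature set of the study, and the study's actual features and predictive model are not reproduced here.

```python
def count_overlapping(seq, motif):
    """Count occurrences of `motif` in `seq`, allowing overlapping matches."""
    return sum(
        1
        for i in range(len(seq) - len(motif) + 1)
        if seq.startswith(motif, i)
    )


def extract_features(regions, motifs):
    """Return a dict mapping (region name, motif) to the motif count in that region."""
    return {
        (name, motif): count_overlapping(seq, motif)
        for name, seq in regions.items()
        for motif in motifs
    }


# Invented pre-mRNA sequences around one hypothetical alternative exon.
regions = {
    "upstream_intron": "UUUCUGCAUGUUUCAG",
    "exon": "GGAUGCAUGCAUGG",
    "downstream_intron": "GUAAGUUUUU",
}

# Placeholder motifs chosen for illustration only.
motifs = ["UGCAUG", "UUUU"]

features = extract_features(regions, motifs)
for key, value in sorted(features.items()):
    print(key, value)
```

Keeping the region as part of the feature key reflects the combinatorial idea described above: the same motif can count as a different feature depending on where it lies relative to the exon.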
In the future, it will be essential to determine the relative contributions of RNA sequence features, nucleosome positioning, and histone modifications to constitutive and alternative splicing. The data so far correlating nucleosome positioning to exon-intron architecture have been limited to a few cell types. It will be important to see whether nucleosome positions are indeed different in different tissues, as has been proposed via thermodynamic competition with tissue-specific transcription factors and observed for regulatory regions of specific loci. If this is also the case on a genome-wide scale, then do alternative nucleosome positions correlate with alternatively spliced exons?
Furthermore, it would be extremely interesting to investigate whether any of the RNA sequence features that are predictive for alternative splicing could play an additional role at the DNA level (e.g., as nucleosome positioning or repelling sequences or as binding sites for site-specific DNA binding proteins that may compete with nucleosomes in a tissue-specific manner).
Finally, it will be essential to go beyond global correlations and to determine cause and effect for specific loci. Unraveling the relative contributions of RNA sequence, DNA sequence, and chromatin architecture requires overcoming an inherent difficulty in such experiments: any change in DNA sequence also changes the RNA sequence. Strategies are therefore needed by which nucleosome positions can be modulated without affecting the underlying DNA sequence (e.g., by manipulating levels of remodelers or competing transcription factors).
Work in the author's laboratory is supported by the Austrian Academy of Sciences, the FWF (Fonds zur Förderung der wissenschaftlichen Forschung [Austrian Scientific Research Fund]), Gen-AU (Genomforschung in Österreich [Genome Research in Austria]), The Epigenome Network of Excellence (EU-FP6), and the FCT (Fundação para a Ciência e Tecnologia [Portuguese Foundation for Science and Technology]).
The electronic version of this article is the complete one and can be found at: http://f1000.com/reports/b/2/74
The author declares that she has no competing interests.