D.D.L. and R.B.D. wrote the paper. D.D.L., A.M., and J.J.F. did the biochemical and CLIP experiments. J.U. and M.K. developed ASPIRE2 and analyzed exon junction array data. D.D.L., S.W.C., X.W. and R.B.D. did bioinformatic analysis. D.D.L., J.C.D. and R.B.D. analyzed the data. T.A.C., A.C.S. and J.E.B. developed the exon junction microarray.
Protein-RNA interactions play critical roles in all aspects of gene expression. Here we develop a genome-wide means of mapping protein-RNA binding sites in vivo, by high throughput sequencing of RNA isolated by crosslinking immunoprecipitation (HITS-CLIP). HITS-CLIP analysis of the neuron-specific splicing factor Nova2 revealed extremely reproducible RNA binding maps in multiple mouse brains. These maps provide genome-wide in vivo biochemical footprints confirming the previous prediction that the position of Nova binding determines the outcome of alternative splicing; moreover, they are sufficiently powerful to predict Nova action de novo. HITS-CLIP revealed a large number of Nova-RNA interactions in 3′ UTRs, leading to the discovery that Nova regulates alternative polyadenylation in the brain. HITS-CLIP, therefore, provides a robust, unbiased means to identify functional protein-RNA interactions in vivo.
The discovery of RNA molecules with catalytic activity1 led to the hypothesis that from the earliest life forms, RNA regulation evolved to play critical roles in living organisms2-5. Efforts to develop comprehensive understanding of protein-RNA interactions in vivo have combined genetics, bioinformatics, microarray-profiling, and biochemical approaches. However, the latter have been hampered by methodologic problems;6,7 for example, co-immunoprecipitations can lead to re-association of protein-RNA complexes in vitro8, non-specifically bound RNAs, and additional co-precipitating RNA binding proteins (RNABPs)9.
We have taken a different approach toward understanding protein-RNA interactions by developing a crosslinking protocol that works in tissues, and can therefore be applied prior to protein purification. This method, termed CLIP10,11, uses UV-irradiation to induce covalent crosslinks between protein-RNA complexes in situ, allowing rigorous purification of RNABPs along with small fragments of RNA, which can be amplified and sequenced. CLIP has been used to study direct protein-RNA interactions extant in living cells11-13, including identification of RNA targets11 for the KH-type RNABP Nova14,15, and the discovery of hnRNPA1-dependent regulation of a miRNA12.
Genome-wide efforts to understand Nova function, using exon junction microarrays and bioinformatic analysis of Nova binding sites (YCAY clusters15, characterized biochemically16 and crystallographically17) suggested that the position of Nova binding to pre-mRNA predicted its action to enhance or inhibit alternate exon inclusion18. To identify direct Nova-RNA interactions in vivo, we applied high throughput sequencing methods to CLIP. Here we demonstrate that this approach uncovers new biology in the brain, identifying functional interactions that mediate tissue-specific alternative RNA processing.
We studied Nova-RNA interactions in the mouse neocortex, which expresses the Nova2 protein19. We identified 2,481 Nova-bound RNAs (CLIP tags)10,11 from five experiments using traditional CLIP strategies10,11 and 412,686 CLIP tags from three experiments using high throughput pyrosequencing. Tags were filtered to eliminate those with imperfect (<80%) matches to genomic sequences, with multiple genomic hits, or that were exact duplicates. The resulting set of 168,632 unique tags included 123,734 tags mapping to mRNA-encoding genes and 44,898 tags mapping to intergenic regions (Supplemental Fig. 2).
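The three filtering criteria described above can be sketched in a few lines of Python. This is a minimal illustration rather than the pipeline used in the study: the `Tag` record, its `identity` and `n_hits` fields, and the definition of an exact duplicate as a tag with identical chromosome, coordinates and strand are all assumptions made for the example.

```python
from typing import NamedTuple

class Tag(NamedTuple):
    # Hypothetical record for one aligned CLIP tag.
    chrom: str
    start: int
    end: int
    strand: str
    identity: float  # fraction of the read matching the genome (assumed field)
    n_hits: int      # number of genomic locations the read aligns to (assumed field)

def filter_tags(tags, min_identity=0.80):
    """Keep tags that match well, map uniquely, and are not exact duplicates."""
    seen = set()
    kept = []
    for tag in tags:
        if tag.identity < min_identity:  # imperfect (<80%) genomic match
            continue
        if tag.n_hits != 1:              # multiple genomic hits
            continue
        key = (tag.chrom, tag.start, tag.end, tag.strand)
        if key in seen:                  # exact duplicate (assumed definition)
            continue
        seen.add(key)
        kept.append(tag)
    return kept
```

Applied in this order, each discarded tag is attributed to the first criterion it fails, which makes per-criterion tallies straightforward to add if needed.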
As negative controls, we repeated Nova CLIP using Nova KO brain or an irrelevant antibody, but were unable to amplify PCR products. We also sequenced 43,000 crosslinked RNA tags remaining after Nova immunoprecipitation, corresponding to a sample of all remaining RNABP-RNA interactions in the brain, and compared the frequency of Nova binding sites15 in these control tags with that in Nova HITS-CLIP tags. Only the latter showed enrichment for YCAY sequences (observed:expected YCAY frequency was 3.56 for Nova tags, compared to 0.99 for the control tags, determined by Chi-square distribution, p<10⁻²²⁷; see Methods), demonstrating the specificity of CLIP.
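An observed:expected YCAY ratio of the kind reported above can be illustrated as follows. The null model used in the study is given in its Methods and is not reproduced here; this sketch assumes equal, independent base frequencies, so that a random tetramer matches YCAY with probability 1/64. The function names are hypothetical, and the input is assumed to contain at least one tag of four or more nucleotides.

```python
import re

# Lookahead so that overlapping motifs (e.g. TCATCAT) are each counted.
YCAY = re.compile(r"(?=[CT]CA[CT])")

def ycay_counts(seqs):
    """Return (observed, positions): YCAY matches and tetramer start positions."""
    observed = 0
    positions = 0
    for seq in seqs:
        seq = seq.upper().replace("U", "T")
        observed += len(YCAY.findall(seq))
        positions += max(len(seq) - 3, 0)
    return observed, positions

def observed_to_expected(observed, positions, p_motif=1 / 64):
    """Ratio of the observed YCAY count to the count expected by chance.

    p_motif = 1/64 assumes uniform base composition
    (Y: 2/4, C: 1/4, A: 1/4, Y: 2/4); this null model is an assumption.
    """
    return observed / (positions * p_motif)

def chi_square_stat(observed, positions, p_motif=1 / 64):
    """One-degree-of-freedom goodness-of-fit statistic (motif vs. non-motif)."""
    expected = positions * p_motif
    rest_observed = positions - observed
    rest_expected = positions - expected
    return ((observed - expected) ** 2 / expected
            + (rest_observed - rest_expected) ** 2 / rest_expected)
```

A ratio near 1 corresponds to no enrichment, as seen for the control tags; converting the statistic to a p-value requires a chi-square distribution function, which is left out to keep the sketch dependency-free.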
We reasoned that one way to distinguish between biologically robust and transient Nova-RNA interactions would be to assess the reproducibility with which HITS-CLIP tags were identified in individual mice. Tags obtained from the neocortex of two P13 littermates showed a remarkable degree of similarity; when equal numbers were aligned across the entire mouse genome in 10kb windows, a high correlation was evident both graphically (Fig. 1a) and statistically (R²=0.75). To more accurately assess reproducibility, we focused on sites containing overlapping tags (“clusters”). 19,156 clusters had at least two tags, 508 had 20 or more tags (Supplemental Fig. 3a), and 608 RefSeq transcripts had ≥ 6 clusters (Supplemental Fig. 2). Inter-animal clusters (sites containing at least one tag from each littermate) were highly reproducible; over 90% (9,697/10,740) of sites containing tags from one animal also had at least one littermate tag. Finally, inspection of individual chromosomes and genes revealed that Nova clusters were highly reproducible in both position and extent of crosslinking, and specific to a subset of brain-expressed RefSeq genes (Fig. 1 and Supplemental Fig. 2).
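The two reproducibility measures described above, a correlation of tag counts in fixed genomic windows and the grouping of overlapping tags into clusters, can be sketched as follows. The tuple layout `(chrom, start, end, animal)`, the binning of a tag by its start coordinate, the omission of windows empty in both animals, and the decision to ignore strand are all assumptions made for this illustration.

```python
from collections import Counter

def window_counts(tags, window=10_000):
    """Count tags per (chrom, window index); a tag is binned by its start."""
    return Counter((chrom, start // window) for chrom, start, _end, _animal in tags)

def r_squared(counts_a, counts_b):
    """Squared Pearson correlation over windows seen in either animal."""
    keys = sorted(set(counts_a) | set(counts_b))
    xs = [counts_a.get(k, 0) for k in keys]
    ys = [counts_b.get(k, 0) for k in keys]
    n = len(keys)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant counts")
    return sxy * sxy / (sxx * syy)

def cluster_tags(tags):
    """Merge overlapping tags (half-open coordinates) into clusters."""
    clusters = []
    for chrom, start, end, animal in sorted(tags):
        last = clusters[-1] if clusters else None
        if last and last["chrom"] == chrom and start < last["end"]:
            last["end"] = max(last["end"], end)
            last["n_tags"] += 1
            last["animals"].add(animal)
        else:
            clusters.append({"chrom": chrom, "start": start, "end": end,
                             "n_tags": 1, "animals": {animal}})
    return clusters

def inter_animal(clusters):
    """Clusters containing at least one tag from more than one animal."""
    return [c for c in clusters if len(c["animals"]) > 1]
```

With clusters in hand, the fraction of sites supported by both littermates is simply `len(inter_animal(clusters))` divided by the number of sites considered.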
To determine how faithfully Nova CLIP tag clusters reflect previously defined consensus Nova binding sites15, we analyzed them for consensus motifs by MEME analysis, and found they were significantly enriched in YCAY motifs (Supplemental Fig. 3). This was evident across all 19,156 Nova CLIP tag clusters (3.9-fold; p<10⁻²²⁷), and in tags associated with functional Nova interactions (see below). Taken together, these observations indicate that HITS-CLIP reproducibly identifies discrete, YCAY-rich, Nova binding sites in mouse brain RNAs, and suggest that these binding sites may point to positions of functional Nova-RNA interactions.
HITS-CLIP offered a chance to compare predicted sites of Nova-RNA regulation derived from bioinformatic and microarray analysis11,18,20 with interaction sites observed by in vivo crosslinking. 39 previously validated20 Nova2-regulated transcripts harbored Nova CLIP tags (ranging from 1 to 96 tags) within 3 kb of the alternative exon local region (bounded by the constitutive splice donor and acceptor exons) and 34 of these harbored CLIP-tag clusters. The position and YCAY content (4.1-fold enrichment; p<10⁻¹⁵⁶) of these clusters was consistent with the predicted Nova bioinformatic map18. For example, YCAY-rich HITS-CLIP clusters were present downstream of the known Nova2 target Grin1 exon 19 (E19; Fig. 1b-c(ii))20, in a position previously predicted by the Nova bioinformatic map18 (Supplemental Fig. 4).
We also observed HITS-CLIP tags in Grin1 upstream of an alternative exon (exon 4; E4) that was not a previously known Nova target. The position of these tags predicted Nova-dependent inhibition of E4 inclusion, which was confirmed experimentally (Fig. 1b-c(i)), suggesting that HITS-CLIP might provide a general means to identify new sites of protein-RNA regulation. Six additional transcripts with Nova HITS-CLIP clusters near regulated splice sites were tested; each was aberrantly spliced in Nova2 KO compared to WT brain in a manner conforming to the Nova bioinformatic map (Supplemental Fig. 5).
To further assess how the position of Nova binding related to the outcome of such splicing events, we analyzed Nova HITS-CLIP tags in Nova-regulated exons newly identified using an updated version of exon-junction microarrays20 harboring probesets for exon junctions in ~145,000 transcripts. Arrays were interrogated with RNA from WT or Nova2 null neocortex, and results analyzed with ASPIRE2, a revision of the ASPIRE algorithm20 that searches for reciprocal changes in exon-included and exon-excluded probesets. We identified 32/45 previously validated20 Nova2-dependent exons, and 46 new candidates with |ΔI| values ranging from 0.19 - 0.60 and with characteristics seen previously20 (Supplemental Fig. 6, Supplemental Tables 1-2). To simplify subsequent analysis, we focused on 35 cassette exons, and confirmed that alternative splicing was Nova2-dependent in 7/7 (Supplemental Fig. 4).
We generated a map in which we placed all 1,085 Nova CLIP tags identified from a total of 71 Nova2-regulated cassette exons (43 validated targets, and 28 newly predicted targets with ΔI>0.2 and ΔI-tTest>25; see Methods) onto a single composite pre-mRNA (Fig. 2a; Supplemental Fig. 7). These tags spanned 11.5kb, but were very heavily concentrated around splice sites, in positions that corresponded extremely well with the bioinformatically predicted Nova map18, and with prior biochemical analysis of Nova-dependent splicing21-23 (Fig. 2a). Furthermore, clusters in these regions showed a 3.4-fold enrichment in YCAY elements (p<10⁻¹⁷⁴), with 72 of 123 clusters containing at least 3 YCAY elements within 30 nt, consistent with prior biochemical data21-23.
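The criterion of at least 3 YCAY elements within 30 nt can be checked with a short scan over motif start positions. This is a hedged sketch: it assumes that "within 30 nt" means the span from the first nucleotide of the first motif to the last nucleotide of the third motif is at most 30 nt, and the function name is hypothetical.

```python
import re

# Lookahead so that overlapping YCAY motifs are each found.
_YCAY = re.compile(r"(?=[CT]CA[CT])")

def has_ycay_cluster(seq, min_sites=3, window=30):
    """True if min_sites YCAY motifs fit entirely inside a window of given size."""
    seq = seq.upper().replace("U", "T")
    starts = [m.start() for m in _YCAY.finditer(seq)]
    for i in range(len(starts) - min_sites + 1):
        first = starts[i]
        last = starts[i + min_sites - 1]
        # The last motif occupies positions last .. last + 3.
        if last + 4 - first <= window:
            return True
    return False
```

Because the start positions are already sorted, checking each run of `min_sites` consecutive motifs is sufficient; no sliding window over every nucleotide is needed.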
We also noted some HITS-CLIP tags in unanticipated regions. For example, we observed frequent binding of Nova in intronic sequences upstream of Nova-regulated exons. However, binding to these sites was only robust in a limited number of transcripts (Fig. 2a; Supplemental Fig. 7). To generate a map representative of consensus Nova action, we normalized our data, first to the number and distribution of CLIP tags between transcripts, and then to the number of different transcripts with tags at a given position (complexity). This allowed us to focus on potential regulatory binding sites common to several transcripts. This “normalized complexity” map (Fig. 2b) demonstrated that Nova CLIP tags corresponded very precisely to the bioinformatically predicted sites of Nova action (Fig. 2b, insets). We conclude that HITS-CLIP confirms the hypothesis that Nova binding occurs directly on YCAY-rich elements near splice sites in vivo, and that the position of such Nova binding determines the outcome of Nova-dependent splicing regulation.
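The idea behind the normalized complexity map can be illustrated with a small sketch. The exact normalization is defined in the Methods, which are not reproduced here, so the weighting below is only one possible reading: each tag is weighted by the inverse of its transcript's total tag count, so that no single heavily tagged transcript dominates, and each bin is then scaled by the number of distinct transcripts contributing to it. The function name, the `(transcript, position)` input layout and the 50-nt bin size are all assumptions.

```python
from collections import defaultdict

def normalized_complexity(tags, bin_size=50):
    """Score position bins on a composite pre-mRNA.

    tags: list of (transcript, position) pairs, with position measured
    relative to a reference splice site (hypothetical layout).
    """
    tags_per_transcript = defaultdict(int)
    for transcript, _pos in tags:
        tags_per_transcript[transcript] += 1

    weight = defaultdict(float)
    transcripts_in_bin = defaultdict(set)
    for transcript, pos in tags:
        b = pos // bin_size  # floor division also handles upstream (negative) positions
        weight[b] += 1.0 / tags_per_transcript[transcript]
        transcripts_in_bin[b].add(transcript)

    # Scale by complexity: bins shared by several transcripts score higher.
    return {b: weight[b] * len(transcripts_in_bin[b]) for b in weight}
```

Under this weighting, a bin bound robustly in a single transcript scores lower than a bin bound modestly in several, which matches the stated aim of focusing on sites common to several transcripts.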
We next explored whether other HITS-CLIP clusters might reveal new Nova functions. Analysis of the genomic position of Nova clusters revealed that 23% of Nova HITS-CLIP tags mapped to intergenic regions (Fig. 3a). To examine the possibility that these tags may correspond to previously undescribed isoforms of RefSeq genes with alternative terminal exons, we examined the distance between intergenic clusters and neighboring RefSeq genes. There was an exponential increase in the cumulative number of tags within 10kb downstream of known stop codons, compared to linear increases beyond 10kb (717 versus 101 clusters within 10kb of the stop or start codon, respectively; Fig. 3b), or upstream of known start codons. This suggests that in addition to binding known 3′ UTRs (Fig. 3a), Nova binds to unannotated 3′ UTR extensions of known genes. Within 3′ UTRs, tags were enriched near poly(A) sites, and to a lesser degree near stop codons (Fig. 3c). A large number of clusters were positioned within a few hundred nt of poly(A) sites (Fig. 3d), a region that contains core and potential auxiliary elements controlling transcript termination and poly(A) site utilization24,25.
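The distance analysis above amounts to measuring how far each intergenic cluster lies downstream of a stop codon on the gene's strand and then accumulating counts with increasing distance. A minimal sketch follows; the function names, the single-coordinate representation of clusters and stop codons, and the 1-kb step are assumptions for illustration.

```python
def downstream_distance(cluster_pos, stop_pos, strand):
    """Distance from a stop codon to a cluster in the direction of transcription.

    Returns None if the cluster lies upstream of the stop codon.
    """
    if strand == "+":
        d = cluster_pos - stop_pos
    else:
        d = stop_pos - cluster_pos
    return d if d >= 0 else None

def cumulative_counts(distances, step=1_000, max_dist=20_000):
    """Number of clusters within each distance threshold (step, 2*step, ...)."""
    return [(edge, sum(1 for d in distances if d <= edge))
            for edge in range(step, max_dist + 1, step)]
```

A curve that rises steeply over the first 10 kb and then flattens to a steady slope is the pattern described in the text; the same functions applied to start codons, with the direction reversed, would give the upstream comparison.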
These observations suggested that Nova might function in a second pre-mRNA processing event in the mouse brain, namely regulated poly(A) site utilization (alternative polyadenylation), a process about which little is known. We analyzed alternative polyadenylation by hybridizing Affymetrix Exon Arrays with Nova2 WT versus KO brain RNA, and screened for changes in alternate 3′ UTR relative to total mRNA abundance (Supplemental Fig. 8). We identified 297 transcripts with such differences (≥1.5-fold; p<0.05); 43 of these contained a total of 100 3′ UTR CLIP tag clusters, which were preferentially present near poly(A) sites (Fig. 3d).
We tested poly(A) site use in two candidates, Cugbp2 and Slc8a1. Both have microarray-predicted Nova-dependent changes in 3′ UTR usage (1.5- and 2-fold, respectively), and both contained CLIP tags near poly(A) sites (Fig. 4a; Supplemental Fig. 9). RNase protection analysis (RPA) demonstrated that utilization of these poly(A) sites was increased in Nova2 KO brain (Fig. 4a, 4e; Supplemental Fig. 9); ΔC (the change in percent transcripts cleaved at the relevant poly(A) site, analogous to ΔI)18 for these transcripts was 0.22-0.25 (for example, 41% to 66% utilization of pA2 in Cugbp2 transcripts in WT vs. Nova2 KO brain; Fig. 4a), comparable in magnitude to Nova-dependent changes in alternative exon usage. Furthermore, the increase in proximal poly(A) use in Cugbp2 and Slc8a1 transcripts in Nova2 KO brain was associated with reciprocal decreases in processing at distal poly(A) sites, suggesting that changes in the relative levels of alternatively polyadenylated Cugbp2 and Slc8a1 mRNAs are not due to differences in isoform stability, but result directly from aberrant poly(A) site utilization in the Nova2 KO.
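The ΔC value quoted above is a difference of two usage fractions, which the following sketch makes explicit. The function names and the sign convention (KO minus WT) are assumptions; the text defines ΔC only as the change in the fraction of transcripts cleaved at the relevant poly(A) site.

```python
def site_usage(site_signal, other_signal):
    """Fraction of transcripts cleaved at the poly(A) site of interest."""
    return site_signal / (site_signal + other_signal)

def delta_c(usage_wt, usage_ko):
    """Change in poly(A) site usage between genotypes (KO minus WT, assumed)."""
    return usage_ko - usage_wt

# Cugbp2 pA2 utilization reported in the text: 41% in WT, 66% in Nova2 KO.
print(round(delta_c(0.41, 0.66), 2))  # 0.25
```

The quantification signals passed to `site_usage` would come from the RPA bands for the site of interest and for all other cleavage products of the same transcript.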
We used qRT-PCR to measure the relative abundance of alternative poly(A) isoforms of 29 additional candidate Nova targets (from Fig. 3d). 12 transcripts had significant changes in levels of alternatively polyadenylated transcripts (p<0.04 in 11/12; Fig. 4b). These transcripts did not change in overall transcript abundance (data not shown); 9 were consistent with a Nova-dependent action to block, and 3 to enhance, utilization of the adjacent poly(A) site. 17 transcripts had no change in poly(A) site usage in Nova2 KO brain (most of these were of low abundance, at least in alternate 3′ isoforms) and/or had confounding changes in overall steady-state transcript levels, and 2 transcripts had 3′ UTR changes as well as alternate splicing of terminal exons. We mapped Nova CLIP tags from the 12 Nova-regulated 3′ UTRs onto a composite transcript containing an alternative polyadenylation site (Fig. 4c), and onto a normalized complexity Nova-RNA 3′ UTR interaction map (Fig. 4d). This revealed reproducible Nova binding to discrete YCAY-rich (3.5-fold, p<10⁻²²⁷) regions flanking Nova-regulated alternative poly(A) sites. Taken together, the quantitative analysis of transcript levels and the HITS-CLIP map demonstrate that Nova binds to YCAY-rich elements flanking poly(A) sites and is necessary for their proper regulation in mouse brain.
To test whether Nova binding to these 3′ UTRs is sufficient to suppress poly(A) site utilization, we generated a GFP reporter construct containing parts of the Slc8a1 3′ UTR harboring alternative poly(A) sites (Supplemental Fig. 9b), as well as mutant constructs in which YCAY elements were mutated to YACY (a sequence to which Nova does not bind15). Co-transfection of Nova2-expressing constructs with these reporters into 293T cells (which do not express Nova22) demonstrated a Nova- and YCAY-dependent reduction in alternative poly(A) site usage of the same magnitude and direction as seen in WT versus Nova2 KO neocortex (Fig. 4e). Taken together, these results identify direct Nova-RNA interactions in the 3′ UTR that regulate brain-specific alternative RNA processing.
Genome-wide screens have been used to establish correlations between the action of RNABPs and biologic diversity6,7,26-30, but are unable to identify direct sites of RNA regulation. HITS-CLIP provides a general solution to this problem by generating a transcriptome-wide biochemical “footprint” of protein-RNA interactions in living tissues. This in turn allows a direct comparison of predicted (e.g. microarray or bioinformatically derived) and observed (HITS-CLIP) sites of action, and thereby provides a new platform for deriving functional RNABP maps and for assessing models of protein-RNA regulation.
HITS-CLIP extends our transcriptome-wide understanding of Nova-RNA interactions, which was previously limited to bioinformatic analysis of YCAY clusters within 200nt of alternate or bounding constitutive exons18. Analysis of HITS-CLIP tags mapping to 71 Nova-regulated exons (Fig. 2) yielded a more refined map of Nova action. Over 91% of the normalized Nova binding associated with exon inclusion (Fig. 2b) occurred within 500 nt of either the alternative 5′ or constitutive 3′ splice sites, while 74% of the normalized Nova binding associated with exon exclusion (Fig. 2b) occurred within 500 nt of the constitutive 5′ splice donor or surrounding the alternate exon. This strengthens the conclusion that the position of Nova-RNA interaction determines the outcome of splicing, an observation that may extend to splicing factors more generally31. Importantly, the strength of these correlations suggests that the HITS-CLIP map is sufficiently robust to predict protein-RNA regulation, as shown for seven new Nova splicing targets (Fig. 1d and Supplemental Fig. 5).
While the majority of the Nova regulated sites conform to a general set of rules based on direct Nova binding, there are also clear exceptions. For example, Nova binds robustly but in atypical positions in several regulated transcripts (e.g. Brsk2, Rap1gap) (Fig. 2a-b; Supplemental Fig. 7). Such examples may point to new mechanisms of Nova action, which may include interactions with other RNABPs. For example, PTBP2 interacts with Nova and other RNABPs such as KSRP to modulate the outcome of alternative splicing32,33. In addition, RNA structure may regulate or be impacted by interaction with other factors, as suggested by the ability of the splicing factor MBNL1 to stabilize RNA hairpins34, from analysis of splicing defects in MAPT that underlie frontotemporal dementia with parkinsonism35, and from structural studies of competition between hnRNP F and PTB binding to the src transcript (F. Allain, personal communication).
The unbiased nature of HITS-CLIP led to the unexpected identification of Nova binding near poly(A) sites and the recognition of its role in regulating alternative polyadenylation in the brain. The presence of such tissue-specific factors was postulated after the recognition of differential polyadenylation of IgM heavy chain transcripts in B cells36 and of calcitonin/CGRP pre-mRNA in neurons37. Alternative poly(A) sites are present in ~50% of human genes38, and their regulation is believed to play an important role in tissue and developmental mRNA regulation39,40 and in human disease41. In particular, brain mRNAs appear to be preferentially processed at promoter-distal poly(A) sites to generate long 3′ UTRs42,43. Interestingly, in 9 of 12 instances examined (Fig. 4b) Nova promoted the production of mRNAs with long 3′ UTRs. Thus one important action of Nova may be to generate long 3′ UTRs in neurons, which may be subject to regulation by miRNAs or other RNABPs.
Numerous links have been made between pre-mRNA splicing and 3′-end processing44,45, such as the observation that the splicing factor sex-lethal can regulate polyadenylation by competing with CstF64 for RNA binding46. While we found two transcripts that had both Nova-dependent changes in splicing and polyadenylation, Nova can mediate splicing-independent alternative polyadenylation. For example, Nova suppresses the Slc8a1 pA2 site in an intronless transcript (Fig. 4), and Cugbp2 and Slc8a1 alternative polyadenylation was not coupled to alternative splicing in brain (unpublished data).
The Nova HITS-CLIP map offers insight into the mechanism of poly(A) site selection in the brain. Changes in the accessibility of core (e.g. CPSF and CstF) or auxiliary factors to interact with cis elements surrounding the poly(A) site underlie the regulation of alternative polyadenylation24,25,47,48. We find no evidence that Nova regulates transcripts encoding such factors (including subunits of CPSF, CSTF, CF-1 and CF-2). Instead, our data point to Nova as a trans-acting factor that binds YCAY elements flanking regulated poly(A) sites, and suggest that the position of Nova binding may determine whether it acts to promote or inhibit poly(A) site use (Fig. 4). For example, Nova CLIP tags overlap the canonical CPSF and/or CstF binding sites within 30nt of the Cugbp2 and Slc8a1 poly(A) sites, which are suppressed by Nova. In contrast, in transcripts where Nova enhances poly(A) site use, it binds to more distal elements, where it may antagonize the action of auxiliary factors. Therefore the position of Nova 3′ UTR binding may determine the outcome of poly(A) site selection in a manner analogous to its action on splicing regulation (Supplemental Fig. 1).
In summary, HITS-CLIP offers a powerful new platform for studying RNA regulation in vivo. This genome-wide biochemical approach complements bioinformatic, microarray and genetic studies. HITS-CLIP is able to identify biologically relevant interactions, providing a focus on direct protein-RNA contacts as critical points for understanding RNABP function. The unbiased nature of the platform holds the potential for new discovery, including the elucidation of preferred binding sequences and the identification of regulated RNA substrates. Identifying Nova as the first vertebrate factor to regulate alternative polyadenylation in mouse brain demonstrates that a single factor can regulate different aspects of tissue-specific RNA metabolism. Finally, the reproducible nature of HITS-CLIP suggests that it provides a robust platform to explore RNABP-dependent mechanisms of gene expression in complex and dynamic scenarios.
CLIP was performed on mouse Nova2 WT and KO (CD1) brains as described11. After PCR amplification, high throughput sequencing was performed (454 Life Sciences).
For analysis of Nova-dependent alternate splice and alternate 3′ UTR variants, a custom exon junction array (Affymetrix) or MoEx 1.0 ST Affymetrix Exon arrays, respectively, were used.
CLIP tags and clusters were analyzed with BED or WIG formatted custom tracks using the UCSC Genome Browser and Genome Graph tools (genome.ucsc.edu). Composite maps were generated by determining the distance between tags and closest splice sites within the alternative exon local region and converted to coordinates in a BED format custom track, with tags from each gene assigned different colors. MEME sequence analysis was done using tools available at meme.sdsc.edu. ASPIRE2 was based on ASPIRE20.
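Writing tags to a BED custom track with one color per gene, as described above, can be sketched as follows. The nine-column layout and the `itemRgb="On"` track attribute follow the UCSC BED format; the function name, the input tuple layout and the color table are assumptions for illustration, and the coordinates passed in are taken to be already converted to the composite map.

```python
def bed_track(tags, colors, name="composite_map"):
    """Build a BED9 custom track in which each gene's tags share one color.

    tags:   list of (chrom, start, end, gene, strand) tuples (hypothetical layout)
    colors: dict mapping gene name to an (R, G, B) tuple
    """
    lines = [f'track name="{name}" itemRgb="On"']
    for chrom, start, end, gene, strand in tags:
        r, g, b = colors[gene]
        fields = [chrom, start, end, gene, 0, strand,
                  start, end,  # thickStart, thickEnd: draw the whole tag thick
                  f"{r},{g},{b}"]
        lines.append("\t".join(str(f) for f in fields))
    return "\n".join(lines)
```

The resulting text can be pasted into the Genome Browser custom track form or saved to a file and uploaded.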
Biochemical assays were done using biologic triplicate sibling mice, unless otherwise noted. RPAIII kits from Ambion were used, and RT-PCR was done as described21-23, with modifications described in Methods; qPCR was done with a MyIQ BioRad thermal cycler and data analyzed as described in Methods. WT or mutant GFP alternative polyadenylation reporters were transfected into 293T cells in the presence or absence of pNova2 (described in Methods).
The authors are grateful to members of the Darnell and Ule laboratories and Joel Richter for critical discussions and review of the manuscript, Brad Friedman for suggesting the use of Exon Arrays, and Mayte Suarez-Farinas for statistically significant help. Supported by NIH R01 NS34389 (RBD) and the Howard Hughes Medical Institute. RBD is an HHMI Investigator.