microRNAs (miRNAs) are essential components of gene regulation, but identification of miRNA targets remains a major challenge. Most target prediction and discovery relies on perfect complementarity of the miRNA seed to the 3′ untranslated region (UTR). However, it is unclear to what extent miRNAs target sites without seed matches. Here, we performed a transcriptome-wide identification of the endogenous targets of a single miRNA—miR-155—in a genetically controlled manner. We found that approximately forty percent of miR-155-dependent Argonaute binding occurs at sites without perfect seed matches. The majority of these non-canonical sites feature extensive complementarity to the miRNA seed with one mismatch. These non-canonical sites confer regulation of gene expression albeit less potently than canonical sites. Thus, non-canonical miRNA binding sites are widespread, often contain seed-like motifs, and can regulate gene expression, generating a continuum of targeting and regulation.
miRNAs direct Argonaute (AGO) proteins to post-transcriptionally repress messenger RNA (mRNA) targets and regulate a broad range of physiological processes (Ambros, 2004; Bartel, 2004, 2009). While gain and loss of function studies have established specific roles of individual miRNAs, the identity of most miRNA targets remains unknown, which limits mechanistic insight into observed phenotypes. For instance, it is unclear whether individual miRNAs target multiple components of the same regulatory pathway, individual components of related pathways, or multiple unrelated pathways.
Early studies using both reporter assays and analysis of downregulated genes in miRNA overexpression experiments revealed that perfect complementarity of the 5′ end of the miRNA (i.e. the “seed” region at positions 2-7) to the 3′UTR of target RNAs is the most common determinant of target specificity (Brennecke et al., 2005; Doench and Sharp, 2004; Lewis et al., 2005; Lewis et al., 2003; Lim et al., 2005). However, many reports suggest regulation of sites without perfect seed complementarity (Betel et al., 2010; Brennecke et al., 2005; Didiano and Hobert, 2006; Lal et al., 2009; Lu et al., 2010; Shin et al., 2010; Vella et al., 2004; Vo et al., 2010). Recently, biochemical identification of AGO binding sites became possible with HITS-CLIP and PAR-CLIP techniques, which combine RNase treatment and AGO immunoprecipitation with high-throughput sequencing to identify sites of AGO binding across the transcriptome (Chi et al., 2009; Gottwein et al., 2011; Hafner et al., 2010; Leung et al., 2011; Skalsky et al., 2012; Zisoulis et al., 2010). In these studies, a sizable fraction of CLIP-identified AGO binding sites did not contain seed matches. However, it was unclear whether the majority of this apparently seedless targeting was caused by miRNA independent mechanisms (Leung et al., 2011) or non-canonical miRNA-target interactions. Thus, the mechanism and gene regulatory potential of these seedless interactions remained in question.
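The seed rule described above can be illustrated with a minimal sketch that scans a 3′UTR for perfect matches to miRNA positions 2-7. The function names, the toy UTR, and the first eight nucleotides of miR-155 used here are assumptions made for illustration (the sequence should be checked against miRBase); this is not the authors' pipeline.

```python
# Watson-Crick complements for RNA bases.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def target_motif(mirna, first, last):
    """Return the mRNA motif (5'->3') that pairs with miRNA positions
    first..last (1-based, inclusive), i.e. their reverse complement."""
    segment = mirna[first - 1:last]
    return "".join(COMPLEMENT[base] for base in reversed(segment))

def find_seed_matches(utr, mirna, first=2, last=7):
    """Return 0-based start positions of perfect seed matches in a UTR."""
    motif = target_motif(mirna, first, last)
    utr = utr.upper().replace("T", "U")  # accept DNA or RNA input
    hits = []
    i = utr.find(motif)
    while i != -1:
        hits.append(i)
        i = utr.find(motif, i + 1)  # step by 1 so overlapping hits are kept
    return hits

# Assumed 5' end of miR-155 (positions 1-8); verify against miRBase.
MIR155_5P_START = "UUAAUGCU"
# Hypothetical toy UTR containing two 6mer seed matches.
toy_utr = "CCGGAGCAUUAACCGGGCAUUAGG"
seed_hits = find_seed_matches(toy_utr, MIR155_5P_START)
```

Changing `first` and `last` yields the other canonical motif classes, for example positions 2-8 or 1-7.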
To address miRNA dependent targeting by AGO without reliance on sequence motifs, we combined genetic, biochemical, and computational approaches. We analyzed binding sites using differential HITS-CLIP (dCLIP) and mRNA expression changes in primary cells isolated from mice that are wild type or deficient for a single miRNA. We used miR-155 as a model for the following reasons: miR-155 knockout mice show a marked impairment in T and B cell function (Rodriguez et al., 2007; Thai et al., 2007; Vigorito et al., 2007); miR-155 is highly expressed in human malignancies (Eis et al., 2005; Kluiver et al., 2005; Metzler et al., 2004; Volinia et al., 2006) and its overexpression in pre-B cells (Costinean et al., 2006) or hematopoietic stem cells (O’Connell et al., 2008) leads to oncogenic transformation. Importantly, miR-155 is abundant in activated T cells whereas naïve wild type T cells are devoid of miR-155 and are, therefore, similar to miR-155-deficient counterparts. Thus, we employed AGO dCLIP to identify miR-155-dependent binding sites using wild type (WT) and miR-155-deficient (155KO) primary T cells. Our analysis of differential AGO binding confirmed that exact complementarity to nucleotides 2-7 of the miRNA is present in the majority of miR-155-dependent binding sites. We also found that perfect seed matches are absent in ~40% of miR-155-dependent AGO binding sites. These non-canonical sites, which are undetectable by seed-based miRNA target prediction algorithms, were strongly enriched for inexact seed matches, which contained a mismatch to the seed at a single nucleotide position. Furthermore, non-canonical miRNA binding sites regulate gene expression, albeit less potently than canonical sites.
In contrast to seed matches in the 3′UTR, differential gene expression analysis after miRNA perturbation indicated that, as a group, seed matches within coding regions have little gene regulatory activity (Grimson et al., 2007; Lim et al., 2005). Previous studies using CLIP techniques demonstrated extensive AGO binding to the coding region, although these sites were observed to mediate less regulation than 3′UTR sites (Chi et al., 2009; Gottwein et al., 2011; Hafner et al., 2010; Leung et al., 2011; Skalsky et al., 2012; Zisoulis et al., 2010). However, several groups reported features that lead to more regulation by sites in the coding region (Fang and Rajewsky, 2011; Forman et al., 2008; Gu et al., 2009; Schnall-Levin et al., 2011). For this reason, we investigated whether sites with stringently defined Argonaute binding to coding regions lead to functional gene regulation. Using AGO dCLIP we showed miR-155-dependent AGO binding to a subset of miR-155 seed matches within coding regions. However, these sites did not lead to detectable down-regulation of steady state mRNA levels, leaving open the possibility that these sites may have other functions.
Our AGO dCLIP analysis identified, in a genetically controlled manner, a transcriptome-wide set of direct targets for a miRNA in a specific cellular context and revealed the unexpected prevalence of non-canonical targeting by miRNAs.
Although barely detectable in naïve T cells, miR-155 was dramatically upregulated upon activation of CD4+ T cells induced by TCR and co-stimulatory receptor CD28 ligation (Supplementary Figure 1a). In order to reveal miR-155-dependent binding of AGO in primary T cells, we performed AGO dCLIP using activated CD4+ T cells isolated from WT and miR-155 KO C57Bl/6 mice. We UV-cross-linked protein-mRNA complexes in the activated T cells and followed with partial RNase digestion and stringent immunoprecipitation with an anti-Argonaute 2 antibody. The published AGO HITS-CLIP protocol (Chi et al., 2009) was modified to increase coverage and dynamic range and to reduce sequencing costs (see Experimental Procedures).
Libraries constructed from 12 biological replicates of activated 155KO and WT CD4+ T cells were subjected to high-throughput sequencing and the reads were uniquely mapped to the annotated 3′UTR sequences of the mouse genome (a total of 5,213,578 and 5,334,757 uniquely mapping reads from WT and 155KO libraries respectively) (Figure 1a; see Computational Methods). AGO binding sites were detected using a novel peak-calling algorithm based on edge detection (see Supplementary Methods). This algorithm was essential for identifying overlapping binding sites (Supplementary Figure 2b-d) and is effective for binding site detection in other data sets (Supplementary Figure 2e).
In T cells, we identified 14,634 reproducible AGO binding sites in the 3′UTRs of 4,165 genes (Supplementary Table 1 and http://cbio.mskcc.org/leslielab/clipseq). We limited our analysis to these sites, which contained reads mapped from at least seven of the 12 replicates in either WT or 155KO T cell libraries.
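The reproducibility criterion above (reads from at least seven of the 12 replicates in either genotype) can be sketched as a simple filter. The data layout, the function name, and the toy read counts below are hypothetical; only the seven-of-twelve threshold comes from the text.

```python
def is_reproducible(wt_counts, ko_counts, min_replicates=7):
    """Return True if a site has reads in at least `min_replicates`
    replicates of the WT libraries or of the 155KO libraries."""
    wt_support = sum(1 for count in wt_counts if count > 0)
    ko_support = sum(1 for count in ko_counts if count > 0)
    return wt_support >= min_replicates or ko_support >= min_replicates

# Hypothetical per-replicate read counts (12 WT, 12 155KO) for three sites.
toy_sites = {
    "siteA": ([3, 5, 0, 2, 4, 1, 6, 2, 0, 3, 1, 2], [0] * 12),        # WT only
    "siteB": ([1, 0, 0, 2, 0, 0, 1, 0, 0, 0, 0, 0], [0, 1] + [0] * 10),
    "siteC": ([2] * 12, [3] * 12),                                    # both
}
reproducible = sorted(
    name for name, (wt, ko) in toy_sites.items() if is_reproducible(wt, ko)
)
```

Evaluating each genotype separately, rather than pooling all 24 libraries, keeps sites that are bound only in WT cells, which is the signal of interest for miR-155-dependent binding.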
Additional analyses confirmed that the observed changes in AGO binding were a direct effect of miR-155 deficiency. First, principal component analysis revealed that the transcriptomes of 155KO and WT activated T cells were very similar (Supplementary Figure 1b). Second, AGO dCLIP libraries from 155KO cells, which included miRNAs, showed no changes in AGO bound miRNA levels other than miR-155 itself (Supplementary Figure 1c). Therefore, miR-155 loss did not cause major changes in the activation state, miRNA expression, or global transcriptome of activated T cells at the time point of analysis.
To validate AGO dCLIP, we examined 3′UTR AGO binding sites containing miR-155 seed matches. We expected these seed match sites to exhibit more AGO binding in WT cells than 155KO cells (Figure 1b). Indeed, for the vast majority of the known targets of miR-155 (Bolisetty et al., 2009; Costinean et al., 2009; Levati et al., 2011; O’Connell et al., 2009; O’Connell et al., 2008; Rai et al., 2010; Wang et al., 2011) we observed differential AGO binding at miR-155 seed matches. Using this approach, we identified many novel targets including PDL1, the ligand for the key inhibitory receptor PD1. Analysis of all AGO sites containing miR-155 seed matches showed that they had significantly reduced binding in 155KO cells compared to all AGO binding sites (p < 4×10⁻⁷⁴) (Figure 1c). The observed differential binding at canonical miR-155 targets confirms that AGO dCLIP detects miR-155-dependent binding sites.
We used a stringent statistical test to discern miR-155-dependent sites from other AGO binding sites. At 191 AGO binding sites in 175 genes there was significantly more AGO binding in WT than 155KO cells (cutoff at p < .01; see Computational Methods and Supplementary Table 2). We next investigated whether differential AGO binding correlated with greater target regulation of AGO binding sites containing miR-155 seed matches (57% of all miR-155-dependent sites). Specifically, we compared expression changes in 155KO T cells for genes with AGO binding at miR-155 seed matches against the subset of genes with significant differential AGO binding at miR-155 seed matches (Figure 1d). As expected, we found stronger regulation of target genes with differential AGO binding at miR-155 seed matches. To quantitatively examine whether differential AGO binding is associated with greater gene regulation at AGO bound miR-155 seed matches, we performed a linear regression analysis, which showed that differential AGO binding at sites with miR-155 seed matches correlated with downregulation (p<0.05) (Supplementary Figure 2f). Nearly all (>87%) AGO binding sites containing miR-155 seed matches had less binding in 155KO cells than in WT cells (using a normalized read count difference of 0). Furthermore, genes with AGO bound miR-155 seed matches were greatly induced relative to genes with miR-155 seed matches at which no AGO binding is detected (Figure 1d). Altogether, these data indicate that dCLIP improves the specificity of CLIP, yet for highly expressed miRNAs, presence of a miRNA seed match within an AGO binding site usually identifies a miRNA target.
In addition to the prevalent binding that occurs in the 3′UTR, we observed reads mapping to other transcribed regions including the 5′UTR, coding regions, and the introns of genes (Figure 2a). Closer inspection revealed that most intronic reads were unclustered or mapped to snoRNAs. AGO binding to snoRNAs was previously described (Ender et al., 2008; Saraiya and Wang, 2008; Taft et al., 2009), and our experiments provide further data concerning AGO binding to transcripts generated by snoRNA loci in mammals.
To determine whether AGO bound coding region sites in a miRNA dependent fashion, we examined sites with miR-155 seed matches. Examining binding sites observed in at least seven of twelve replicates, we observed 137 AGO binding sites containing miR-155 seed matches in the coding regions of 129 genes. Many of these sites had clear miR-155-dependent binding (Figure 2b). Indeed, AGO binding at miR-155 seed matches in the coding region and the 3′UTR was similarly miR-155-dependent (Figure 2c), although there were almost three times as many sites in the 3′UTR. This miR-155 dependence confirms that AGO-miRNA complexes bind to the coding region.
Rare codons 5′ of miRNA binding sites in the coding region contribute to more effective miRNA-mediated regulation (Gu et al., 2009). Consistent with this observation, PAR-CLIP revealed that AGO binding occurs at sites flanked by sequences enriched for rare codons (Hafner et al., 2010). Our data confirm this observation (Supplementary Figure 3). Surprisingly, as was seen in the previous study of AGO binding, rare codons appeared to be more common both 5′ and 3′ of AGO binding sites.
We applied the same stringent statistical test for miR-155 dependence that we applied to 3′UTR sites. This identified 20 genes with sites in the coding region that contained a miR-155 seed and displayed significant miR-155 dependence. However, in contrast to 3′ UTR site-containing mRNAs, these mRNAs were not regulated by miR-155 (Figure 2d). Therefore, although miR-155 AGO complexes bound the coding region, this binding was not sufficient for transcript regulation.
The strength of AGO dCLIP is the ability to identify miRNA dependent binding sites without reliance on seed matches. We found that 34% of 3′UTR AGO binding sites did not have a seed match for any of the 40 highest expressed miRNAs in our cells (Figure 3a and Supplementary Table 3) in agreement with previous reports (Chi et al., 2009; Hafner et al., 2010; Zisoulis et al., 2010). These 40 miRNAs represented more than 75% of the reads mapped to miRNA genes in libraries isolated from WT and 155KO cells (Supplementary Figure 4). Of these miRNAs, the lowest expressed miRNA had less than 1% of the reads of the highest expressed miRNA. Accounting for AGO binding sites using this many seeds likely overestimates the amount of seed-mediated targeting.
Recent work in ES cells suggested that there are miRNA-independent G rich motifs associated with AGO binding sites (Leung et al., 2011); however, these motifs were found in less than 2 percent of AGO binding sites identified in activated T cells. These results suggested that “non-canonical” miRNA dependent targeting is common.
We next focused on miR-155-dependent AGO binding sites in 3′UTRs. While the majority of these sites contained miR-155 seed matches, ~43% did not (Figure 3a). Increasing the stringency for miR-155 dependence (P < .005) did not substantially alter this percentage of seedless sites (~37%). These candidate non-canonical sites without a miR-155 seed, but with significantly more AGO binding in WT vs. 155KO T cells, were found within genes with (e.g., Hif1a and Trib1) and without (e.g., Unc119b, Cep135, Gimap3) miR-155 canonical sites (Figure 3b,c).
To determine whether miR-155 binds these non-canonical sites directly, we first looked for complementarity to the miR-155 sequence. We implemented a supervised learning strategy to optimize parameters for miR-155/AGO binding site alignments to discriminate between miR-155-dependent binding sites and -independent AGO binding sites (see Supplemental Methods). The optimized parameters rely on extensive complementarity to the seed region of miR-155 to discriminate more than 75% of miR-155-dependent binding sites from non-differential AGO binding sites (average of 10 fold CV; see Supplemental Methods). To identify global patterns in the alignments between miR-155 and miR-155-dependent AGO binding sites, we grouped the alignments based on complementarity to the seed region. The most frequent motifs were represented by sequences with exact complementarity to nucleotides 1-7 (for miR-155 this includes an A at position 1) and nucleotides 1-8. This group of motifs was followed by sequences with complementarity to nucleotides 2-8, and with complementarity to nucleotides 2-7 (Figure 4a). However, in addition to these expected canonical motifs, we found several types of seed-like motifs not previously described at a transcriptome-wide level. These included motifs with inexact complementarity to the miR-155 seed such as mismatches at positions 5 and 7 and a G:U wobble at position 6 (Figure 4a). The observed complementarity for miR-155 within non-canonical AGO binding sites implied that AGO dCLIP specifically identified sites of direct miR-155 targeting.
We next investigated whether these seed-like motifs were able to mediate in vivo interactions with miR-155 loaded AGO complexes. Previous work showed that a small percentage of HITS-CLIP reads contain deletions, which are likely a consequence of reverse transcription errors at sites of UV-induced protein-RNA cross-linking (Zhang and Darnell, 2011). Because UV-mediated cross-linking was performed on live cells, deletions within binding sites are evidence of in vivo interactions. We mapped deletions surrounding sites of miR-155 canonical motifs and found a large number of deletions proximal to miR-155 seed matches in libraries generated from WT but not 155KO T cells (Figure 4b). As expected, deletions proximal to miR-21 seeds were found equally in WT and 155KO libraries. To assess frequencies of deletions at non-canonical miR-155-dependent AGO binding sites, we used the seed-like motifs identified by our supervised learning algorithm. We found many deletions in WT but not 155KO libraries near seed-like motifs in non-canonical sites. This result indicated that our sequence alignment strategy effectively identified miR-155 binding within AGO bound regions and demonstrated that the interaction between miR-155 and non-canonical sites occurs in vivo.
Further insights into non-canonical sites came from focused analysis of seed-less miR-155-dependent binding sites. These sites exhibited a strong enrichment for seed region motifs that contained a single mismatch from the miRNA seed (up to p < 10⁻¹⁰ for the inexact 7mer from nucleotides 1-7) (Figure 4c). A subtle enrichment (p < 0.05) for inexact matches to the 3′ end of the miRNA was also apparent. We examined whether non-canonical sites contained sequences complementary to the 3′ end of the miRNA by focusing on sequences 5′ of motifs with a single mismatch from a miR-155 seed, where supplementary 3′ binding is expected to occur. These sequences showed strong enrichment for complementarity to the 3′ end of miR-155 (p < 0.03 to p < 10⁻⁵ for inexact 6mer and 7mer motifs starting at nucleotides 12-18) (Figure 4d); this enrichment was similar to that seen for canonical sites (Supplementary Figure 5). These results support similar binding for non-canonical and canonical sites with complementarity to the 5′ end of the miRNA and frequent supplementary pairing to the 3′ end of the miRNA.
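As an illustration of what a single-mismatch seed motif means in practice, the following sketch slides a window across a binding site and reports windows that differ from the perfect motif at exactly one position. It is a simplification and not the supervised alignment procedure used in this study: any non-identical base, including a G:U wobble, is counted as a mismatch. The motif shown is the complement of the assumed first seven nucleotides of miR-155 (UUAAUGC), and the toy site is hypothetical.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))

def one_mismatch_hits(site, motif):
    """Return (start, window) pairs where the window differs from the
    perfect motif at exactly one position (perfect matches are excluded)."""
    site = site.upper().replace("T", "U")
    hits = []
    for start in range(len(site) - len(motif) + 1):
        window = site[start:start + len(motif)]
        if hamming(window, motif) == 1:
            hits.append((start, window))
    return hits

# Assumed mRNA motif pairing with miR-155 positions 1-7; verify the
# miRNA sequence against miRBase before relying on it.
MOTIF_1_7 = "GCAUUAA"
# Hypothetical binding-site sequence carrying one substitution.
toy_site = "CCGCAUCAAGG"
inexact_hits = one_mismatch_hits(toy_site, MOTIF_1_7)
```

Because the motif is written 5′ to 3′ on the mRNA, index j in a window pairs with miRNA position 7 − j for this 7mer.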
Since canonical miRNA target sites are flanked with sequences containing a high AU nucleotide content (Grimson et al., 2007; Nielsen et al., 2007), we examined AU content in non-canonical sites (Figure 4e). Consistent with previous studies, we found that sequences flanking AGO-bound miR-155 seed matches displayed higher AU content than unbound miR-155 seed matches and 3′UTRs. Regions around non-canonical sites had higher AU content than 3′UTR sequences or unbound miR-155 sites and in this regard were indistinguishable from AGO bound canonical miR-155 sites.
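The flanking AU content comparison above can be sketched as follows. The flank length, function names, and toy sequence are assumptions chosen for illustration, since the window size used in the analysis is not stated here.

```python
def au_fraction(seq):
    """Fraction of A or U (T is treated as U) in a sequence."""
    seq = seq.upper().replace("T", "U")
    if not seq:
        return 0.0
    return sum(base in "AU" for base in seq) / len(seq)

def flanking_au(utr, start, end, flank=30):
    """AU fraction of the sequence flanking a site on both sides.

    `start` and `end` are 0-based, end-exclusive coordinates of the site;
    `flank` (an assumed value) is the number of nucleotides taken per side.
    """
    upstream = utr[max(0, start - flank):start]
    downstream = utr[end:end + flank]
    return au_fraction(upstream + downstream)

# Hypothetical UTR: AU-rich upstream flank, seed match, GC-rich downstream.
toy_utr = "AAUU" + "GCAUUA" + "GGCC"
toy_flank_au = flanking_au(toy_utr, start=4, end=10, flank=4)
```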
Our results showed that miR-155 facilitated AGO binding at sites with inexact complementarity to the miR-155 seed. To assess the effect of these seed-like non-canonical sites on gene regulation, we examined targets with a single miR-155 dependent non-canonical site containing a motif with up to one mismatch from positions 1-6 or 2-7 of the miR-155 seed. To conservatively estimate the effect of these sites on gene expression we removed genes with a miR-155 seed match in the 3′UTR. This set of genes was significantly regulated by miR-155 in activated T cells (p < 10⁻⁴) (Figure 5a). However, these non-canonical targets were less regulated than genes containing a single canonical site and transcripts were not strongly regulated (>2 fold) by a non-canonical target site alone. The regulation of non-canonical targets was confirmed by comparing gene expression of these predicted non-canonical targets to the expression of genes associated with randomly chosen AGO bound sites; this permutation test was also significant (p < 0.05). Together, these data suggest that non-canonical targeting alone can lead to modest regulation of gene expression.
To test whether miR-155 directly regulates gene expression through binding to non-canonical sites we evaluated the repression of many predicted non-canonical targets using luciferase reporter assays. We generated luciferase reporters containing target sites in their native 3′UTR context (at least 450 nt of the 3′UTR was used) and examined repression by miR-155 overexpressed in HEK293T cells (Figure 5b). Most non-canonical targets were clearly regulated by miR-155. One 3′UTR (Tgm2) with an inexact seed, but without miR-155 dependent AGO binding, was also tested and showed no miR-155 dependent repression. Consistent with the moderate gene expression changes revealed by our global analysis, the non-canonical sites conferred less repression than the canonical target site in the 3′ UTR of the Socs1 gene.
To test whether miR-155 mediated regulation depends on the non-canonical targets sites predicted by AGO dCLIP, we mutated these sites in a subset of luciferase reporter constructs (Figure 5c and Supplementary Figure 9). In most cases we made mutations at four nucleotides in regions with complementarity to the miRNA seed (Figure 3b). In one case, where we predicted contribution of 3′ interactions to miR-155 binding, we also mutated the corresponding 3′ site (Figure 3b). Mutations of the sites significantly reduced repression by miR-155. These experiments demonstrated that miR-155 binding to inexact seed sites can downregulate gene expression.
In addition to a genetically controlled transcriptome-wide identification of miR-155 targets, the AGO dCLIP dataset enables analysis of target interactions with other miRNAs in activated T cells (see Supplementary Table 1). Thus, we investigated whether our datasets were predictive of canonical and non-canonical gene regulation by other endogenous miRNAs. Like miR-155, the miR-17~92 cluster is oncogenic (He et al., 2005), overexpressed in human malignancies (He et al., 2005; Volinia et al., 2006), and regulates differentiation and survival of T and B cells (Jiang et al., 2011; Ventura et al., 2008; Xiao et al., 2008). We used previously published differential gene expression data from B cell lymphoma cells expressing or lacking the miR-17~92 cluster (Mu et al., 2009). Using this dataset, we improved upon seed based predictions by filtering for miR-17~92 seed matches in AGO binding sites (Figure 5d), demonstrating the utility of our dataset for canonical miRNA target identification for other miRNAs and in other cellular contexts. We also observed moderate induction of genes with inexact complementarity to the miR-17~92 seed (positions 1-6 or 2-7) and exact complementarity to positions 15-20 within AGO binding sites. Importantly, these results were unaffected by exclusion of genes containing miR-17~92 seed matches anywhere in the 3′UTR (Figure 5d). The success of our analysis of miR-17~92 regulation of canonical and non-canonical sites was particularly remarkable considering that we used AGO binding sites identified in activated T cells and the miR-17~92-dependent gene expression changes were measured in transformed B cells.
Since our dataset provides the most complete and specific list of miRNA targets generated in the immune system, we checked whether targets of individual miRNAs present in activated T cells were enriched for specific immunological functions. Using GO enrichment analysis, we found multiple functions enriched within the targets of miRNAs expressed in our data set (Table 1). Several functions predicted by this analysis have been confirmed; for example, miR-146a is necessary for preventing a T-helper 1 cell-driven autoimmunity (Lu et al., 2010) and miR-17~92 over-expression leads to dysregulated T cell expansion and autoimmunity (Xiao et al., 2008). This analysis would fail unless miRNAs exert their effect through multiple targets. It therefore supports the idea that miRNAs are acting on multiple targets to affect a given function.
Based on GO analysis, which implicated miR-155 in lymphocyte homeostasis, we hypothesized that miR-155 might also contribute to expansion of other T cells. We examined expansion of WT and 155KO T cells co-transferred into lymphopenic mice and found that 155KO T cells were outcompeted by WT counterparts (data not shown). This observation was in agreement with an established role for miR-155 in homeostasis of regulatory T cells, a specialized T cell lineage that expresses high levels of miR-155 (Lu et al., 2009). Our analysis demonstrated the power of high-throughput miRNA target discovery for understanding miRNA dependent phenotypes.
Since we identified both the canonical and non-canonical targets of miR-155, we also explored the relationships between these targets (Supplementary Figure 7). We identified several gene networks in which multiple components are targeted by miR-155. Together, these data support the hypothesis that miRNAs exert biological function through targeting of multiple functionally related genes.
The analysis of miR-155-dependent AGO binding sites in activated primary T cells provides a transcriptome-wide perspective of a miRNA’s targets in an endogenous context. miR-155-AGO complexes are bound to over 300 canonical sites containing a perfect 6-8mer seed in the 3′UTR and this group of genes is strongly regulated by miR-155.
No miR-155 seeds were observed in ~40% of miR-155 dependent AGO binding sites and the majority of these sites contain inexact seed matches, which are associated with weaker regulation. Several rules for non-canonical sites have been described, notably for bulge sites and centered sites (Chi et al., 2012; Shin et al., 2010). Surprisingly neither bulge sites nor centered sites were found among the miR-155 non-canonical sites. While centered sites are potent regulators of gene expression, they are relatively rare, with an average of several functional sites per miRNA (Shin et al., 2010)—so it was not surprising that they were not observed in our data set. In contrast, bulge sites are relatively common, accounting for up to one quarter of all targets for some miRNAs (Chi et al., 2012); however, miR-155 appears to lack binding to this type of site.
Despite the modest level of regulation associated with non-canonical sites, these sites could play important biological roles. First, modest regulation of many targets may lead to important biological consequences, although there are not yet experimental systems for studying the effects of numerous moderate perturbations in gene expression. In addition, combinations of canonical and non-canonical sites may afford a wide spectrum of regulation of gene expression. Second, recent work suggested that miRNA binding sites on endogenous transcripts can compete for miRNA-AGO complexes and thereby upregulate other genes (Cesana et al., 2011; Jeyapalan et al., 2011; Karreth et al., 2011; Lee et al., 2009; Poliseno et al., 2010; Sumazin et al., 2011; Tay et al., 2011). Third, non-canonical sites may serve as an evolutionary mid-point for stronger canonical miRNA targeting. We therefore examined whether orthologous human 3′UTR sequences of our predicted non-canonical targets were enriched for miR-155 seed matches. We found a modest enrichment (P<.05; hypergeometric test) for canonical motifs in human 3′UTRs, suggesting that non-canonical targets may be an evolutionary mid-point. We also examined conservation using alignments of multiple species (Supplementary Figure 8), which again suggested modest conservation of non-canonical sites.
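For readers unfamiliar with the hypergeometric enrichment test mentioned above, a minimal sketch of the one-sided tail probability is given below using only the Python standard library. The function name and all counts in the example are hypothetical and are not the values from this study.

```python
from math import comb

def hypergeom_enrichment_p(k, population, successes, draws):
    """One-sided hypergeometric P(X >= k).

    population: total number of genes considered
    successes:  genes in the population carrying the feature
                (for example, a seed match in the human 3'UTR)
    draws:      size of the gene set being tested
    k:          genes in the tested set that carry the feature
    """
    upper = min(successes, draws)
    total = comb(population, draws)
    tail = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return tail / total

# Hypothetical counts: 10 genes, 5 with the feature, a tested set of 5
# genes of which all 5 carry the feature.
toy_p = hypergeom_enrichment_p(k=5, population=10, successes=5, draws=5)
```

A small p-value indicates that the tested set contains more feature-bearing genes than expected when drawing the same number of genes at random from the population.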
As mentioned above, recent work suggested a model in which RNAs compete for miRNA-AGO complexes and reduce repression of other RNAs bearing sites for the same miRNAs (Cesana et al., 2011; Jeyapalan et al., 2011; Karreth et al., 2011; Lee et al., 2009; Poliseno et al., 2010; Sumazin et al., 2011; Tay et al., 2011). In this study, we found greater than 500 binding sites for miR-155 in a defined T cell lineage. The most bound site, which occurs in a mitochondrial genome encoded RNA, accounts for ~20% of miR-155 dependent AGO binding. The most bound site transcribed from the nuclear genome accounts for less than 5% of miR-155 dependent AGO binding and only 8 sites can individually account for >1% of miR-155 dependent AGO binding. This suggests that at endogenous transcript levels very few transcripts bind a given miRNA-AGO complex at high enough levels to substantially influence the amount of free complex. Although data supporting competing endogenous RNAs focused on transcripts that share multiple miRNA binding sites, it seems unlikely that a transcript which altered the free pool of multiple miRNAs by <1%, would significantly affect regulation of other targets. Therefore, we propose that if the competing endogenous transcript hypothesis is correct, 1) very few endogenous transcripts bind sufficient miRNA-AGO complexes to alter targeting of other transcripts, or 2) there is an as yet undescribed mechanism by which specific subsets of targets compete for restricted pools of miRNA-AGO complexes.
In summary, AGO dCLIP revealed that non-canonical AGO binding sites are a significant component of miRNA targeting, although they generally exert mild effects on gene regulation. We expect differential CLIP-based techniques will facilitate further identification of functionally relevant non-canonical miRNA targets. It appears that similar to variations in transcription factor binding motifs, there is a spectrum of sequence motifs bound by RISC complexes loaded with a given miRNA and that these variations to the binding sites afford a continuum of miRNA-dependent regulation of gene expression.
CD4+ T cells were harvested from WT and 155KO mice described elsewhere (Thai et al., 2007). T cells were activated by culturing in the presence of CD3 and CD28 antibodies for 4 days at 37°C, 5% CO2. 12 HITS-CLIP libraries were constructed for both the activated WT and 155KO CD4+ cells, 6 with each 3′ linker sequence, which enabled statistical modeling of linker-induced biases. The libraries were constructed as described (Chi et al., 2009) with the following modifications. First, AGO immunoprecipitation was performed with a polyclonal antibody generated against an Argonaute 2 N-terminal peptide (O’Carroll et al., 2007). Second, to reduce contamination between libraries, two different 3′ linker sequences were used and reverse transcription (RT) was performed with primers that distinguish between the 3′ linkers. This permitted selective amplification of libraries. Third, the RT primers contained index tags, which enabled multiplex sequencing of libraries in a single sequencing lane. Fourth, to improve library complexity, the entire cDNA pool from each replicate was amplified in the presence of SYBR green dye using a real-time thermocycler so that amplification could be stopped near the end of the linear phase. Finally, random barcodes in the RT primer were used to distinguish between more frequently cloned regions and fragments preferentially amplified by PCR. We constructed libraries with barcodes six nucleotides in length to extend the previous dynamic range of the assay. Sequencing was done on Illumina GA2x as SE36 reads with an 11nt index read to sequence the index and degenerate barcode.
Cells were isolated and cultured as discussed above for 3 days. Cells were resuspended in Trizol, and RNA was isolated according to manufacturer instructions. cDNA libraries were amplified and hybridized to Affymetrix MOE 430A 2.0 chips.
We constructed a 3′UTR database for the mouse genome by identifying the longest annotated 3′UTR for each gene in RefSeq or Ensembl. The annotated coding region sequence (CDS) of each gene was selected from the same transcript. We filtered reads for base quality scores >20 from the 3′ end and mapped them uniquely to the mouse genome, allowing up to 1 mismatch. Reads that mapped to the same genomic position and contained the same barcode were collapsed to a single read to reduce PCR bias. We also weighted read counts by normalizing to the library size aligned to the 3′UTR database. mRNA expression data were mean centered and unit normalized.
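The duplicate-collapsing and library-size normalization steps described above can be sketched as follows. The read representation, function names, and the per-million scaling factor are assumptions for illustration; the text specifies only that reads sharing a genomic position and barcode were collapsed and that counts were normalized to library size. With six-nucleotide random barcodes, up to 4^6 = 4,096 distinct reads can be distinguished at a single position.

```python
from collections import Counter

def collapse_duplicates(reads):
    """Keep one read per (chromosome, strand, position, barcode) tuple,
    treating identical tuples as PCR duplicates."""
    return sorted(set(reads))

def normalized_position_counts(reads, library_size, per=1_000_000):
    """Count collapsed reads per position and scale by library size.

    `per` is an assumed scaling constant (reads per million)."""
    counts = Counter(
        (chrom, strand, pos) for chrom, strand, pos, _barcode in reads
    )
    return {site: n * per / library_size for site, n in counts.items()}

# Hypothetical reads: (chromosome, strand, position, random barcode).
toy_reads = [
    ("chr1", "+", 100, "ACGTAC"),
    ("chr1", "+", 100, "ACGTAC"),  # same position and barcode: duplicate
    ("chr1", "+", 100, "TTGCAA"),  # same position, different barcode: kept
    ("chr2", "-", 500, "ACGTAC"),
]
unique_reads = collapse_duplicates(toy_reads)
position_cpm = normalized_position_counts(
    unique_reads, library_size=len(unique_reads)
)
```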
HEK293T cells were cultured at 7×10⁴ cells/well in a 24-well plate 1 day prior to transfection. psiCheck-2 vector (Promega) containing 3′UTR regions were cotransfected with miR-155-expressing or control miR-146a-expressing pMDH-PGK-EGFP plasmids with Fugene 6 (Roche) in duplicate. Cells were harvested 18 hours later and luciferase activity was assayed using the Dual-Luciferase Reporter Assay System (Promega). Renilla Luciferase (bearing the cloned 3′UTR) activity was normalized to Firefly Luciferase activity. Results from duplicate wells were averaged and multiple (n≥4) independent experiments were pooled. Differences in repression across wild type constructs were assessed using a one-way ANOVA followed by pairwise tests between the Tgm2 negative control and targets, adjusting for multiple comparisons using Dunnett’s criterion. Differences in repression between mutant and wild type constructs were assessed using paired t tests (Prism, GraphPad Software).
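The reporter normalization described above can be sketched as below. Expressing repression as the normalized signal with miR-155 divided by the normalized signal with the control miRNA is an assumption made for this illustration, as are the function names and the luminescence values.

```python
from statistics import mean

def normalized_signal(wells):
    """Average Renilla/Firefly ratio across duplicate wells.

    wells: list of (renilla, firefly) luminescence readings."""
    return mean(renilla / firefly for renilla, firefly in wells)

def relative_expression(mir155_wells, control_wells):
    """Normalized reporter signal with miR-155 relative to the control
    miRNA (assumed definition); values below 1 indicate repression."""
    return normalized_signal(mir155_wells) / normalized_signal(control_wells)

# Hypothetical readings from one experiment (duplicate wells).
toy_mir155 = [(50.0, 100.0), (60.0, 100.0)]
toy_control = [(100.0, 100.0), (120.0, 100.0)]
toy_ratio = relative_expression(toy_mir155, toy_control)
```

In this toy example the reporter signal with miR-155 is about half of the control signal.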
We thank E.B. Conn Gantman and A. Mele for help with HITS-CLIP, C. Lee for assistance with sequencing, A. Tarakhovsky for the AGO2 antibody, and L-F. Lu and A. Arvey for helpful discussions. This work was supported by NIH MSTP grant GM07739 (G.B.L.), and NIH grant R37 AI034206 (A.Y.R.). A.Y.R and R.B.D are investigators with the Howard Hughes Medical Institute.
Accession Numbers # XXXXXXX
Accessible binding maps for all 3′UTRs are located at http://cbio.mskcc.org/leslielab/clipseq