Using a direct miRNA cloning strategy we previously identified fourteen marsupial- or species-specific microRNAs in the marsupial species Monodelphis domestica. In the present study we examined each of the pre-miRNAs and their flanking sequences and demonstrate that half of these miRNAs evolved from marsupial-specific transposable elements. These findings reinforce the view that transposable elements are a previously unappreciated source of new, lineage-specific microRNAs.
MicroRNAs (miRNAs) are a class of small (21–24 nt) regulatory RNAs whose impact on both comparative and functional genomics is only just being felt. Since miRNAs were first discovered in 1993 (Lee et al., 1993; Wightman et al., 1993), the number known in animal, plant, and viral genomes has grown into the thousands (Griffiths-Jones et al., 2008), and knowledge regarding their roles in regulating cellular processes, including virtually all aspects of development, differentiation, and even cell death, continues to expand rapidly (Bartel, 2009; Kim et al., 2009). However, many questions regarding the origin and evolution of miRNAs remain unanswered. For example, while it is becoming clear that miRNAs arise from a variety of sources, including inverted duplications (Voinnet, 2004; Piriyapongsa and Jordan, 2007), pseudogenes (Devor, 2006), and transposable elements (TEs) (Piriyapongsa et al., 2007), the relative importance of these sources in generating new miRNA loci remains unclear. In this paper we focus on data suggesting that the last of these sources may be more important than previously recognized.
Several studies (e.g., Smalheiser and Torvik, 2005, 2006; Borchert et al., 2006; Piriyapongsa et al., 2007; Piriyapongsa and Jordan, 2008) have shown that miRNAs can and do evolve from TEs and it has been suggested that TEs are an unappreciated source of new, lineage-specific miRNAs (Lanier et al., in preparation). We carried out an extensive in vitro and in silico survey of the miRNAome of the marsupial mammal Monodelphis domestica (Devor and Samollow, 2008; Devor and Peek, 2008). In addition to cloning and mapping 174 miRNAs that are conserved throughout the Mammalia, we identified fourteen miRNAs that are unique to marsupials including three miRNAs that are unique to M. domestica. Here, we show that half of these lineage-specific miRNAs contain TE signatures and that these signatures are themselves specific to marsupials. Thus, we provide further evidence supporting the view that TEs are a source of lineage-specific miRNAs.
The initial in vitro and in silico survey of M. domestica miRNAs was carried out by direct miRNA cloning from six tissues (brain, heart, lung, liver, kidney, and testes) using the miRCat™ small RNA cloning method (Integrated DNA Technologies) coupled with queries of the completed M. domestica genome assembly MonDom5 (Ensembl; Mikkelsen et al., 2007). Marsupial-specific miRNAs were all found by in vitro cloning and were validated using MonDom5 to obtain sequences flanking the mature miRNA, the RNA folding package mFOLD (Zuker, 2003) to confirm the presence of a thermodynamically stable pre-miRNA hairpin structure, and miRBase (Griffiths-Jones et al., 2008) to confirm that the pre-miRNA had not been seen in any other species.
The search for TE signatures in these marsupial-specific miRNAs was carried out by submitting the pre-miRNAs, plus 3 kb of upstream and 3 kb of downstream flanking sequence obtained from the MonDom5 genome assembly, to both RepeatMasker (Smit et al., 1996–2004) and CENSOR (Jurka et al., 2005; Kohany et al., 2006). The TE identifications provided by these two sources were then used to annotate the immediate genomic region around each of the miRNAs. Very short nucleotide runs in an miRNA precursor that matched a TE fragment were not to be counted as legitimate TE signatures. However, no such short fragments were seen: the shortest masked span accounted for more than 40% of its precursor.
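The masking criterion described above can be sketched as a small interval calculation. This is only an illustration under stated assumptions, not the procedure RepeatMasker or CENSOR use internally: the function names, the half-open coordinate convention, and the example coordinates are all hypothetical, and the TE hits are assumed to have already been parsed from the annotation output into (start, end) pairs.

```python
def flank_window(precursor, flank=3000):
    """Return the (start, end) window covering a precursor plus flanks.

    `precursor` is a (start, end) pair in half-open coordinates (an
    assumption of this sketch). The default flank of 3000 bases mirrors
    the 3 kb upstream and downstream described in the text.
    """
    p_start, p_end = precursor
    return max(0, p_start - flank), p_end + flank


def masked_fraction(precursor, te_hits):
    """Return the fraction of the precursor covered by TE annotations.

    `te_hits` is an iterable of (start, end) half-open intervals.
    Overlapping hits are merged so that no base is counted twice.
    """
    p_start, p_end = precursor
    if p_end <= p_start:
        raise ValueError("precursor must have positive length")

    # Keep only hits that overlap the precursor, clipped to its bounds.
    clipped = sorted(
        (max(s, p_start), min(e, p_end))
        for s, e in te_hits
        if s < p_end and e > p_start
    )

    covered = 0
    cur_start = cur_end = None
    for s, e in clipped:
        if cur_end is None or s > cur_end:
            # Start a new merged block; bank the previous one first.
            if cur_end is not None:
                covered += cur_end - cur_start
            cur_start, cur_end = s, e
        else:
            # Extend the current merged block.
            cur_end = max(cur_end, e)
    if cur_end is not None:
        covered += cur_end - cur_start

    return covered / (p_end - p_start)


if __name__ == "__main__":
    # Hypothetical 80-base precursor whose 3' half lies inside a TE hit.
    precursor = (3000, 3080)
    hits = [(3040, 4200)]
    print(flank_window(precursor))
    print(masked_fraction(precursor, hits))
```

A precursor whose masked fraction is 1.0 would correspond to "completely masked" in the sense used here, and an intermediate value to "partially masked"; any cutoff for discarding very short matches would be a separate, user-chosen parameter.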
Descriptions of the fourteen marsupial-specific miRNAs are provided in Table 1. Twelve of these fourteen miRNAs have been assigned miRBase identification numbers while two await assignment. Eleven of the fourteen miRNAs map to intergenic regions of the M. domestica genome while two others, Mdo-miR-1547 and Mdo-miR-1548, map to introns of predicted protein coding genes. The unassigned Mdo-301 microRNA is located in the 5′ UTR of the TCF12 transcription factor locus. Eleven of the fourteen miRNAs have been identified by us in the genome of the tammar wallaby (Macropus eugenii) whereas three, Mdo-miR-1541, Mdo-miR-1544, and Mdo-miR-1545, have so far been found only in the M. domestica genome (Table 1). Importantly, seven of the fourteen miRNAs are completely masked (Mdo-miR-1542-1, Mdo-miR-1542-2, Mdo-miR-1544, Mdo-miR-1545, and Mdo-miR-340) or partially masked (Mdo-miR-1546 and Mdo-miR-1547) by marsupial-specific TE sequence signatures (Table 1). Six of these TE sequence signatures were identified as LINEs and one was identified as a Mariner DNA transposon.
The three microRNAs that appear so far to be unique to M. domestica represent extremes of TE annotations. Mdo-miR-1541 is located in the q-terminal region of chromosome 4 and lies in a sequence extending nearly 2 kb in each direction for which there is no evidence of TE signatures. In contrast, Mdo-miR-1544 and Mdo-miR-1545 are members of a cluster of 39 microRNAs spanning a 100 kb region of the X-chromosome that we have previously shown to form two distinct clades descended from a single common ancestor sequence via a series of duplication events (Fig. 1A and Devor and Samollow, 2008). When the entire 103,000 bp region was searched for TE signatures, nearly half (18 of 39) of the precursors were completely or partially masked, including the two cloned members that we have designated Mdo-miR-1544a and Mdo-miR-1545a (Fig. 1B). All 39 members of this cluster, whether masked or not, are flanked upstream by the same 5′ UTR remnants of a marsupial L1 TE and downstream by another marsupial L1 5′ UTR remnant in the opposite orientation. Similar juxtapositions of LINEs and other transposed elements have been shown to produce microRNAs in primate genomes (Smalheiser and Torvik, 2005). The common ancestor of both the Mdo-miR-1544 and Mdo-miR-1545 clades thus appears to have evolved from the opposition of two marsupial L1 transposed elements, and the entire cluster then diversified via a series of duplications, with many of the individual members still retaining sufficient ancestral sequence identity to be masked.
A second M. domestica locus in which one microRNA is a duplicate of another is the Mdo-miR-1542-1/Mdo-miR-1542-2 pair located on chromosome 8. Annotation of the sequence containing these two miRNAs showed a tandem arrangement of six copies of a fragment of the 5′ UTR of a marsupial L2 transposed element. The two microRNAs, separated by 124 bp, are found completely within this set of tandem duplications (Fig. 2). Various alignments of the L2 remnant duplications indicate that miR-1542-2 evolved first from TE fragments and that miR-1542-1 and its immediate flanking region arose by a subsequent tandem duplication event. The two miRNAs have the same mature sequence but differ at nine of the remaining 68 positions of the precursor resulting in slightly different hairpin structures (Fig. 2). This locus is almost completely conserved in the M. eugenii (tammar wallaby) genome. There is a single-base difference between the two species in miR-1542-1 and no differences in miR-1542-2. The immediate flanking and intervening masked sequences are 83% identical.
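Comparisons of this kind (differences between two precursors, or percent identity between orthologous regions) reduce to counting mismatched columns in a pairwise alignment. The following is a minimal sketch, assuming the two sequences have already been aligned to equal length by some external tool; the function names and the toy sequences are hypothetical and are not the actual miR-1542 precursors.

```python
def count_differences(seq_a, seq_b):
    """Count mismatched columns between two pre-aligned sequences.

    Both sequences must have the same length. Comparison is
    case-insensitive. A gap character paired with a base counts as a
    difference; this is a simplifying assumption of the sketch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a != b)


def percent_identity(seq_a, seq_b):
    """Return percent identity over the full aligned length."""
    if not seq_a:
        raise ValueError("sequences must not be empty")
    diffs = count_differences(seq_a, seq_b)
    return 100.0 * (len(seq_a) - diffs) / len(seq_a)


if __name__ == "__main__":
    # Toy 8-base sequences that differ at two positions.
    a = "ACGUACGU"
    b = "ACGAACGC"
    print(count_differences(a, b))
    print(percent_identity(a, b))
```

Identity values reported by alignment programs can differ from this simple column count depending on how gaps and alignment length are treated, so the sketch should be read as an illustration of the arithmetic only.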
The 3′ halves of two miRNAs, Mdo-miR-1546, located on chromosome 3, and Mdo-miR-1547, located on chromosome 4, were each found to be masked by our analysis. For Mdo-miR-1546, the last 38 bases (42%) of its 89-base precursor were masked as a sequence identified as the 3′ UTR of a marsupial L2 transposed element. For Mdo-miR-1547, the last 49 bases (54%) of its 91-base precursor were masked as a sequence identified as the 3′ end of a marsupial RTE1 LINE. The similarity value for both of the masked sequences is ~0.75, and the transposed sequences all continue well past the 3′ ends of the precursors. Both microRNAs are conserved between the genomes of M. domestica and the tammar wallaby, with two single-base differences seen between Mdo-miR-1546 and Meu-miR-1546 and two between Mdo-miR-1547 and Meu-miR-1547. In none of these four cases does the single-base difference occur in a region of the precursor crucial for target specificity. Two of the substitutions lie near the termini of the stems, one substitution lies within the miR-1547 loop, and another lies at position 19 of the mature miR-1546 sequence.
The final masked M. domestica microRNA is Mdo-miR-340. Initially thought by us to be marsupial-specific, this microRNA is now assigned to the miR-340 family, which is found throughout the eutherian (placental) mammals, on the basis of precursor and mature sequence similarities that were not immediately obvious upon initial examination. Annotation of Mdo-miR-340 indicates that its origin is in the pairing of two nearly identical remnants of a Mariner DNA transposon lying in opposite orientations (Fig. 3A). This juxtaposition, which could have resulted from an inverted duplication, yields a highly stable hairpin (ΔG = −41.1 kcal/mol). The tammar wallaby ortholog shows only 86.6% identity with the M. domestica ortholog over the full length of the precursor and a substantially lower hairpin stability (ΔG = −31.9 kcal/mol); nevertheless, the mature sequence differs only in position 19 of 22. Interestingly, when we used miR-340 precursor sequences from human, chimpanzee, rhesus macaque, mouse, rat, cow, and dog, archived in miRBase (Release 12.0), to screen GenBank TRACE Archives for additional marsupial and placental sequences, we did not obtain any additional marsupial examples, but we did identify miR-340 precursor sequences for a further seventeen placental mammal species including a perissodactyl (horse), a cetacean (dolphin), two chiroptera (brown bat and fruit bat), two lagomorphs (rabbit and pika), armadillo, and tenrec. An alignment of miR-340 precursors of these species with opossum and tammar wallaby is shown in Fig. 3B. Precursor sequences among the placental mammal species show the effects of purifying selection expected for microRNAs, while the two marsupials are not only quite dissimilar to each other but even more so to the placental mammals.
This raises the question as to whether or not marsupial miR-340 actually is orthologous to placental miR-340. Three observations bear directly on this question. First, the mature miRNA cloned from M. domestica is different from that cloned in the placental mammals. Recently, Wheeler et al. (2009) discussed a phylogenetic miRNA phenomenon they term “seed shift” in which the mature sequence is seen to shift 1–2 nt in either the 5′ or 3′ direction, thereby changing the seed at positions 2–8. These seed shifts are phylogenetically conserved between two or more taxa. The five-base shift we observe in miR-340 would clearly be an extreme example of this phenomenon. Second, the 3′ stem sequence in placental miR-340 precursors retains insufficient similarity to the Mariner transposed element seen in the marsupial sequences to be identified as such in any of the twenty-four species in which we have found it. Sequence differences of this magnitude have not been observed in any of the other microRNAs conserved among the Mammalia (Devor and Peek, 2008). Third, the context of miR-340 in the M. domestica genome is different from that in the placental mammals. Among the placental mammals, miR-340 lies in intron 2 of the ring finger protein gene RNF130. In the M. domestica genome, miR-340 maps near the p-terminal end of chromosome 1 whereas RNF130 maps near the middle of the q-arm, more than 300 Mb distant. Moreover, the region in which Mdo-miR-340 is found is not syntenic with the location of RNF130 on human chromosome 5, but M. domestica RNF130 is. No other conserved miRNA in the M. domestica genome lacks synteny with its ortholog in the human genome (Devor and Samollow, 2008). The most parsimonious explanation for the phylogenetic divergence pattern seen here for miR-340 would be that all miR-340s are descended from a common ancestor and that, following metatherian–eutherian divergence, there was rapid divergence of the mature sequence resulting in changes in the foldback sequence that were subsequently preserved under purifying selection. However, the observations presented here appear to favor a less parsimonious hypothesis.
The data presented suggest that the miRNAs identified as miR-340 in the metatherian and eutherian lineages are in fact different microRNAs that coincidentally evolved from the same sequence of the same TE. Thus, Mdo-miR-340 is indeed a lineage-specific miRNA.
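The effect of a seed shift, as discussed in the miR-340 comparison, can be illustrated with a short sketch: moving the start of the mature sequence along the hairpin arm changes which bases occupy positions 2–8. The function names and the 30-base arm sequence below are hypothetical and chosen only for illustration; they are not the miR-340 sequence, and the 22-base mature length is an assumption of the example.

```python
def mature_from_arm(arm, offset, length=22):
    """Excise a mature sequence of `length` bases starting at `offset`.

    `offset` is 0-based along the hairpin arm. The default length of
    22 bases is an assumption made for this illustration.
    """
    if offset < 0 or offset + length > len(arm):
        raise ValueError("mature sequence would fall outside the arm")
    return arm[offset:offset + length]


def seed(mature, start=2, end=8):
    """Return the seed, using 1-based inclusive positions (2-8 by default)."""
    if len(mature) < end:
        raise ValueError("mature sequence is shorter than the seed end")
    return mature[start - 1:end]


if __name__ == "__main__":
    # Toy hairpin arm, not a real miRNA sequence.
    arm = "AACCGGUUACGUACGUAGCUAGCUAGGAUC"

    unshifted = mature_from_arm(arm, offset=0)
    shifted = mature_from_arm(arm, offset=5)  # a five-base shift

    print(seed(unshifted))
    print(seed(shifted))
```

Because target recognition depends heavily on the seed, two mature sequences cut from the same arm at different offsets can present entirely different seeds even though most of their bases overlap.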
Smalheiser and Torvik (2005, 2006), Borchert et al. (2006), Piriyapongsa et al. (2007), and Lanier et al. (in preparation) have all presented evidence that microRNAs can evolve from sequences of transposed elements. Moreover, both Piriyapongsa et al. (2007) and Lanier et al. (in preparation) suggest that transposed elements are an unappreciated source of lineage-specific microRNAs. We have presented evidence herein that half of the lineage-specific microRNAs that have been discovered in the marsupial species M. domestica show evidence of lineage-specific TE origin. This evidence further strengthens the position that TEs are indeed fertile ground for the emergence of new microRNAs.
Portions of this paper were presented at the 2nd International Conference and Workshop “Genomic Impact of Eukaryotic Transposable Elements” Asilomar, California February 6–9, 2009. Funding for this work was provided in part by the National Institutes of Health Grant RR014214 to PBS.