Alternative splicing (AS) has emerged as a major mechanism for regulating gene expression and function in animals, particularly, in mammals. Large-scale studies based on mapping of expressed sequence data on genomic sequences have produced estimates of as many as 30–60% of human genes undergoing alternative splicing [1
]. The impact of alternative splicing on protein function has been studied in great detail and is generally recognized as a source of protein diversity that expands the repertoire of protein function [6
In contrast, little is known about the prevalence and impact of alternative events, such as AS and alternative transcription initiation (ATI), in 5'-untranslated regions (5'UTRs). AS and ATI are the primary sources of 5'UTR transcript diversity, and several reports have conjectured that these mechanisms might play an important role in orchestrating complex regulatory mechanisms within the 5'UTRs [9
]. Estimates of the number of genes with alternative 5'UTRs vary from 12% [14
] to 22% [5
], while estimates of alternative promoter usage range from 10% [15
] to 18% [16
]. Anecdotally, studies have shown that alternative events are responsible for 5'UTR transcript diversity in mammals, but to our knowledge, there have been no detailed, genome-wide studies aimed at elucidating the functional role of transcript diversity in mammalian 5'UTRs.
The bias toward studying AS in the coding regions versus 5'UTR probably reflects two obvious sources of complications in the analysis of transcript diversity in UTRs. First, it is easier to assess the functional impact of AS in protein-coding regions because elimination or disruption of known protein domains resulting from AS is readily interpretable [6
]. Second, boundaries of the coding sequence typically are identified with relative ease because protein sequences are defined by their start and stop codons whereas precise delineation of the 5'UTR often is problematic.
Translational efficiency of eukaryotic mRNAs depends on the presence of regulatory elements within the 5'UTR. It has been shown that the occurrence of initiation codons and open reading frames upstream of the authentic start codon (uAUGs and uORFs, respectively) can affect the translation of mRNA into protein. The presence of uAUGs and uORFs in mammalian 5'UTRs is typically associated with translational repression [10
] but cases of increased translation efficiency also have been reported [18
]. A complementary computational study has demonstrated substantial conservation of uAUGs and uORFs in 5'UTRs of mammalian mRNAs, suggesting that at least some of these elements are functionally important [19
]. These experimental and computational findings raise the possibility that 5'UTR diversity has the potential to produce mRNA isoforms that differ with respect to their uAUG and uORF content, which could be an important facet of the regulation of translation.
Experimental evidence shows that 5'UTR transcript diversity is achieved during transcription, via ATI, and after transcription, via AS. In some instances, both mechanisms are employed. For example, AS in the 5'UTR of human axin2, a negative regulator of Wnt/B-catenin signaling, generates three isoforms with different arrangements of uAUGs and uORFs, resulting in a set of 5'UTRs that each confer different mRNA stabilities and translational efficiencies upon the respective isoforms [20
]. Similarly, an alternatively spliced exon located in the 5'UTR of neuronal nitric-oxide synthase (nNOS) has been shown to introduce a translational control element that inhibits translation of the mRNA [21
]. The diversity and complexity of the mu-opioid receptor gene expression is achieved by a combination of alternative splicing and alternative promoter usage [11
]. Translational repression of the mouse mu-opioid receptor expression using uORFs and leaky scanning has been recently reported [17
The significance of the ATI and AS mechanisms in generating 5'UTR transcript diversity lies with the ability to alter the 5'UTR landscape by rearranging both the number and type of translational control elements included in each transcript. Slight differences in the arrangement of translational control elements between isoforms can lead to major changes in regulatory effects on translation. For instance, translational regulation of the multidrug resistance-associated protein 2 (Mrp2) is mediated by ATI. The 5'UTR of Mrp2 contains four different transcription start sites and three uORFs, and experimental data show that uORF3 has a much stronger inhibitory effect on translation than uORF1 and uORF2 [13
]. A combination of ATI and AS in Dicer, a ribonuclease that mediates RNA interference at the transcriptional and post-transcriptional levels, appears to regulate translational efficiency as well, resulting in long and short transcript variants. Both variants encode uAUGs (9 in the long form and five in the short form), and although both forms show decreased levels of translation, the longer form appears to exhibit greater inhibitory effects, probably, because of the increased number of uAUGs [9
]. Likewise, Tie2, an endothelium-specific receptor tyrosine kinase required for blood vessel maturation, contains multiple transcription start sites and encodes five uORFs. Apparently, the greater the number of uORFs contained in the 5'UTR, the greater the inhibitory effects on translation, suggesting an accumulative effect on the overall level of translational efficiency [12
In addition, it has been shown that both evolutionarily conserved and non-conserved regulatory control elements exhibit inhibitory effects, suggesting that non-conserved control elements may regulate at a species-specific level. For example, human oncogene mdm2 contains two uORFs, both conserved in mouse. Although the inhibitory effect of uORF1 exceeds that of uORF2, both uORFs are required for maximum inhibition of translation [22
]. In contrast, the human stress response transcription factor ATF5 contains two uORFs that are conserved in mouse, as well as three non-conserved species-specific uAUGs. The alternative transcript encoding the conserved uORFs exhibits a stronger inhibitory effect than the transcript encoding the non-conserved uAUGs, but the effect of the non-conserved elements is non-negligible as well [23
We performed a genome-wide comparative study of 5'UTR sequences in primates and rodents with the principal goal of understanding how alternative events impact regulation of translation. To this end, we compared the abundance, conservation and evolution of translational control elements within alternative and constitutive regions of 5'UTR. Alternative nucleotides were not classified according to type of alternative event, because the focus of the study was to examine the broad impact of transcript diversity on mammalian 5'UTRs. Accordingly, we reasoned that the regulatory effects of uAUGs and uORFs located in alternative regions of 5'UTRs should be the same regardless of whether the existing transcript diversity for the given gene is generated via ATI, AS, or both. We find that, although alternative regions of mammalian 5'UTRs evolve faster and are subject to a weaker purifying selection than constitutive regions, they possess extensive potential for translation regulation.