Mammalian genomes encode numerous natural antisense transcripts, but the function of these transcripts is not well understood. Functional validation studies indicate that antisense transcripts are not a uniform regulatory RNA group, but instead belong to multiple categories with some common features. Recent evidence indicates that antisense transcripts are frequently functional and use diverse transcriptional and post-transcriptional gene regulatory mechanisms to carry out a wide variety of biological roles.
Natural antisense transcripts are RNA molecules that are transcribed from the opposite DNA strand compared with other transcripts and overlap in part with sense RNA. Both sense and antisense RNAs can encode proteins or be non-protein-coding transcripts; however, the most prominent form of antisense transcription in the mammalian genome is a non-protein-coding antisense RNA partner of a protein-coding gene1. The presence of non-protein-coding sense and antisense transcript pairs implies that natural antisense transcripts may also regulate non-protein-coding sense RNAs.
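The definition above can be restated as a simple coordinate test. The following is a minimal sketch, assuming each transcript is described by a chromosome, a strand and half-open genomic coordinates; the `Transcript` class, the `antisense_overlap` function and the example coordinates are hypothetical and serve only as illustration. It considers genomic overlap of whole transcription units only and ignores exon structure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transcript:
    name: str
    chrom: str
    strand: str  # "+" or "-"
    start: int   # 0-based, inclusive
    end: int     # 0-based, exclusive


def antisense_overlap(a: Transcript, b: Transcript) -> int:
    """Return the overlap length in nucleotides if a and b form a
    cis sense-antisense pair (same chromosome, opposite strands,
    overlapping coordinates); otherwise return 0."""
    if a.chrom != b.chrom or a.strand == b.strand:
        return 0
    return max(0, min(a.end, b.end) - max(a.start, b.start))


# Hypothetical example loci
sense = Transcript("geneA", "chr1", "+", 1000, 5000)
antisense = Transcript("geneA-AS", "chr1", "-", 4200, 6500)
same_strand = Transcript("geneB", "chr1", "+", 4200, 6500)

print(antisense_overlap(sense, antisense))    # 800
print(antisense_overlap(sense, same_strand))  # 0
```

A transcript on the same strand is never reported as an antisense partner, however large the coordinate overlap.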
Various techniques have been used to identify natural antisense transcripts, including large-scale sequencing of cDNA clones1, tiling arrays2, bioinformatics analysis of RefSeq and EST databases3, hybridization techniques4, SAGE libraries5, strand-specific microarrays6 and, most recently, asymmetric strand-specific analysis of gene expression (ASSAGE)7 and global run-on sequencing (GRO-seq)8. These studies have demonstrated widespread antisense transcription in mammalian genomes. Mammalian transcriptomes have not yet been sequenced deeply enough to reveal the organization of all low-copy-number antisense transcripts. Nevertheless, the majority of transcription units (TUs; see Glossary box) appear to contain antisense transcripts1. Moreover, antisense transcripts have been documented for many active promoters8-10. Therefore, the presence of antisense transcripts is no longer a curiosity but rather a pervasive feature of mammalian genomes. For instance, the FANTOM-3 mouse transcriptome sequencing consortium identified natural antisense transcripts for more than 70% of TUs, the majority of which represent non-protein-coding RNA1.
Antisense transcripts are not evenly distributed across the genome. Both ends of protein-coding genes have a propensity for natural antisense transcription7,11; specifically, antisense transcription is enriched 250 nucleotides upstream of the transcription start site (TSS)8,9 and 1.5 kilobases downstream of sense genes8,12. The basal expression levels of sense and antisense transcripts in different tissues and cell lines may be either positively or negatively correlated1,13. Moreover, antisense transcript expression in different human cell lines is not always linked to expression of the sense gene, which suggests the use of alternative transcriptional regulatory elements7. Antisense RNAs have a tendency to undergo fewer splicing events and typically show lower abundance compared with sense transcripts7. Interestingly, knockdown or blockade of endogenous antisense transcripts can have multiple outcomes, with the corresponding sense transcript concentration showing either an increase (discordant regulation) or a decrease (concordant regulation)14. It has been proposed that discordant de-repression of sense transcript expression, resulting in upregulation of sense RNA expression, can be achieved by removal or steric blockade of many but not all antisense transcripts1,14. These and other variable intrinsic properties imply that antisense-mediated regulation of gene expression must operate through a variety of mechanisms, and further suggest that antisense transcripts are a heterogeneous group of regulatory RNAs.
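The distinction between discordant and concordant regulation can be made concrete with a small classifier. This is only a sketch under stated assumptions: the function name, the arbitrary expression units and the 10% `tolerance` cutoff are hypothetical and are not taken from the cited studies.

```python
def classify_regulation(sense_before: float, sense_after: float,
                        tolerance: float = 0.1) -> str:
    """Classify the response of a sense transcript to antisense knockdown.

    sense_before / sense_after: sense transcript levels (arbitrary units)
    measured before and after knockdown of the antisense transcript.
    tolerance: hypothetical relative-change cutoff below which the
    sense level is treated as unchanged.
    """
    if sense_before <= 0:
        raise ValueError("sense_before must be positive")
    change = (sense_after - sense_before) / sense_before
    if change > tolerance:
        return "discordant"  # antisense reduced, sense increased
    if change < -tolerance:
        return "concordant"  # antisense reduced, sense decreased
    return "unchanged"


print(classify_regulation(1.0, 1.8))   # discordant
print(classify_regulation(1.0, 0.4))   # concordant
print(classify_regulation(1.0, 1.05))  # unchanged
```

In practice the cutoff would depend on the measurement noise of the assay used.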
Supplementary information S1 (table) contains all reported functional antisense transcripts in mammalian genomes characterized to date. This growing list of validated sense–antisense transcript pairs includes many important developmental genes, as well as genes known to be involved in a variety of human disorders. Natural antisense transcripts must therefore be assumed to exert positive or negative regulation at different levels of gene expression invoking diverse and largely uncharacterized co-regulatory factors. Hence, potential involvement of natural antisense transcripts should be considered in a wide range of future studies on the regulation of gene expression.
There are a number of proposed mechanisms for antisense-mediated regulation of sense mRNA. Here, we categorize the mechanisms into four main groups: mechanisms related to transcription, RNA–DNA interactions, RNA–RNA interactions in the nucleus and RNA–RNA interactions in the cytoplasm. Each type of mechanism will be discussed in detail below.
According to this mechanism, the act of transcription in the antisense direction, but not the antisense RNA molecule itself, modulates transcription of sense RNA. Transcription in the antisense direction is suggested to cause alterations in sense RNA, as in the case of transcriptional collision, and is also proposed to be involved in genomic rearrangements (Figure 1).
The transcriptional collision model is based on the assumption that RNA polymerases bind to the promoters of convergent genes on opposing strands of DNA and then proceed towards the 3′ end of sense and antisense genes. RNA polymerase complexes collide in the overlapping region, blocking further transcription15 (Figure 1a). Collision of RNA polymerases has been observed in Escherichia coli, using atomic force microscopy16. Transcriptional collision has also been observed in Saccharomyces cerevisiae17, where transcription from the antisense direction halts transcription of the sense mRNA. A bioinformatics study found that, in humans and mice, the length of the overlapping region between sense and antisense transcripts is inversely correlated with the expression level of the antisense transcript. This finding is consistent with a possible clash of RNA polymerases, because the likelihood of collision increases as the length of the overlapping region increases18. In this model, the antisense RNA per se is not involved in interfering with transcription of the sense RNA: instead, it is transcription in the antisense direction that decreases synthesis of sense RNA.
Transcriptional collision may affect a subset of mammalian cis-encoded natural antisense transcripts, but it is apparently not the predominant route of antisense-mediated gene regulation. Transcription of the sense and antisense RNAs may occur at different times at the same chromosome locus, or sense and antisense RNA may be transcribed simultaneously from paternal and maternal chromosomes. Allele-specific transcription may explain why the X chromosome shows a significantly lower degree of antisense transcription compared with other chromosomes1.
Immunoglobulin production in B lymphocytes and receptor selection by T lymphocytes depend on transcription from hypervariable regions. To generate variability, T and B lymphocytes need a recombination process that occurs through hypermutation in the variable regions of immunoglobulin and T cell receptor genes. Activation-induced cytidine deaminase (AID), which deaminates deoxycytidine to deoxyuridine in single-stranded DNA (ssDNA), is required for both somatic hypermutation (SHM) and class switch recombination (CSR), two processes that are involved in the secondary immune response. DNA must locally uncoil or melt, and become single-stranded, for DNA recombination, replication and transcription to occur. The transcription complex consists of an RNA polymerase molecule enclosing approximately 17 ± 1 melted bases (the transcription bubble) of the template DNA19; however, the size of the transcription bubble may vary in different instances. In vitro, the AID enzyme optimally deaminates a small ssDNA bubble of ~5 nucleotides containing an ssDNA motif called WRC20. It has been hypothesized that antisense transcription in the variable region could make the ssDNA accessible for AID21,22 (Figure 1b).
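As an illustration of the WRC motif, the sketch below scans a DNA sequence for candidate sites, using the standard IUPAC nucleotide codes (W = A or T, R = A or G). The function name and the example sequence are hypothetical; the scan covers only the strand supplied and says nothing about whether a site is actually single-stranded or deaminated in vivo.

```python
import re
from typing import List, Tuple

# IUPAC codes: W = A or T, R = A or G. The lookahead allows
# overlapping motif occurrences to be reported.
WRC = re.compile(r"(?=([AT][AG]C))")


def find_wrc_sites(seq: str) -> List[Tuple[int, str]]:
    """Return (0-based position of the C, matched trinucleotide)
    for every WRC motif on the given strand."""
    seq = seq.upper()
    return [(m.start() + 2, m.group(1)) for m in WRC.finditer(seq)]


# Hypothetical example sequence
print(find_wrc_sites("GAGCTTACAAGCT"))
# [(3, 'AGC'), (7, 'TAC'), (11, 'AGC')]
```

To cover both strands, the same function could be applied to the reverse complement of the sequence.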
Antisense transcription is frequently observed in the variable23,24, but not in the constant, regions and it may be initiated in preparation for SHM and CSR (reviewed in Ref. 25). According to this model, antisense transcription is proposed to open the transcription bubble and expose the ssDNA, which is a substrate for AID23,24, or antisense transcription may be involved in remodelling of the chromatin structure to make the DNA sequence accessible for recombination22-24. In both of these scenarios, transcription from the antisense direction, but not the antisense RNA molecule, is important for genomic rearrangement.
Natural antisense transcripts may also be involved in epigenetic regulation of transcription, through DNA methylation26, chromatin modifications27 and monoallelic expression (for example, genomic imprinting28, X-chromosome inactivation29 and random monoallelic exclusion of the autosomal loci). This gene regulatory model is based on direct or indirect RNA–DNA and RNA–chromatin interactions30. Antisense transcripts may bind to the corresponding DNA strand, either resulting in DNA methylation (Figure 2a) or providing a scaffold for histone-modifying enzyme (HME) recruitment and subsequent alterations of the chromatin status (Figure 2b).
Some of the antisense-mediated epigenetic changes are reported to be independent of Dicer, arguing against a role for small RNA mediators27. Therefore, antisense RNA processing to small RNA, which may be mediated by Dicer in the cytoplasm, is not necessarily required for epigenetic modifications. Instead, antisense transcripts accumulate locally and trigger DNA or chromatin modifications31. These modifications then expand to neighbouring regions, even though these adjacent regions do not exhibit complementarities to the original antisense transcript32. The secondary expansion of the modifications may be restricted to the promoter or enhancer of a single gene (random monoallelic exclusion), or may include a cluster of genes, such as in the case of genomic imprinting of the Kcnq1 imprinted locus28. Finally, this expansion process occasionally involves an entire chromosome, such as X chromosome inactivation in females.
Natural antisense transcripts have been proposed to cause DNA methylation26, DNA demethylation33 and chromatin modifications of non-imprinted autosomal loci27,34. Suppression of transcription is usually caused by DNA and chromatin modification at the promoter region of the sense strand (Supplementary information S2 (box)). For example, an antisense RNA for the α-globin 2 gene (HBA2) can induce DNA methylation, leading to silencing of the α-globin 2 gene26.
Antisense-mediated transcriptional silencing also affects the p15 (Ref. 27), p21 (Ref. 34) and progesterone receptor (PR; Ref. 35) genes, through DNA methylation and heterochromatin formation. Suppression of sense transcription is often induced by trimethylation of histone H3 lysine 27 (H3K27me3) at the sense promoter region. For example, an antisense transcript of the tumour suppressor gene p21 recruits a regulatory complex that induces H3K27me3 and suppression of the sense promoter region34. Considering the presence of RNA-binding motifs in some chromatin-modifying enzymes30, one could postulate that RNA transcripts are local modulators of chromatin structure. This proposed mechanism might also explain functionality even if the abundance of the antisense RNA molecule is low9,10,34. Unlike RNAs that can be present in cells in many copies, there are two copies of DNA for any given gene in cells; therefore, only two molecules of antisense RNA per cell are sufficient to bind to the corresponding DNA strand and to exert a regulatory function. These proposed local effects of antisense transcripts might sometimes extend to the neighbouring regions, such as in the case of the imprinted locus28 (see below). These examples suggest the possibility of local and extended antisense RNA-induced DNA and chromatin modifications.
Overlapping transcription of small (<50 nucleotides) RNAs36,37 as well as promoter-associated small RNAs (PASRs)36,37, termini-associated small RNAs (TASRs)36,37 and promoter upstream transcripts (PROMPTs)10 have been documented using strand-specific genomic tiling arrays in the ENCODE region of the human genome36,37. Transcription start site-associated RNAs (TSSa-RNAs) (20-90 nucleotides) have also been identified using deep sequencing techniques8,9. The enrichment of antisense transcripts in both promoter and terminal regions has been confirmed by the unbiased ASSAGE technique7. TSSa-RNAs flank active promoters in both the sense and antisense direction with regard to downstream protein-coding genes8,9. Divergent promoter activity, which is believed to produce this small RNA group, is documented for more than half of mouse and human genes8,9. These well-documented promoter activities, which generate promoter-directed sense and antisense transcripts, challenge the canonical ‘gene’ definition and how genes and their regulatory elements are arranged in mammalian genomes. Mammalian transcriptomes appear to contain an abundance of nested and overlapping gene structures, giving rise to both coding and noncoding transcripts. In addition, it seems that the RNA regulatory elements that control the expression of a gene are frequently distributed within or beyond other genes in both directions. Collectively, pervasive divergent promoter activity suggests a lack of definite 5′ and 3′ boundaries for the transcribed genes.
Interestingly, these small antisense transcripts do not correspond to annotated natural antisense transcripts36,37. There is also no evidence for double-stranded RNA (dsRNA) or hairpin RNA precursors that might represent intermediates in the biogenesis of such small natural antisense transcripts36,37. Generation of TSSa-RNA is also shown to be independent of Dicer, suggesting that the possible regulatory function of these transcripts is not mediated through common RNA interference (RNAi), such as RISC and RITS pathways. Additionally, the expression of promoter-directed antisense transcripts may be associated with the presence of abortive RNA transcripts38.
The transcribed regions of PROMPTs and TSSa-RNAs overlap with paused RNA polymerase II (RNAPII) and active chromatin marks, such as trimethylation of histone H3 lysine 4 (H3K4me3), suggesting a role for local RNA accumulation in maintaining the dynamic chromatin state required for promoter activity9,10. The abundance of short RNA transcripts at the transcription initiation and termination sites8,9 as well as transcription of unstable transcripts upstream of genes10, which are usually co-localized with particular chromatin marks, imply involvement of these local and transient RNA transcripts in the regulation of the sense gene expression and possible participation in chromatin remodelling. We hypothesize that the frequent presence of these RNA classes in or around the promoter region of actively transcribed genes, in a very low concentration, suggests the possibility of local RNA-induced chromatin modifications.
Imprinted genes are genes for which only one allele, maternal or paternal, is actively transcribed. Natural antisense transcripts are often associated with imprinted genes, with a frequency up to 81% in one study1 (Supplementary information S3 (box)). More than 160 imprinted genes have been identified so far in humans and mice, most of which are organized into clusters. Some imprinted genes, including insulin-like growth-factor type-2 receptor (IGF2R)39 and potassium voltage-gated channel, KQT-like (Kcnq1) imprinting control region31, exhibit guided chromatin and DNA modification by antisense RNA that expands to include neighbouring genes. These effects are not mediated through the RNAi pathway40. Instead, antisense RNA appears to recruit repressor complexes that modify chromatin into an inactive state. Suppressive chromatin modifications spread in both directions to neighbouring genes, similar to X chromosome inactivation (below) but with limited penetrance31.
X chromosome inactivation is required for balanced expression of the genes on the X chromosome in female mammals. This dosage compensation occurs through heterochromatin formation along the X chromosome to be inactivated. Two long non-protein-coding genes are transcribed from the X chromosome inactivation centre, XIST (X-inactive specific transcript) and TSIX (X [inactive]-specific transcript, antisense). These two sense and antisense transcripts control the silencing of the X chromosome29. The X chromosome inactivation centre is necessary and sufficient for X chromosome inactivation. Elements of the mechanism have been demonstrated by showing that Tsix silences Xist through modification of the chromatin structure in the Xist promoter region. Premature termination of Tsix transcription abolishes the repressive chromatin configuration at the Xist promoter29.
The third type of mechanism for antisense transcript-mediated gene regulation is based on the predicted nuclear sense–antisense RNA duplex formation (Figure 3Aa). The duplex RNA between sense and antisense transcripts may result in several outcomes, all of which modulate sense mRNA expression. Nuclear RNA duplex may produce mRNA transcripts with alternative splicing. Additionally, hybrid nuclear RNA is a substrate for editing enzymes that can alter the localization, transport and stability of the sense mRNA transcript.
Antisense RNA may also bind to the sense RNA, masking the splice sites and thereby changing the balances between splice variants (Figure 3Ab). Thyroid hormone receptor alpha gene (TRα) is an example where the antisense transcript RevErbAα influences splicing of TRα1 and TRα2 mRNAs41. Using a similar arrangement, antisense transcripts can potentially cause alternative polyadenylation and termination of transcription.
Natural antisense transcripts may modulate mRNA nuclear transport by a mechanism that involves duplex formation between sense and antisense RNAs. Nuclear retention of the antisense RNA is commonly observed and may account for some antisense RNA-mediated regulation. Some cellular stressors, such as hypoxia, serum starvation and hydrogen peroxide, can change the nuclear retention pattern of antisense transcripts, thereby altering the levels of their sense partners42. Direct interactions with nuclear proteins or other nuclear RNAs likely cause nuclear retention of antisense transcripts, although the precise mechanism for each case has not yet been discovered.
Natural antisense transcripts have also been linked to mRNA editing (Figure 3Ac). Interaction between the Drosophila melanogaster 4f-rnp gene and its antisense transcript, sas-10, is reported to induce adenosine-to-inosine (A-to-I) editing in the overlapping region of 4f-rnp mRNA43. A-to-I RNA editing is induced by dsRNA formation, resulting in the recruitment of the enzyme ADAR (adenosine deaminases that act on RNA), which deaminates the targeted adenosine to inosine44.
In the fourth proposed mechanism of sense–antisense interference, a duplex forms between sense and antisense RNA in the cytoplasm (Figure 3Ba–c). Cytoplasmic RNA hairpins may affect sense mRNA stability or translation. The sense–antisense RNA duplex may also, in theory, cover microRNA (miRNA)-binding sites or serve as a hairpin template for generating endogenous small interfering RNA (siRNA).
Cytoplasmic sense–antisense duplex formation can alter sense mRNA stability and translation efficiency. The overlapping region may affect mRNA stability by reducing mRNA decay, whereby mRNA undergoes endo- or exonucleolytic degradation by various RNases. Indeed, we have recently demonstrated that an antisense BACE1 transcript (BACE1-AS) increases the stability of BACE1 mRNA through a mechanism involving RNA duplex formation. We hypothesize that transient RNA duplex formation may alter the secondary or tertiary structure of BACE1 and thereby increase its stability42.
Antisense transcripts for inducible nitric oxide synthase (iNOS), an important gene in inflammatory diseases, increase the stability of iNOS mRNA45. Enhancement of iNOS mRNA stability is mediated through interactions of antisense RNA molecule with the AU-rich element (ARE)-binding HuR protein. The HuR protein, in turn, may suppress RNA degradation by inhibiting deadenylase or exonuclease enzymes45. Alterations in the secondary structure of HIF-1α sense transcript by an antisense RNA may expose the ARE and reduce the stability of the transcript by making this RNA prone to degradation46.
Translational inhibition is yet another proposed function for some naturally occurring antisense transcripts, as in the case of the B cell maturation antigen (BCMA) transcript, where over-expression of the antisense RNA has been reported to reduce sense protein, but not sense mRNA, levels47. Another well-documented case of translational inhibition is the antisense transcript for PU.1 mRNA. PU.1 mRNA translation is inhibited by a noncoding antisense transcript48, which is a polyadenylated RNA with a lower concentration but a longer half-life than the sense PU.1 transcript. Processed antisense RNA in the cytoplasm may bind to the sense transcript and stall translation between the initiation and elongation steps48. It also seems likely that formation of duplexes may alter ribosome entry, and consequently the protein output of the target.
We propose that many natural antisense transcripts may have the ability to cover miRNA-binding sites following cytosolic RNA duplex formation. This appears to be the case for an antisense BACE1 transcript (BACE1-AS) in addition to its ability to increase the stability of BACE1 mRNA42. Thus, one of the regulatory functions of antisense transcripts may be to ‘mask’ the miRNA-binding site on the sense mRNA. The 3′UTRs of mRNAs frequently contain target sites for miRNAs, and at least 34% of natural antisense transcripts in the FANTOM3 dataset show a tail-to-tail arrangement in which the overlapping region lies in the 3′UTR.
The presence of endogenous processing machinery for exogenous siRNAs, which mediate sequence-specific knockdown of targeted genes, implies that endogenous siRNAs should exist. Endogenous siRNAs derived from natural antisense transcripts were observed in Arabidopsis thaliana, where they regulate salt tolerance49. Plant endogenous siRNAs are derived from sense–antisense RNA duplex formation at several genes, for example the Sho gene antisense transcript50 and the SRLK/AtRAP antisense transcript51; moreover, 64% of protein-coding natural antisense transcripts in A. thaliana are reported to generate endogenous siRNAs52.
Endogenous siRNAs derived from transposable elements and pseudogenes have also been identified in mouse oocytes and cultured human cells53-57. Endogenous siRNAs originating from mRNAs and their corresponding antisense were also recently identified in mouse oocytes54 and human HepG2 liver carcinoma cells55. Both Piwi-interacting RNAs (piRNAs)58,59 and siRNAs originating from mRNAs were found in mouse oocytes54. Therefore, formation of endogenous siRNA from naturally occurring antisense transcripts appears to occur in mammalian cells.
Transposable elements, inverted repeat structures, sense–antisense genes (cis-natural antisense transcript) and antisense transcripts from remote loci (trans-natural antisense transcripts) have also been recognized as sources of dsRNAs and subsequent, Dicer-dependent, endogenous siRNA production54,55. One example includes the overlapping transcripts in the kinesin family member 4A (KIF4A) and PDZ domain-containing 11 (Pdzd11) locus that generates endogenous siRNA derived from its cis-antisense transcript. Importantly, almost all of the endogenous siRNAs (117 unique sequences) are derived from the overlapping region of the sense and antisense RNA, suggesting that these endogenous siRNAs are produced from an intermolecular dsRNA formed between the oppositely oriented transcripts. In Dicer mutants, levels of the siRNAs derived from the Pdzd11/Kif4 locus are decreased, and Pdzd11 and Kif4 mRNA levels are increased, suggesting that endogenous siRNA production is cytoplasmic and Dicer dependent, and that expression of both sense and antisense RNAs is regulated by the RNAi pathway54. Therefore, endogenous siRNAs may regulate both sense and antisense transcript levels.
Co-expression of natural antisense transcripts with their sense counterparts, as well as frequently observed concordant regulation of sense and antisense RNAs in many tissues and cell lines, argues against endogenous siRNA being the sole mechanism of antisense-mediated regulation of gene expression. Additionally, many co-expressed sense–antisense transcripts in D. melanogaster S2 cells do not generate endogenous siRNAs60,61. It is unclear how the majority of co-expressed sense and antisense RNAs escape the endogenous siRNA formation pathway63. It is also not clear whether there is active selection for entry into the RNAi pathway and endogenous siRNA formation.
We have presented examples of functional natural antisense transcripts in order to show the multilayered involvement of these molecules in regulating gene expression. It can be concluded that antisense transcripts are not a uniform group of regulatory elements, but that they display some common features, including sequence complementarities to conventional sense genes. We have summarized the proposed mechanisms of antisense transcript actions into four models, each corresponding to a putative regulatory mechanism for certain antisense transcripts.
Although we have provided examples for each model, it is important to note that not all four models have equally strong experimental support. For instance, the frequently observed divergent promoter activity and engagement of multiple RNAPII complexes in the opposite orientation of active promoters would argue against the transcriptional collision model as a widely used mode of action. On the other hand, nuclear and cytoplasmic RNA–RNA interactions with various outcomes, such as RNA editing, splicing and dicing, have been observed for several functionally validated antisense transcripts, but these interactions do not appear to be a predominant basis of natural antisense regulation. Furthermore, antisense RNA-induced alteration in sense promoter DNA methylation has only a few documented examples and, considering the static nature of DNA methylation, we postulate that it might be operational only in certain early developmental stages. Conversely, antisense RNA-induced chromatin remodelling seems to be a more feasible and dynamic mode of action for many low-copy natural antisense transcripts. In the latter case, antisense RNA might predominantly act locally to maintain or modify chromatin structure and ultimately to activate or suppress sense gene expression. Natural antisense transcripts are involved in different gene regulatory pathways, but it is still not clear which intrinsic properties of natural antisense RNA molecules, or extrinsic features such as protein interactions and cellular and developmental context, are decisive for any given pathway.
It is becoming clear that the expression of natural antisense transcripts and other non-protein-coding RNAs is pervasive in the human genome, although the mechanisms of their production and preferential sites of action are poorly understood. In particular, the exact molecular machineries behind the antisense RNA-induced chromatin modifications are not currently known. It will be important to learn which histone-modifying enzymes might be involved, how they interact with antisense RNAs and which histone modifications they induce over time. It will also be of interest to define the exact contexts in which perturbation of the same antisense transcript35 results in either activation or repression of sense RNA expression. It is remarkable that these fundamental and powerful regulatory mechanisms relating to natural antisense transcription remain to be elucidated in large part. It can be safely assumed that examination of natural antisense phenomena will remain a rich field of investigation for the foreseeable future.
We are grateful to M. P. van der Brug, B. H. Miller, J. Silva and S. Brothers for insightful comments and careful reading of the manuscript. Discussions with Y. Hayashizaki, G. St-Laurent and other colleagues within the FANTOM Transcriptomics consortium have also been highly valuable to us.