For more than 30 years, sporadic reports have described the presence of circular mRNA transcripts in mammals. Many of these circular RNAs were discovered serendipitously, and were largely disregarded as nonspecific byproducts when they were found to be expressed at low levels. In contrast to this prevailing view on the abundance of circular RNA isoforms, we found strong evidence that circular isoforms of hundreds of human transcripts, with out-of-order splice junctions precisely at normal exon boundaries, were present at levels comparable to their canonical linear counterparts. While the canonical linear transcripts were sensitive to RNaseR treatment in all tested cases, transcripts with scrambled exons were generally resistant to RNaseR treatment. Such resistance is expected for circular RNAs, but not for RNAs with scrambled exons generated by template switching or other artifacts of reverse transcription from a canonical linear RNA. In an analysis of publicly available RNA-Seq data from HeLa and H9 human embryonic stem cells, we found that RNAs with scrambled exons were enriched in non-polyadenylated fractions. Taken together, these findings provide strong evidence that most scrambled exon sequences we detected in human RNA were derived from circular molecules.
While we have found that hundreds of human genes express circular RNA isoforms, the limitations of our experimental design may actually have led us to underestimate the prevalence of circular RNA isoforms. First, the size selection step during sequencing library preparation would miss small circular RNAs, or highly structured circular RNAs with fragmentation kinetics incompatible with our size selection. It is also possible that our size selection procedure would enrich for circular RNAs in the selected size range that failed to be fragmented. Second, although our sequencing depth was adequate to detect diagnostic junctional reads for many circular RNAs, it may not have been adequate to accurately identify and quantify rare circular isoforms, and our search for exon-exon junctions was restricted to exons annotated in RefSeq, which we know to be an incomplete catalogue of exon boundaries. Thus, we may still be underestimating the number or prevalence of circular RNAs in human cells.
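To make the notion of an annotation-restricted junction search concrete, the following is a minimal sketch, not the pipeline used in this study. It assumes that exons of a gene are numbered 1..n in annotated 5′ to 3′ order and labels a junction "scrambled" when the exon supplying the 3′ splice site (the acceptor) does not lie downstream of the exon supplying the 5′ splice site (the donor). The function names `classify_junction` and `candidate_junctions` are hypothetical, and the enumeration is purely combinatorial: it does not check whether a given exon actually carries a usable splice site (for example, a first exon has no 3′ splice site).

```python
from itertools import product


def classify_junction(donor_exon: int, acceptor_exon: int) -> str:
    """Classify an exon-exon junction by annotated exon order.

    donor_exon:    1-based index of the exon supplying the 5' splice site.
    acceptor_exon: 1-based index of the exon supplying the 3' splice site.

    Assumption for illustration: a junction is 'canonical' only when the
    acceptor exon lies strictly downstream of the donor exon; any other
    ordering (including an exon joined back to itself) is 'scrambled'.
    """
    if donor_exon < 1 or acceptor_exon < 1:
        raise ValueError("exon indices are 1-based and must be positive")
    return "canonical" if acceptor_exon > donor_exon else "scrambled"


def candidate_junctions(n_exons: int) -> dict:
    """Enumerate every ordered (donor, acceptor) pair among annotated exons.

    Only annotated exons are considered, so junctions that involve
    unannotated exon boundaries can never appear in this candidate set.
    """
    candidates = {"canonical": [], "scrambled": []}
    for donor, acceptor in product(range(1, n_exons + 1), repeat=2):
        candidates[classify_junction(donor, acceptor)].append((donor, acceptor))
    return candidates


if __name__ == "__main__":
    pairs = candidate_junctions(3)
    print("canonical:", pairs["canonical"])
    print("scrambled:", pairs["scrambled"])
```

Under these assumptions, a gene with n annotated exons yields n(n-1)/2 canonical and n(n+1)/2 scrambled candidate pairs. Because the candidate set is built only from annotated boundaries, any circular isoform that uses an unannotated exon is invisible to such a search, which is one reason the counts reported here may be underestimates.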
The previously unappreciated abundance and diversity of circular RNAs in human cells raises important questions: What is the molecular mechanism of circular splicing? Recent evidence suggests that canonical pre-mRNA splicing does not necessarily proceed in sequential order from the 5′ to 3′ end of the RNA [24]. In this case, an orphan 3′ splice site upstream of the acceptor exon could serve as the acceptor site for a downstream 5′ splice site that is not paired with its canonical splicing acceptor, producing a circular transcript. Such a model is depicted in . In a particular example of this model, an alternative promoter causes transcription to initiate within the first intron, creating an orphan 3′ splice site that is later used by a downstream 5′ splice site; this could result in a circular RNA with exon 2 as the acceptor. This model would be consistent with our finding of enrichment of exon 2 as the acceptor exon and was suggested in [25].
How widespread and evolutionarily conserved is circular splicing? Our preliminary analysis of ribosomal-RNA-depleted RNA from the mouse brain suggests there are hundreds of genes with scrambled isoforms in that organ; we would not be surprised to find that this phenomenon is pervasive across the animal kingdom and perhaps more broadly. In addition, while we have not found evidence of differential circular isoform expression between tissues, this area should be further explored.
What functions do circular RNAs serve and what roles might they play in normal human biology or disease? The vast majority of circular RNA molecules we detected were transcribed from a gene that is also known to encode a conventional linear mRNA.
In vitro studies have shown that circular RNAs can be translated, raising the possibility that some circular RNAs might encode proteins with functions distinct from those of their canonical counterparts [26]. A non-coding regulatory role is another distinct possibility.
Our data suggest there are at least a handful of known noncoding RNAs with circular isoforms. A circular isoform of the noncoding RNA ANRIL has been reported to correlate with INK4/ARF expression, as well as atherosclerosis risk, suggesting the possibility that circular RNAs might have some role in human disease. An antisense transcript of the CDR1 gene, relatively abundantly expressed in mouse brain, was recently shown to be circular, and to positively regulate the corresponding sense transcript [27]. A role in regulating the pool of RNA binding proteins or small RNAs capable of interacting with the conventional linear RNA counterpart is another possibility [28].
We cannot yet rule out the possibility that circular RNAs are incidental to regulated splicing of a conventional linear RNA and perhaps accumulate due to their relative resistance to degradation. However, the high abundance of many circular RNA isoforms relative to their linear counterparts, and the lack of evidence for the predicted alternatively spliced linear RNA co-product, suggest that circular RNA isoforms are not simply accidental byproducts of splicing. Further investigations of the origins and activities of circular isoforms of mRNAs are likely to lead us in surprising directions.