In the past decade, genomic research has focused on protein-coding genes. However, recent studies have revealed a myriad of non-coding transcripts in different organisms [1]. While protein-coding transcripts account for less than 2% of the human genome, more than 70% of the genome is transcribed [3]. The unexpected level of transcriptome complexity has led to the suggestion that non-coding RNAs (ncRNAs) comprise a previously hidden layer of genomic programming and that ncRNA-directed regulatory circuits underpin complex genetic phenomena in eukaryotes [4]. A large portion of reported ncRNA transcription occurs in the proximity of protein-coding genes, particularly at promoters but also at the ends of coding regions and intragenically. In this review, we focus on the generation and functional consequences of non-coding transcription initiating from bidirectional promoters.
Divergent, bidirectional transcription of protein-coding genes has been recognized for many years [5], and the more recent sequencing and annotation of whole genomes has revealed that bidirectional organization of coding genes is common [6]. Further technological advancements over the last few years, however, have revealed an abundance of previously unobserved non-coding transcripts near the promoters of protein-coding genes in different organisms, including bacteria [8], yeast [12], fruit fly [16], mouse [17], human [19] and plants [20].
To understand the diversity of ncRNAs found near promoters, one must consider the technologies used to detect them, because each technology biases our account of ncRNAs. The study of transcripts at promoters in turn provides information about the frequency of bidirectional transcription, the control of directionality in bidirectional promoters, and the importance of transcriptional regulation mediated by promoter-associated transcription. Here we comment on the main features of promoter-associated transcripts and technologies for detecting them (detailed further in Table 1 and Box 1).
Table 1. Examples of non-coding transcripts associated with protein-coding genes detected in different studies. Note that different technologies measure different aspects of transcription; thus, the transcripts listed here may include some redundancy.
Box 1. Different technologies used to study bidirectional transcription
Enabled by the development of new genomic technologies, an abundance of new transcripts has been described (reviewed in [1] and [2]). These technologies interrogate gene expression at multiple steps, from the initiation of transcription to the final transcripts produced.
- Detection of transcriptional activity. Nuclear run-ons measure the density of transcribing RNA polymerases over a genomic region. This is done by allowing transcriptionally engaged polymerases to resume elongation in the presence of labeled nucleotides while preventing new transcription initiation events from occurring. The labeled RNAs produced during this run-on reaction are thus indicative of the elongation activity of the polymerases. The genome-wide implementation of nuclear run-on, global run-on sequencing (GRO-seq), yields a snapshot of the occupancy of active, engaged polymerases in a strand-specific manner, independent of transcript length or stability.
- Detection of nascent RNAs. Immunoprecipitation of RNAs bound to RNA polymerase II followed by sequencing (NET-seq) allows strand-specific measurements of engaged RNA polymerase II. This provides an RNA polymerase II occupancy profile, independent of the polymerase’s ability to elongate. Thus, nascent transcripts attached to paused or backtracked polymerases are also measured.
- Whole transcriptome profiling. Analyzing total RNA, polyA-enriched RNAs or rRNA-depleted RNA by strand-specific tiling arrays or RNA-seq is the most widely used method to study transcriptomes [3, 13, 15].
- Detection of short RNAs. Purification of short (18 to 200 nt) RNAs enables their profiling independently of full-length RNAs. This subpopulation of transcripts has varying origins, from short products of transcription to fragments produced post-transcriptionally from longer molecules. This approach has been carried out using both tiling arrays and sequencing [18, 22, 23].
- TSS sequencing. Sequencing of the 5’ terminal part of capped mRNA transcripts has been used to detect active promoters [29, 30]. The sequencing of capped transcripts permits genome-wide mapping of TSSs and thus identification of bidirectional promoters. This can be combined with the identification of transcription termination sites [97–99] to define both transcript boundaries.
- Profiling of RNA degradation mutants. Knocking down components of the RNA degradation machinery has been applied to enrich samples for short-lived RNA molecules [13, 14, 24]. These RNAs could, in some cases, be precursors of the short RNA molecules detected with other techniques.
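As a rough illustration of how mapped TSSs can be used to flag candidate bidirectional promoters, the following minimal sketch pairs a minus-strand TSS with a plus-strand TSS lying a short distance downstream of it, so that the two transcripts point away from each other. The `TSS` record, the `divergent_pairs` function and the 1,000 bp `max_gap` threshold are assumptions introduced here for illustration; they are not taken from the studies cited above.

```python
from typing import List, NamedTuple, Tuple


class TSS(NamedTuple):
    """Hypothetical record for one mapped transcription start site."""
    chrom: str
    pos: int      # genomic coordinate of the TSS
    strand: str   # "+" or "-"


def divergent_pairs(tss_list: List[TSS], max_gap: int = 1000) -> List[Tuple[TSS, TSS]]:
    """Pair each minus-strand TSS with every plus-strand TSS on the same
    chromosome that lies at a higher coordinate, at most max_gap bp away.
    In such a pair the two transcripts run in opposite directions, away
    from a shared region between the two start sites."""
    minus = [t for t in tss_list if t.strand == "-"]
    plus = [t for t in tss_list if t.strand == "+"]
    pairs = []
    for m in minus:
        for p in plus:
            if p.chrom == m.chrom and 0 < p.pos - m.pos <= max_gap:
                pairs.append((m, p))
    return pairs


# Made-up example coordinates.
example = [
    TSS("chr1", 1000, "-"),
    TSS("chr1", 1200, "+"),   # divergent partner of the TSS at 1000
    TSS("chr1", 900, "+"),    # upstream of the minus-strand TSS: not divergent
    TSS("chr1", 5000, "+"),   # too far away
    TSS("chr2", 1100, "+"),   # different chromosome
]
pairs = divergent_pairs(example)
```

The nested loop keeps the sketch short; for genome-scale data one would sort the start sites by coordinate and scan them once instead.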
Different technologies demonstrate the bidirectionality of promoters
The extent of ncRNA transcription has been investigated using different approaches, including measurements of steady-state RNA levels [13], RNA polymerase activity [25], and nascent RNAs associated with RNA polymerase II [26]. These studies have resulted in a catalog of non-coding transcripts near protein-coding genes and have revealed both protein/non-coding and non-coding/non-coding bidirectional transcription.
Examples of pervasive bidirectional transcription
A large body of evidence for promoter bidirectionality derives from studies that have used expression profiling of steady-state RNA levels, targeting either all transcripts or specific subpopulations of different lengths and stabilities. A widely used approach in higher eukaryotes is to specifically profile short RNAs that are otherwise difficult to distinguish from overlapping long transcripts. This has revealed short transcripts of varying sizes located on both strands near promoters [17]. Transcripts also differ in their stability. As transcript abundance in a cell is dependent on the rates of both synthesis and decay [27], some transcripts are not detected by RNA profiling of wild-type cells due to their short lifetimes. Manipulation of the RNA degradation pathways has been used to uncover these so-called cryptic unstable transcripts (CUTs) that usually initiate from shared promoters [13]. In addition to profiling transcripts with specific size and stability, bidirectional transcription has been studied using protocols that capture 5’ termini of transcripts and reveal transcription start sites (TSSs) [29].
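The dependence of transcript abundance on both synthesis and decay can be made explicit with a simple first-order model. This model is an assumption adopted here for illustration, and the symbols $R$ (transcript abundance), $k_s$ (synthesis rate) and $k_d$ (decay rate constant) do not appear in the studies cited above:

```latex
\frac{dR}{dt} = k_s - k_d R
\quad\Longrightarrow\quad
R_{\mathrm{ss}} = \frac{k_s}{k_d}
```

Under this assumption, a transcript synthesized at the same rate as a stable mRNA but degraded 20 times faster reaches a steady-state level 20 times lower, which may fall below the detection limit in wild-type cells. Impairing degradation lowers $k_d$ and raises $R_{\mathrm{ss}}$ correspondingly, which is consistent with unstable transcripts becoming detectable in RNA degradation mutants.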
Evidence for pervasive bidirectional initiation from promoters also derives from studies that directly measure transcription. These studies have been performed either by detecting all nascent transcripts (that is, those attached to RNA polymerase) or by targeting only transcripts that are being elongated. Global run-on sequencing (GRO-seq), a method that measures actively elongating RNAs (Box 1), has revealed RNA polymerase-mediated transcription in two directions at 55% of human promoters, with only a small bias in orientation [25]. Promoter bidirectionality has also been shown in yeast by sequencing nascent transcripts that co-purify with RNA polymerase II (NET-seq, Box 1).
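To make the notions of "transcription in two directions" and "bias in orientation" concrete, the sketch below summarizes strand-specific read counts around promoters. It is only an illustration: the function names, the `min_reads` cutoff of 10 and the example counts are assumptions made here, not the criteria or data of the GRO-seq or NET-seq studies cited above.

```python
from typing import Dict, Tuple


def directionality_index(sense: int, antisense: int) -> float:
    """Return (sense - antisense) / (sense + antisense).
    +1 means only sense reads, -1 only antisense reads, 0 a perfect balance."""
    total = sense + antisense
    if total == 0:
        raise ValueError("no reads at this promoter")
    return (sense - antisense) / total


def fraction_bidirectional(counts: Dict[str, Tuple[int, int]], min_reads: int = 10) -> float:
    """Fraction of promoters with at least min_reads on both strands."""
    if not counts:
        return 0.0
    both = sum(1 for s, a in counts.values() if s >= min_reads and a >= min_reads)
    return both / len(counts)


# Hypothetical (sense, antisense) read counts per promoter.
counts = {
    "geneA": (120, 95),
    "geneB": (80, 2),
    "geneC": (40, 40),
    "geneD": (15, 0),
}
frac = fraction_bidirectional(counts)
indices = {gene: directionality_index(s, a) for gene, (s, a) in counts.items()}
```

Both summaries depend on the window around the TSS in which reads are counted and on the read-count cutoff, so values obtained with different assays or thresholds are not directly comparable.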
A systematic application of these diverse technologies will provide the most comprehensive information necessary to characterize non-coding transcription at promoters in terms of transcript type and number. Clearly, some of the non-coding transcripts detected reflect distinct biological entities (e.g. transcripts with characteristic biogenesis and degradation mechanisms, such as CUTs [13] and PROMPTs [24]). Other transcripts may not represent independent transcription events but rather post-transcriptional processing products of longer ncRNAs generated from the same genomic locus [22]. Furthermore, some of the observed diversity is likely due to differences in detection techniques. Therefore, the capacity of each technology should be taken into account when comparing results. For example, NET-seq cannot identify nascent transcripts close to the TSS as they are too short to map uniquely to the genome, while in GRO-seq these transcripts are elongated during the run-on reaction, allowing them to be mapped. In another example, when profiling steady-state transcript abundance for specific sizes, one should note that the different transcript lengths detected could be arbitrary subpopulations from a continuum of sizes. Nevertheless, the diversity of the applied approaches shows that transcriptional initiation at most, if not all, promoters occurs in a bidirectional manner, even if the final transcription products are mainly observed in one orientation.
By studying different steps of the transcription process, and measuring specific subpopulations of transcripts, one can obtain complementary insights into the activity and regulation of bidirectional transcription. Thus, recent genomic developments substantiate the pervasiveness and illustrate the complexity of bidirectional transcription. These observations, coupled with the anticipated functional implications of ncRNAs for genome regulation, underscore the need for further investigation of transcriptional initiation at promoters.