In prokaryotes, a single DNA-dependent RNA polymerase (RNAP) synthesizes all classes of RNA, including mRNAs, rRNAs, and tRNAs. Prokaryotic core RNAPs are composed of five subunits (α₂, β′, β, and ω), which associate with a σ factor to form the holoenzyme required for promoter recognition (26). In eukaryotes, RNAP I synthesizes the rRNAs and RNAP II synthesizes the mRNAs and the small nuclear RNAs, while RNAP III is responsible for the synthesis of the tRNAs and the 5S rRNA.
RNAPs I, II, and III contain 14, 12, and 17 subunits, respectively. These three enzymes are functionally and structurally related; five subunits are common to all three enzymes, while another four are related (32, 198). The two largest subunits of prokaryotic (bacterial RNAP β′ and β) and eukaryotic RNAPs (RNAP I Rpa190 and Rpa135, RNAP II Rpb1 and Rpb2, and RNAP III Rpc160 and Rpc128) share a high degree of sequence similarity and form the catalytic center of the enzyme. The multisubunit archaeal RNAP is composed of 13 polypeptides, the three largest subunits being the homologues of the two largest subunits of the eukaryotic and prokaryotic RNAPs (101). Six other archaeal subunits show sequence similarity with bacterial and eukaryotic RNAP polypeptides.
The resolution of the crystallographic structures of multisubunit RNAP has provided a framework for elucidating transcriptional mechanisms. To date, high-resolution structures are available for the bacterial RNAP and the eukaryotic RNAP II alone (7, 8, 28, 44, 116) as well as in complexes with nucleic acids (57, 87, 118, 194, 195) or regulatory factors (11, 29, 86, 119, 185) (see Table and Fig. for models).
Both the prokaryotic and eukaryotic transcription reactions involve a number of steps, starting with promoter recognition. The bacterial σ factor (
60) and the eukaryotic general transcription factors TFIIA, TFIIB, TFIID, TFIIE, TFIIF, and TFIIH (
42,
62) are required for specific binding of the polymerase to promoters. In both systems, promoter binding is accompanied by bending and wrapping of the promoter DNA against the body of the polymerase (
55,
100,
142-144). DNA wrapping is involved in promoter melting in the region of the transcriptional initiation site, probably through the induction of a torsional strain that generates unwinding of the DNA double helix (
50,
54). On the basis of recent site-specific protein-DNA photo-cross-linking experiments (
54), it was proposed that formation of a right-handed loop in the promoter DNA wrapped around a mobile portion of the enzyme, named the clamp domain, induces DNA unwinding, which can then be stabilized by some general transcription factors. The formation of this open complex allows pairing of incoming ribonucleoside triphosphates (NTPs) to the template DNA strand for phosphodiester bond formation (
43,
69,
70).
It has been proposed that all polymerases catalyze phosphodiester bond formation by a two metal ion mechanism (
166) (Fig. ). By this mechanism, a first Mg²⁺ ion, termed metal A, facilitates the nucleophilic attack of the 3′ oxygen on the 5′ α-phosphate. The second Mg²⁺ ion, metal B, facilitates the release of the pyrophosphate. In prokaryotic RNAP and eukaryotic RNAP II, metal A is coordinated by three strictly conserved aspartates of β′/Rpb1 contained in the NADFDGD motif (
44,
57,
206). Metal B has a low apparent affinity for free RNAP II (
44) and appears to enter the active site with the incoming NTP and is coordinated by three aspartates, two from β′/Rpb1 and one from β/Rpb2, located in a conserved ED motif (
195). Formation of a phosphodiester bond is followed by translocation of the nucleic acids in order to present the next template register for a second nucleotide addition cycle.
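As an illustration of how the conserved metal A-binding motif described above could be located in a subunit sequence, the following minimal sketch scans a protein sequence for NADFDGD and reports the positions of its three aspartates. The function name, the return format, and the toy sequence are assumptions made for this example; the toy sequence is invented and is not a real β′ or Rpb1 sequence.

```python
import re

# Motif named in the text; its three aspartates (D) are the residues
# described as coordinating metal A.
METAL_A_MOTIF = "NADFDGD"


def find_metal_a_motif(protein_seq: str) -> list:
    """Return 1-based positions of each motif occurrence and of its aspartates.

    Hypothetical helper for illustration only.
    """
    seq = protein_seq.upper()
    hits = []
    # A lookahead pattern is used so that overlapping occurrences are not skipped.
    for match in re.finditer(f"(?={METAL_A_MOTIF})", seq):
        start = match.start()  # 0-based index of the motif's first residue
        aspartates = [
            start + offset + 1  # convert to 1-based residue numbering
            for offset, residue in enumerate(METAL_A_MOTIF)
            if residue == "D"
        ]
        hits.append({"motif_start": start + 1, "aspartates": aspartates})
    return hits


# Invented toy sequence (assumption), used only to exercise the function.
toy_sequence = "MKTLYNADFDGDEMAV"
print(find_metal_a_motif(toy_sequence))
```

In practice such a scan would be run on aligned subunit sequences, so that the reported residue numbers can be compared with the structural numbering used for the reference enzymes.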
Transcription initiation is characterized by a cycle of abortive initiation events, where RNAP synthesizes and releases small transcripts without disengaging from the DNA template (
188). When the transcript reaches a length of approximately 10 to 12 nucleotides, stabilization of an early elongation complex occurs and RNAP breaks its contacts with the general transcription factors/σ and clears the promoter for elongation (
43,
69).
In the elongation phase, RNAPs require the help of protein factors to bypass impediments to elongation. Replacement of the general transcription factors by various elongation factors has been demonstrated for RNAP II (
134) and numerous elongation factors have been identified to date (
159). These factors affect transcription in various ways from rescue of elongation complexes stalled at pause and arrest sites to phosphorylation of the C-terminal domain of the Rpb1 subunit of RNAP II and chromatin remodeling and modification.
Because mRNA processing occurs cotranscriptionally in the RNAP II system (
68), the complexity of the network of factors interacting with the elongating enzyme is much greater than in the RNAP I and III systems. Recruitment of both mRNA processing and elongation factors is mainly directed through specific interactions with the phosphorylated RNAP II C-terminal domain. Nonetheless, evidence supports the existence of elongation factors during transcription by RNAP I (
95) and RNAP III (
113), which lack a C-terminal domain. In prokaryotes, the number of characterized elongation factors is smaller, the best-characterized being GreA, NusA, and Mfd (
23).
The termination process differs between prokaryotic RNAPs and eukaryotic RNAPs I, II, and III, and termination in prokaryotes is the better understood. Two major mechanisms have been proposed. The first is independent of termination factors and instead requires formation of a stem-loop structure in the transcribed RNA that interacts with the RNAP to induce pausing and which, combined with a weak RNA-DNA hybrid, stimulates transcript release from the template (
140). The second mechanism involves the activity of the rho helicase to facilitate transcript release and is therefore referred to as rho-dependent termination (
140).
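The factor-independent mechanism described above (a stem-loop in the nascent RNA followed by a weak RNA-DNA hybrid) can be sketched as a simple sequence scan. The code below is a rough illustration, not an established terminator-prediction method: it treats a U-rich tract immediately downstream of the hairpin as a stand-in for the weak hybrid, and the thresholds (`min_stem`, `loop_sizes`, `tract_len`, `min_u`) and function names are assumptions chosen for the example.

```python
# Watson-Crick pairs plus the G-U wobble pair.
PAIRS = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A"), ("G", "U"), ("U", "G")}


def stem_length(rna, left_end, right_start):
    """Count consecutive base pairs growing outward from a loop.

    left_end is the index of the 5'-arm base adjacent to the loop and
    right_start is the index of the 3'-arm base adjacent to the loop.
    """
    pairs = 0
    i, j = left_end, right_start
    while i >= 0 and j < len(rna) and (rna[i], rna[j]) in PAIRS:
        pairs += 1
        i -= 1
        j += 1
    return pairs


def find_intrinsic_terminators(rna, min_stem=5, loop_sizes=range(3, 9),
                               tract_len=8, min_u=5):
    """Return (hairpin_start, tract_end, stem_bp, loop_nt) candidates.

    Indices are 0-based and tract_end is exclusive. All thresholds are
    illustrative assumptions.
    """
    rna = rna.upper().replace("T", "U")
    hits = []
    for loop_start in range(1, len(rna)):
        for loop in loop_sizes:
            right_start = loop_start + loop
            if right_start >= len(rna):
                break  # loop sizes increase, so larger loops cannot fit either
            stem = stem_length(rna, loop_start - 1, right_start)
            if stem < min_stem:
                continue
            tract_start = right_start + stem
            tract = rna[tract_start:tract_start + tract_len]
            # Require a full-length tract that is rich in U residues.
            if len(tract) == tract_len and tract.count("U") >= min_u:
                hits.append((loop_start - stem, tract_start + tract_len, stem, loop))
    return hits


# Invented test sequence (assumption): hairpin GCGGCC-UUCG-GGCCGC, then 8 U.
example = "CC" + "GCGGCC" + "UUCG" + "GGCCGC" + "UUUUUUUU" + "AA"
print(find_intrinsic_terminators(example))
```

Because wobble pairs and alternative loop sizes are allowed, a single hairpin can be reported more than once with slightly different stem and loop boundaries; a real analysis would rank candidates, for example by predicted folding energy.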
Eukaryotic RNAP I termination, although poorly understood, is for some genes dependent on a short repeated sequence element (93) that is recognized by a nuclear termination factor called Reb1 in Saccharomyces cerevisiae and TTF1 in the mouse (162). Termination depends on the interaction of this protein with RNAP I and could involve some protein-induced changes in DNA structure. Other types of signals have been identified (85, 98, 187), but to date no specific termination element has been unequivocally identified.
RNAP II termination is more complex because it is coupled to 3′-end processing of the transcript (
39). It requires association of the 3′-end processing complexes CPSF and CstF (
13,
47) with RNAP II via the phosphorylated C-terminal domain of Rpb1 as well as the polyadenylation signals at the 3′ end of the pre-mRNA (
41,
102,
109).
Many cleavage and polyadenylation factors have been identified and studies of thermosensitive alleles of cleavage factors show that some of them are required for RNAP II termination (
21). RNAP III termination most resembles bacterial RNAP termination and involves recognition of short termination signals rich in T stretches (
136). However, in contrast to prokaryotes, RNAP III termination does not involve the formation of stem-loop structures and no auxiliary factor has been identified to date.
Crystallographic studies have revealed that the structure of RNAP is highly conserved from prokaryotes to eukaryotes, with the regions of highest homology forming the enzyme's active center (
44,
206). The two largest RNAP subunits (β′ and β in bacterial RNAPs; Rpb1 and Rpb2 in RNAP II) form a positively charged cleft (also termed the main channel) that accommodates the nucleic acids during transcription (
44,
206) (Fig. ). The two catalytic Mg²⁺ ions, metals A and B, are buried deep in this cleft (Fig. ) (
195). A domain of Rpb2/β called the wall closes the upstream extremity of this cleft and is a binding site for the upstream end of the RNA-DNA hybrid. The wall domain contains the flap feature, ordered in the prokaryotic structures and disordered in the eukaryotic structures, which serves as a binding site for transcription factors and would be implicated in the obstruction of the RNA exit channel by the σ factor region 4 (
123).
The nucleic acids are held in the cleft by a number of protein domains, including the upper and lower jaws, which grab the downstream DNA, and by the clamp domain, which locks the nucleic acids in the cleft (
57). Comparison of the structures of the free core enzyme (
44) with that of the elongating core enzyme (
57) from yeast led to the hypothesis that the mobile clamp domain folds over DNA to lock it during elongation. However, the backbone models (
7,
28) and refined atomic model (
8) of complete RNAP containing Rpb4 and Rpb7 show the clamp domain in the same position as in the elongating core structure, suggesting that promoter DNA is first loaded on the top of the clamp as seen for bacterial RNAP (
7,
28,
118). The template strand would then reach the active site only after promoter melting had occurred. A pore structure (also termed secondary channel), which has the shape of an inverted funnel that opens in the cleft near the active site, was proposed as an entry route for the incoming NTP and an exit for the released pyrophosphate (
44,
57). The mRNA exit channel lies between the wall and the clamp domain (Fig. ).
In addition to their polymerization activity, DNA-dependent RNAPs can also catalyze the 3′ endonucleolytic cleavage of transcripts under certain circumstances (
40,
158). First discovered in Escherichia coli (
92,
172), this cleavage activity takes place when RNAP has backtracked at pause and arrest sites (
53,
196). Weakly associated DNA-RNA hybrids can induce backtracking of the enzyme to a more stable register (
125). To resume elongation from this more stable position, RNAP needs to cleave the 3′ end of the transcript that now extrudes from the catalytic site through the pore.
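The relationship between backtracking, transcript cleavage, and resumption of elongation described above can be expressed as a small bookkeeping model. The sketch below is purely illustrative: the class and method names are hypothetical, the transcript is invented, and the model tracks only how many 3′ nucleotides have been extruded past the active site, not the energetics of the hybrid.

```python
from dataclasses import dataclass


@dataclass
class ElongationComplex:
    """Toy model of a transcribing RNAP (hypothetical, for illustration)."""

    transcript: str        # nascent RNA, written 5' to 3'
    backtracked: int = 0   # number of 3' nucleotides extruded past the active site

    def backtrack(self, n: int) -> None:
        """Slide the enzyme upstream by n registers without shortening the RNA."""
        if n < 0 or self.backtracked + n >= len(self.transcript):
            raise ValueError("cannot backtrack beyond the transcript")
        self.backtracked += n

    def can_extend(self) -> bool:
        """Extension requires the RNA 3' end to sit in the active site."""
        return self.backtracked == 0

    def cleave(self) -> str:
        """Remove the extruded 3' fragment and return it."""
        if self.backtracked == 0:
            return ""
        fragment = self.transcript[-self.backtracked:]
        self.transcript = self.transcript[:-self.backtracked]
        self.backtracked = 0
        return fragment

    def extend(self, nucleotide: str) -> None:
        """Add one nucleotide to the 3' end if the complex is not backtracked."""
        if not self.can_extend():
            raise RuntimeError("backtracked complex must cleave before extending")
        self.transcript += nucleotide


# Invented transcript (assumption) to walk through one backtrack/cleave cycle.
ec = ElongationComplex("AUGGCUUUU")
ec.backtrack(3)          # three 3' nucleotides now extrude from the catalytic site
released = ec.cleave()   # cleavage creates a new 3' end in the active site
ec.extend("A")           # elongation can resume
print(released, ec.transcript)
```

The model makes the logic of the text explicit: a backtracked complex cannot add nucleotides until the extruded 3′ segment has been removed, after which synthesis restarts from the more stable register.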
The 3′ cleavage activity is enhanced by the cleavage factors GreA and GreB in bacteria (
24,
25) and TFIIS in eukaryotes (
53,
196). The backbone structure of TFIIS in a complex with complete RNAP II (
86) provided insight into the mechanism by which this factor enhances transcript cleavage. More specifically, it revealed that two conserved essential acidic residues of TFIIS (
79) are located in the vicinity of the polymerase metal B and could participate in its coordination. It was therefore suggested that the two metal ion mechanism invoked for NTP polymerization is also involved in the 3′ endonucleolytic activity of RNAP II. Studies on GreB, the prokaryotic functional counterpart of TFIIS, have reached similar conclusions for bacterial RNAP (
128,
164).
Structures of yeast elongating core RNAP have revealed the presence of an 8- to 9-base-pair DNA-RNA hybrid in the cleft (
57,
194). The crystallographic data have also shown that a number of loops and helices of Rpb1/β′ and Rpb2/β are located in this region close to the catalytic site (Fig. ). These structural motifs have been named according to their location, appearance, or presumed role in the transcription reaction. For example, the Rpb1/β′ bridge helix separates the main channel and is located near the template DNA at the +1 site.
Because crystallographic data revealed two different conformations for this structure, bent (or distorted) (
206) or straight (or continuous) (
44,
57,
194), it was proposed that the bridge helix might be involved in translocation of the nucleic acids during transcription. The location of the rudder, the lid and the fork loop 1 suggests that these loops are involved in DNA-RNA strand separation, in order to maintain an 8- to 9-base-pair hybrid (
194). The fork loop 2 and the zipper could be involved in delineating the downstream and upstream boundaries of the transcription bubble respectively (
44,
57). Five loops of Rpb1 and Rpb2 have been termed the switches and could participate in controlling the position of the clamp or, for switches 1 to 3, in forming a binding site for the DNA-RNA hybrid (
7,
44,
57).
Crystallographic studies of yeast core RNAP in complex with the general transcription factor TFIIB have shown contacts of this factor with the dock domain (
29) of the enzyme. This domain of Rpb1 is located between the wall and the clamp and lies on the surface of the structure (Fig. ). The finger domain of TFIIB enters deep into the active site after passing across the saddle, between the wall and clamp domains. This feature is reminiscent of the 3.2 linker loop of the σ factor in prokaryotic RNAP, which follows a similar path through the RNAP flap and clamp domains (
185).
Superposition of the crystallographic structures of the core RNAP/TFIIB complex and the core elongation complex (
57) shows that the RNA would clash with the TFIIB finger domain in the active site beyond synthesis of the fifth residue and that TFIIB, if it were not displaced, would compete with RNA for binding to the saddle after synthesis of the 10th nucleotide. These observations provide a molecular basis for abortive initiation before synthesis of the 10th nucleotide and explain why TFIIB is released from the complex beyond synthesis of this register.
Crystallographic structures of RNAPs in complexes with known catalytic inhibitors have also been resolved and have brought new insight into the catalytic mechanisms of RNAPs. Observation of these inhibitors at their binding sites allows a better understanding of their mode of action and the role of the structural features they interact with. The structures of cocrystals of α-amanitin (
27), rifampin (
30), sorangicin (
31), and streptolydigin (
176,
182) with core RNAP have all been resolved. α-Amanitin, an inhibitor of the translocation step, is seen binding to a region located between the funnel and the bridge helix (
27) of yeast RNAP II (Fig. ).
Many observations support the hypothesis that restriction in the movement of the bridge helix is required for translocation of nucleic acids in the active site (
61,
184). In prokaryotes, rifampin and sorangicin both bind in a pocket juxtaposed to fork loop 2 (
30,
31). Modeling of an RNA-DNA hybrid in a bacterial RNAP structure (
90) combined with positioning of rifampin at its binding site reveals that a steric clash would occur after synthesis of the second or third nucleotide (
30), which is in good agreement with observations that rifampin and sorangicin specifically inhibit RNAP during initiation at a step before addition of the second or third nucleotide (
90,
115).
An additional hypothesis about the mode of action of rifampin has been suggested and involves an allosteric signal transmitted through the binding site of rifampin to the catalytic aspartates in order to lower the affinity for the Mg²⁺ ion (
12). The location of streptolydigin in the RNAP structure and its binding to the fork 2, bridge helix, and trigger loop features indicate that this drug could interfere with the translocation of nucleic acids in the active site (
182). However, an alternative hypothesis suggests an allosteric mechanism trapping RNAP in a conformation inappropriate for transcript elongation (
176).
Recent reports have shed light on the evolutionary connection between prokaryotic and eukaryotic RNAPs (
75,
76). The presence of the catalytic double-psi β-barrel domain, which contains a signature metal-coordinating motif, in both eukaryotic RNA-dependent RNA polymerases and the universally conserved β′ subunit of DNA-dependent RNA polymerases, coupled to the absence of other common domains, suggests that they have evolved through the early divergence of a common ancestor. Interestingly, the presence of another, although highly diverged, double-psi β-barrel domain in the β subunit of DNA-dependent RNA polymerases supports this idea of a common ancestor that diverged through lineage-specific insertions of domains and motifs.
During the past two decades many mutational studies of RNAP have been performed with the aim of understanding the function and regulation of this enzyme. A great number of these studies were performed before the resolution of RNAP crystal structures. We surmised that reexamining the effect of amino acid alterations in RNAP in light of the available three-dimensional structures might provide new insight into the structure-activity relationships of these enzymes.
We therefore created an extensive catalogue of published mutations affecting the function of multisubunit RNAP and mapped them on the structures of the prokaryotic and eukaryotic enzymes. This catalogue is presented in the form of a table (see Table S2 in the supplemental material for mutations discussed in this article and additional mutations; the complete catalogue is also available in a Web format at http://www.ircm.qc.ca/microsites/mutationsaffecting/fr/index.html) indicating (i) the reference in which the mutation is described, (ii) the name of the mutant allele when available, (iii) the organism and subunit, (iv) the position of the altered amino acids on the linear amino acid sequence of the altered subunit, (v) the equivalent position on either S. cerevisiae RNAP II or Thermus aquaticus RNAP (the amino acid numbering in the text and figures refers to these specific positions), (vi) the name of the structural domain where the altered amino acid is located, (vii) the homology region, and (viii) the function affected by the altered amino acids or phenotype for 6-azauracil sensitivity or drug resistance (blanks stand for mutants with phenotypes for which no affected function could be attributed). This work, as presented in this review, has allowed us to make a series of observations concerning the functions of domains and structural elements of RNAP.
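For readers who wish to handle such a catalogue programmatically, the eight columns listed above map naturally onto a simple record type. The sketch below is one possible representation and not the format of the published table: all field names are assumptions, and the two example records contain invented placeholder values rather than entries from the catalogue.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MutationRecord:
    """One catalogue row; field names are hypothetical labels for columns (i)-(viii)."""

    reference: str                    # (i) reference describing the mutation
    allele: Optional[str]             # (ii) mutant allele name, when available
    organism: str                     # (iii) organism ...
    subunit: str                      # (iii) ... and subunit
    position: int                     # (iv) position in the altered subunit
    equivalent_position: int          # (v) position in the reference enzyme
    structural_domain: str            # (vi) structural domain
    homology_region: str              # (vii) homology region
    affected_function: Optional[str]  # (viii) function or phenotype; None if blank


def group_by_domain(records):
    """Group records by structural domain to compare mutations within one element."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record.structural_domain].append(record)
    return dict(grouped)


# Placeholder records with invented values (assumption), not real catalogue data.
records = [
    MutationRecord("ref-A", None, "organism-X", "subunit-1", 100, 120,
                   "bridge helix", "region-1", "elongation"),
    MutationRecord("ref-B", "allele-1", "organism-Y", "subunit-1", 205, 230,
                   "bridge helix", "region-1", None),
    MutationRecord("ref-C", None, "organism-X", "subunit-2", 50, 61,
                   "clamp", "region-2", "initiation"),
]
for domain, rows in group_by_domain(records).items():
    print(domain, len(rows))
```

Grouping by structural domain mirrors the approach taken in this review, in which mutations are interpreted according to the structural element in which they fall.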