Here we used lysine–lysine crosslinking, mass spectrometry and modeling based on crystallographic structures to unravel the domain architecture of Pol I. The data allowed for an extension of the previous model of the Pol I core, confirmed the location of the A14/43 subcomplex on the core, and positioned the remaining subunit A12.2 and the two domains of the peripheral subcomplex A49/34.5 on the core. From these data, a view emerges that Pol I is evolutionarily related to a partial Pol II–TFIIS–TFIIF–TFIIE complex. The relationship extends to Pol III, which contains the A12.2-related subunit C11 that is required for RNA cleavage (28), a heterodimeric subcomplex, C37/53, which is related to TFIIF and A49/34.5 (29–31), and an additional subcomplex, C82/34/31, which is related to TFIIE (32, 33). Below we discuss the three key findings from this work.
First, the C-ribbon of A12.2 can reside in the Pol I pore and corresponds to the C-ribbon of TFIIS, although it is also homologous to the C-ribbon of Rpb9. The A12.2 C-ribbon contains the charged residues R102, D105 and E106 at its tip. These correspond to TFIIS residues R287, D290 and E291, which complement the Pol II active site and induce strong RNA cleavage (9). These results are consistent with recent mutagenesis data indicating that the corresponding Pol III subunit C11 also enters the pore with its C-ribbon to induce strong RNA cleavage (6). The results thus explain the role of A12.2 and C11 in transcript cleavage (28, 34), suggest a close evolutionary relationship between Pol I and Pol III, and confirm that Pol I and Pol III differ from Pol II in their mode of RNA cleavage (6).
Second, the A49/34.5 dimerization module is located on the lobe of Pol I, and the linker of subunit A49 reaches into the central cleft of the enzyme. These findings are consistent with the localization of the corresponding TFIIF dimerization module on the lobe of Pol II and the TFIIF linker in the cleft (10, 11). Thus, the dimerization modules of A49/34.5 and TFIIF are similar not only in structure but also in their locations on the cores of Pol I and Pol II, respectively. Likewise, the C37/53 dimerization module binds the Pol III lobe, as shown by cryo-EM (25, 35) and photo-crosslinking (31). The conserved location of the dimerization modules in all three polymerases is consistent with a similar function of the TFIIF-like subcomplex in transcription (10, 29). The observed stimulatory effect of the dimerization module on RNA cleavage (3, 4) can now be explained as an indirect effect resulting from its proximity to subunit A12.2, which is essential for cleavage (3) and likely tends to dissociate when A49/34.5 is lacking. This model is supported by the observation that deletion of C37 from Pol III leads to a loss of C11 (30).
Finally, the data indicate that the mobile A49 C-terminal tWH domain can reside above the cleft. This position is similar to that of C34 in the Pol III system, as revealed by cryo-EM (25, 35). Bioinformatic analysis (36) and homology modeling (4) suggested an evolutionary relationship of C34 and the A49 tWH domain to the β subunit of TFIIE. Since TFIIE crosslinks to the clamp of Pol II (26), the A49 tWH domain, C34 and TFIIE can all adopt similar locations on their respective polymerase cores. Consistent with this, the A49 tWH domain binds single-stranded DNA and may have a role in promoter binding and/or opening (4). Since the Pol III subunit C82 also binds single-stranded DNA (33) and the archaeal TFIIE homolog TFE stabilizes an open promoter complex (37, 38), we suggest that the distantly related A49 tWH domain, the Pol III subcomplex C82/34/31 and TFIIE share an ancient function in binding the melted DNA region above the active center cleft in an open promoter complex during initiation. Prior loading of the DNA into the cleft may be enabled by the observed mobility of these proteins.
Comparison of the crosslinking data presented here with previous EM data on Pol I strongly suggests that the A49/34.5 subcomplex, like its counterpart TFIIF, maintains a considerable degree of mobility on the polymerase surface. A cryo-EM reconstruction at 12 Å resolution showed densities spanning from the funnel of Pol I to the AC19/40 heterodimer, consistent with some crosslinks described here for the A49 linker and the A34.5 tail, respectively (C), but did not reveal densities for the dimerization module on the lobe (3). An early EM investigation at lower resolution provided evidence for A49 and A34.5 over the cleft (39), although at that time an assignment was not possible. These observations can be reconciled by the mobility of A49/34.5. The two structured domains of this subcomplex are mobile but have preferred locations on the Pol I surface in solution, which are detected by crosslinking and by EM at low resolution, but not by EM at high resolution, where mobile surface structures generally get blurred or disappear entirely. Taken together, the present study provides the complete structural architecture of Pol I at the level of protein domains, explains the function of surface domains, and further elucidates the evolutionary relationships between the three eukaryotic RNA polymerases.