Knowledge of how intact human RPA engages ssDNA substrates and transitions between its various functional states has remained inaccessible to conventional structural techniques. Such dynamic modular proteins are challenging to analyse but can be characterized by small-angle scattering and computational methods (40–42). The combination of these methods enables interpretation of the low-resolution SAXS data in terms of underlying structural models. The application of this approach to the binding of RPA to ssDNA provided information about (i) the transitions in the architectural ensembles of RPA as it binds ssDNA; (ii) hypotheses about the atomic interactions involved; and (iii) a first estimate of the ‘path’ followed by the ssDNA. Remodelling the architecture of modular multi-domain proteins is central to their assembly and function in the dynamic multi-protein machinery (41). The approach used here to characterize RPA architecture in different functional states should be broadly applicable to these systems.
Our studies show that the binding of RPA to ssDNA results in a progressive compaction of the protein. Because RPA is a modular protein with flexible linkers, its architecture is best described in terms of an ensemble of population-weighted architectural states, which might include low populations of architectures that diverge significantly from the mean. The DNA-free protein samples a large range of architectures because of the flexibility between the 70A and 70B domains and between the 70AB module and the trimer core (Figure 7A). The initial 8–10-nt interaction mode causes the 70A and 70B domains to compact significantly, as observed in previous X-ray crystal structures and SAXS studies of RPA70AB. A second major compaction occurs when the trimer core is engaged in binding ssDNA. Binding to as few as 20 nt of ssDNA is sufficient to cause this transition. Notably, no further significant changes occur as RPA-DBC binds up to 30 nt of ssDNA. Overall, our results show that RPA-DBC occupies an ensemble of architectures in solution that is best viewed as an equilibrium between a range of extended, intermediate and compact states, with binding to ssDNA driving the equilibrium towards more compact conformations.
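The idea of a population-weighted ensemble can be made concrete with a minimal sketch. It assumes conformers of equal mass and contrast, so that their scattering intensities add and the apparent radius of gyration from Guinier analysis is the root of the population-weighted mean of Rg². The function name, the populations and the Rg values below are hypothetical and chosen only for illustration; they are not taken from the data reported here.

```python
import math


def ensemble_rg(states):
    """Apparent Rg of an ensemble given (population, Rg) pairs.

    Assumes all conformers have the same mass and contrast, so the
    apparent Rg^2 is the population-weighted mean of the individual Rg^2.
    """
    total = sum(weight for weight, _ in states)
    if total <= 0:
        raise ValueError("populations must sum to a positive value")
    mean_rg_squared = sum(weight * rg ** 2 for weight, rg in states) / total
    return math.sqrt(mean_rg_squared)


# Hypothetical populations and Rg values (in angstroms), illustration only.
# Each tuple is (population, Rg) for a compact, intermediate or extended state.
dna_free = [(0.30, 30.0), (0.40, 36.0), (0.30, 44.0)]
dna_bound = [(0.70, 30.0), (0.25, 36.0), (0.05, 44.0)]

if __name__ == "__main__":
    print(f"DNA-free apparent Rg:  {ensemble_rg(dna_free):.1f}")
    print(f"DNA-bound apparent Rg: {ensemble_rg(dna_bound):.1f}")
```

In this toy example, shifting population from the extended to the compact state lowers the apparent Rg even though the same three states remain accessible, which is the sense in which ssDNA binding is described as driving the equilibrium towards more compact conformations.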
Figure 7. Models for modulation of RPA-ssDNA interactions. (A) Two major transitions in RPA architecture as it progressively binds ssDNA: the initial binding mode that compacts and aligns the 70A and 70B domains, the final binding mode that compacts the DNA-binding ...
The binding of RPA to ssDNA is more complex than that predicted by a simple model of consecutive engagement of the four known DNA-binding domains. SAXS data showed that RPA-DBC favours more compact as opposed to fully extended architectures. The simulations suggested that the linker between the 70B and 70C domains may be critical to the preference for more condensed states, as the B–C linker was found to form transient helical structure and participate in a range of transient interactions with domains 70B and 70C as well as the DNA. We observed that the 10-nt complex sampled a more restricted range of architectures than the DNA-free protein. Detailed analysis of the SAXS and simulation results suggested that the extent of compaction may not be solely explained by effects on RPA70AB. However, the challenges of interpreting SAXS data for multi-domain proteins make it difficult to discern whether this observation is a by-product of significant inter-domain dynamics in RPA-DBC or whether there is an allosteric effect of ssDNA binding on the B–C linker that inhibits sampling of fully extended RPA-DBC architectures. The SAXS data showed that binding of RPA-DBC to 20 nt of ssDNA engages the trimer core, but the simulations suggested that fixed binding to domains 70C and/or 32D is insufficient to explain the experimental data. Finally, we note that the similarity observed in the SAXS data for the binding of RPA-DBC to 20, 24, 27 and 30 nt of ssDNA was not anticipated.
One key finding from our results is that the DNA-binding apparatus of RPA undergoes two architectural transitions as it binds to progressively longer stretches of ssDNA. This result is inconsistent with the prevailing view of three modes of RPA binding to ssDNA with distinct architectures (2). Moreover, it has long been held that the 30-nt binding mode is fully extended and the 10-nt binding mode is compact (2). In fact, we find no evidence that the 10-nt binding mode is more compact than the 30-nt binding mode; our data show just the opposite. The previously reported compact architecture for the 10-nt binding mode was dependent on glutaraldehyde cross-linking of RPA in nucleoprotein complexes, whose structure was monitored indirectly by scanning transmission electron microscopy (44). Treatment with cross-linker may have caused RPA to be trapped in compact architectures that could only bind 8–10 nt.
In addition to correcting the model for the initial 10-nt and final 30-nt binding modes, our results show there is no distinct architecture associated with the intermediate 12–23-nt binding mode. The evidence in support of this intermediate binding mode included photo-cross-linking of RPA subunits to DNA substrates and testing the DNA-binding capability of domain-specific point mutants (4). Although these approaches establish the participation of a particular domain in ssDNA binding, they are indirect and do not reflect the structural organization of the molecule. Our direct physical analysis by SAXS is consistent with RPA having only two, not three, DNA-bound architectural states.
The architecture and binding of ssDNA by RPA are fundamentally different from those of the simpler and well-studied homotetrameric ssDNA-binding proteins (SSBs) found in prokaryotes (45). Crystal structures of Escherichia coli SSB reveal a compact back-to-back and base-to-base arrangement of the four OB-fold domains (46). As a result, SSB DNA-binding clefts face opposite directions relative to each other, compelling ssDNA substrates to encircle the SSB tetramer to occupy two (35-nt state) or all four (65-nt state) binding domains (Figure 7B). In contrast, flexible linking of RPA’s DNA-binding domains allows each binding surface to orient in a similar direction and enables the core to organize in a convex manner around the ssDNA. This opens up the possibility for RPA to function in two ways: either like SSBs, by inducing ssDNA to organize around the protein, or conversely, by adapting its architecture in response to the DNA (Figure 7B). Our results support the latter. The greater versatility afforded by the flexible arrangement of OB-folds permits RPA to accommodate variations in the available length and binding context of DNA substrates across multiple DNA processing pathways. Such differences between prokaryotes and eukaryotes highlight the importance of modular structure for the more complex eukaryotic DNA processing machinery.
Investigations of the diffusion of bacterial SSBs along ssDNA suggest a mechanism for initiating displacement of SSBs from their ssDNA substrates (47). In this model, transient thermal dissociation of single DNA-binding domains combines with limited re-association to the ssDNA. For RPA, 32D possesses the weakest affinity of the four DNA-binding domains (49) and is the most plausible mediator of initial dissociation. Once 32D and 70C are disengaged, there can be an increase in the separation of the trimer core from 70AB (Figure 7C). It is conceivable that this promotes diffusion along the ssDNA and thereby enables access to the substrate. However, our observation that RPA-DBC does not significantly populate highly extended architectures suggests that the relatively close proximity of 70AB and the trimer core likely promotes subsequent re-engagement of the trimer core onto the DNA (Figure 7C).
If the 32D and 70C domains are readily re-engaged, how is it possible for RPA to be displaced from ssDNA? We have previously reported that the interaction of RPA with other DNA processing proteins within the DNA processing machinery can promote or inhibit its binding to ssDNA [e.g. (50)]. In our model, these interactions shift the equilibrium between multiple relatively compact RPA architectural states, facilitating binding or release of ssDNA. Bacterial SSBs have been proposed to use interaction with their disordered C-terminal tails to promote release of ssDNA (51). The primary protein recruitment modules in RPA are 70N and 32C, which are independent structural modules. However, in most, if not all, cases, RPA binding partners interact via multiple contacts, also engaging the 70AB structural module. The interaction with 70AB alone could promote dissociation from the ssDNA either via direct competition for the DNA-binding sites or via allosteric effects from binding to the A–B linker (50). However, these interactions are invariably much weaker than the 70AB interaction with ssDNA, and it is difficult to imagine they are sufficient to cause RPA to dissociate.
Our data suggest an alternative mechanism: the involvement of RPA70AB and one or both of the 70N and 32C protein interaction modules places steric or other constraints on RPA architecture that lead to increased separation of 70AB and the trimer core (Figure 7C). This increased separation would amplify intrinsic dissociation of the trimer core from the ssDNA, lower the overall affinity, and increase the probability of diffusion along the ssDNA and/or provide access to a competing DNA-binding domain from the binding partner (Figure 7C). The increasing recognition that RPA and other DNA processing proteins interact through multiple contact points [e.g. SV40 Tag helicase (50), polymerase α-primase (54), XPA (31), Rad52 (31)] suggests that this mechanism for protein-mediated displacement of RPA from DNA could be conserved across DNA processing pathways. It is anticipated that this mechanism for obtaining access to DNA will be generally applicable, in light of the increasing recognition of the importance of architectural remodelling in DNA processing machinery (58–60).
A crystal structure of the RPA-DBC homologue from the fungus Ustilago maydis in complex with dT32 in two space groups was published during the final revisions of this manuscript (61). A key feature in this structure is the collapsed quaternary structure driven by contacts between RPA70B, RPA70C, 10 of the B–C linker residues and the ssDNA intervening between the nucleotides bound in the B and C domains. Comparison with the SAXS curves of human RPA-DBC bound to 10–30 nt of ssDNA indicates the conformational ensemble in solution differs from the highly condensed crystal structure (Supplementary Figure S12). However, crystal-like compact architectures may be populated in solution. Notably, interfaces in the crystal packing are larger than the interfaces within the DBC in the structure. This observation suggests that this architecture is not likely to be highly populated in solution, which is consistent with our SAXS data.
Overall, the SAXS/MD and crystallographic studies provide highly complementary information. For example, both support condensation of the architecture of RPA on binding ssDNA and place the four binding sites for ssDNA along a curved trajectory on one face of the DBC. The crystal structure provides a DBC structure at near-atomic resolution. It also highlights interactions involving domains B and C, the B–C linker and the nucleotides between those bound in domains B and C, which are also found in the MD simulations. On the other hand, the SAXS/MD and crystal structure data provide two different models for RPA function: (i) in the crystal structure, keystone interactions centred on the B–C linker create a unique four-way interface that stabilizes a closed conformation for 30-nt bound RPA, whereas (ii) in the SAXS/MD data, a dynamic two-state condensation on ssDNA binding is defined with considerable flexibility. Importantly, the flexibility in our model naturally incorporates interactions with multiple RPA partners whose modular interfaces may stabilize a given functional architecture in concert with DNA binding. Our SAXS and computational results, in combination with the crystal structures, mark a new era in understanding architectural remodelling of RPA. Testable hypotheses can now be generated for the roles of the B–C linker, allosteric compaction of the DNA-binding core promoted by the binding of DNA and architectural changes induced by partner protein binding, and the results provide a framework for understanding how RPA functions in so many DNA processing pathways.