To the Editor: Proteins exploit the conformational variability of loop regions to carry out diverse biological tasks including molecular recognition and signal transduction. New algorithms to engineer these functions by combining loop building and sequence design therefore have enormous practical applications but require high-resolution ‘loop reconstruction’: the modeling of protein loop conformations, given the amino acid sequence. Loop reconstruction in protein design may be simplified conceptually by restricting changes to the functional loop regions. However, despite progress in loop prediction methods1,2, design applications are limited by the difficulty in modeling purely local conformational moves and by the need for advances in sampling and evaluating loop conformations.
Here we address these challenges with a robotics-inspired local loop reconstruction method for peptide chains, called kinematic closure (KIC). Calculating the accessible conformations of objects subject to constraints, such as determining the possible positions of the interior joints of a robot arm given fixed positions for the shoulder and fingertips, has been well studied in inverse kinematics, a subfield of robotics. Building on the first3 and subsequent applications (Supplementary Methods) of kinematics to protein modeling, the KIC method presented here uses polynomial resultants4 to determine analytically all mechanically accessible conformations for 6 torsions of a peptide chain of any length, while simultaneously sampling the remaining torsions and N-Cα-C bond angles (Fig. 1a, Supplementary Methods and Supplementary Fig. 1). To enable a range of applications, we coupled KIC to the Rosetta method for protein structure modeling5. Our loop reconstruction protocol iterates KIC calculations as Monte Carlo moves, first with loop backbone minimization in a low-resolution stage, in which side chains are represented as centroids, and then in a high-resolution all-atom stage with minimization of the loop backbone and all side chains in the loop environment (Supplementary Fig. 2 and Supplementary Methods). At the beginning of each KIC simulation, we discard all native loop bond lengths, bond angles and torsions. In addition, we perform reconstructions without knowledge of native side-chain conformations in both the loop and the protein scaffold (Supplementary Methods), which makes prediction substantially more challenging but broadens the range of applications to designing new loop conformations that may interact differently with neighboring side chains.
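The robot-arm analogy can be made concrete with a toy planar example. The sketch below is not the KIC algorithm; it is a two-dimensional, two-link analog in which the shoulder sits at the origin, the fingertip position is fixed, and every elbow configuration that closes the chain is found in closed form. The function names (`two_link_closure`, `fingertip`) and all parameters are illustrative assumptions and are not part of Rosetta.

```python
import math

def two_link_closure(l1, l2, x, y):
    """Return every (theta1, theta2) placing the fingertip of a planar
    two-link arm at (x, y), with the shoulder fixed at the origin.

    Toy analog of loop closure (hypothetical helper, not the KIC solver):
    fixed endpoints, closed-form enumeration of the interior joint angles.
    """
    d2 = x * x + y * y
    # Law of cosines gives the elbow angle directly.
    c = (d2 - l1 * l1 - l2 * l2) / (2.0 * l1 * l2)
    if abs(c) > 1.0 + 1e-12:
        return []  # target out of reach: no closed conformation exists
    c = max(-1.0, min(1.0, c))  # guard against floating-point rounding
    t2 = math.acos(c)
    # The two branches coincide when the arm is fully extended or folded.
    branches = [t2] if math.isclose(math.sin(t2), 0.0, abs_tol=1e-12) else [t2, -t2]
    solutions = []
    for elbow in branches:
        t1 = math.atan2(y, x) - math.atan2(l2 * math.sin(elbow),
                                           l1 + l2 * math.cos(elbow))
        solutions.append((t1, elbow))
    return solutions

def fingertip(l1, l2, t1, t2):
    """Forward kinematics: fingertip position for the given joint angles."""
    return (l1 * math.cos(t1) + l2 * math.cos(t1 + t2),
            l1 * math.sin(t1) + l2 * math.sin(t1 + t2))

if __name__ == "__main__":
    for t1, t2 in two_link_closure(1.0, 1.0, 1.0, 1.0):
        print(t1, t2, fingertip(1.0, 1.0, t1, t2))
```

In this toy case the closure condition yields at most two solutions (elbow "up" and "down"). The peptide problem differs in being three-dimensional with 6 unknown torsions, but the principle is the same: the endpoints stay fixed and all closing solutions are enumerated rather than found by iterative search.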
We found that KIC substantially improves model accuracy over the standard loop building method in Rosetta, which combines insertion of torsion segments from homologous proteins and a numerical closure technique6. We generated 1,000 models by the KIC method and compared its performance to the standard Rosetta method with the same number of Monte Carlo steps on twenty-five 12-residue protein loops (dataset 1; ref. 7). For each protein, we computed the root mean squared (r.m.s.) deviation of the backbone atoms of the best-scoring loop model to the crystallographic loop, after superimposing the non-loop regions of the model onto the crystal structure. The KIC protocol frequently sampled regions of conformational space that were <1.0 Å from the crystallographic loop, which were not sampled by the standard Rosetta method (Fig. 1b). In the majority of cases (15/25), the best-scoring models were very close to the crystallographic loop conformation (Fig. 1b, c). Over the entire 25-loop set, KIC improved the median accuracy to 0.8 Å r.m.s. deviation from 2.0 Å r.m.s. deviation when we applied the standard Rosetta method (Supplementary Table 1). As both methods use the same scoring function, these results suggest that KIC increased accuracy through improved conformational sampling (although sampling and scoring errors cannot be considered entirely independently as scoring guides the simulation trajectories; see Supplementary Discussion, Supplementary Tables 1–8, and Supplementary Figs. 3, 4 for additional analysis of method performance and error sources). The standard method required ~280 central processing unit hours per protein, and KIC required ~320.
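The accuracy measure used above can be illustrated with a short sketch. It assumes that each model has already been superimposed onto the crystal structure using the non-loop regions, so the loop coordinates are compared without any further fitting, and that a lower score is better. The function names and the `(score, coordinates)` data layout are illustrative assumptions, not Rosetta's interface.

```python
import math

def loop_rmsd(model_xyz, crystal_xyz):
    """R.m.s. deviation between two matched lists of (x, y, z) atom coordinates.

    Assumes both structures are already in the same frame (superimposed on
    the non-loop regions), so no refitting of the loop is performed here.
    """
    if len(model_xyz) != len(crystal_xyz) or not model_xyz:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum((a - b) ** 2
                  for p, q in zip(model_xyz, crystal_xyz)
                  for a, b in zip(p, q))
    return math.sqrt(squared / len(model_xyz))

def best_scoring_rmsd(models, crystal_xyz):
    """R.m.s. deviation of the best-scoring model.

    `models` is a list of (score, loop_xyz) pairs (hypothetical layout);
    the lowest score is assumed to be the best.
    """
    _, best_xyz = min(models, key=lambda m: m[0])
    return loop_rmsd(best_xyz, crystal_xyz)

if __name__ == "__main__":
    crystal = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    models = [
        (-10.0, [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]),  # best score, shifted 1 Å
        (-5.0, crystal),                               # closest, but worse score
    ]
    print(best_scoring_rmsd(models, crystal))
```

The example shows why the reported value is that of the best-scoring model rather than of the closest model sampled: the selection uses only the score, so the result reflects both sampling and scoring.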
To compare KIC loop reconstruction directly to the state-of-the-art molecular mechanics method1, we applied the Rosetta KIC method and standard Rosetta method to the same twenty 12-residue starting structures with perturbed loops and side-chain environments used to assess the molecular mechanics method (dataset 2; ref. 1; Fig. 2a). The Rosetta KIC protocol improved the median accuracy to 0.9 Å from 1.2 Å using the molecular mechanics method, and from 2.0 Å using the standard Rosetta method (Fig. 2b and Supplementary Tables 2 and 5).
Functional loops in signaling proteins in complex with their partners exhibit conformational plasticity against a relatively structured core. To assess the ability of KIC to model such regions, we applied the method to interface loops from 4 proteins crystallized with 18 different partners (dataset 3; Supplementary Methods). KIC reconstructed the loops to 0.8 Å median r.m.s. deviation (Fig. 2b). Notably, the KIC protocol produced high-accuracy reconstructions of the same switch protein loop adopting different conformations when bound to different partners (Fig. 2c, Supplementary Table 3 and Supplementary Discussion). This result highlights the potential of KIC for modeling functional conformational changes. Sub-angstrom loop reconstructions by the local robotics-inspired sampling protocol described here could be coupled with the Rosetta design method5 to model and engineer protein loops precisely matching a particular binding partner, creating highly selective protein interfaces.
The described state-of-the-art loop reconstruction method is available free of charge as a module of the academic release version 3.1 of the Rosetta program for protein modeling and design at http://www.rosettacommons.org/.
We thank D. Baker and A. Sali for valuable comments, and B. Sellers and M. Jacobson for sharing data and for helpful discussions. This work was supported by grants from the US National Institutes of Health to T.K. (PN2-EY016525) and E.A.C. (R01-GM08171), the University of California Lab Research Program (T.K.), and a PhRMA Foundation Predoctoral Fellowship (D.J.M.).