|Home | About | Journals | Submit | Contact Us | Français|
Tyrosine kinases are important cellular signaling allosteric enzymes that regulate cell growth, proliferation, metabolism, differentiation and migration. Their activity must be tightly regulated, and disruption of this regulation can lead to a variety of diseases, particularly cancer. The non-receptor tyrosine kinase c-Src, a prototypical model system and a representative member of the Src-family, functions as complex multi-domain allosteric molecular switches comprising SH2 and SH3 domains regulating the activity of the catalytic domain. The broad picture of self-inhibition of c-Src via the SH2 and SH3 regulatory domains is well characterized from a structural point of view, but a detailed molecular mechanism understanding is nonetheless still lacking. Here, we use advanced computational methods based on all-atom molecular dynamics simulations with explicit solvent to advance our understanding of kinase activation. To elucidate the mechanism of regulation and self-inhibition, we have computed the pathway and the free energy landscapes for the “ inactive-to-active” conformational transition of c-Src for different configurations of the SH2 and SH3 domains. Using the isolated c-Src catalytic domain as a baseline for comparison, it is observed that the SH2 and SH3 domains, depending upon their bound orientation, either promote the inactive or the active state of the catalytic domain. The regulatory structural information from the SH2-SH3 tandem is allosterically transmitted via the N-terminal linker of the catalytic domain. Analysis of the conformational transition pathways also illustrates the importance of the conserved tryptophan 260 in activating c-Src, and reveals a series of concerted events during the activation process.
Protein kinases are signaling enzymes that serve to regulate cell growth, proliferation, metabolism, differentiation and migration1, 2. The human kinome encodes over 500 protein kinases, representing approximately 2% of the entire genome3. Owing to their critical functional importance, the activity of these protein kinases must be tightly regulated. Malfunction of protein kinases is often associated with a wide range of disease, including inflammation, diabetes, and particularly cancers.
A prototypical system of central interest is provided by the non-receptor tyrosine kinase c-Src, which belongs to the family of nine Src-related non-receptor tyrosine kinases (SFK) sharing a similar regulatory mechanism (Src, Yes, Fyn, Lyn, Lck, Blk, Hck, Fgr, and Yrk)4. c-Src was the first proto-oncogenic protein ever described5, 6. It is ubiquitously expressed in all cell types and is discovered to participate in multiple signaling pathways which regulate cell growth, differentiation, etc7. All nine SFKs share a common multi-domain topology, which comprises a short membrane-associating SH4 segment and a “unique” (U) segment whose sequence varies widely, followed by two peptide binding modules in the SH2 and SH3 domains, and finally a catalytic tyrosine kinase domain (KD), i.e., SH4-U-SH3-SH2-KD7–10. The KD comprises about 300 residues and is highly homologous to those of serine and threonine kinases10. The SH2 (~100 residues) and SH3 (~60 residues) domains are compact structures that bind to sequences containing a phosphotyrosine7 and to a poly-proline (PPII) helix containing the motif PxxP11, respectively12.
All of the Src-family kinases function as complex allosteric molecular switches whose catalytic activity can be modulated in response to specific cellular signals, such as growth factors or cytokines binding to membrane-bound receptor proteins. Phosphorylation of two distinct tyrosines has opposing effects on catalytic activity13. Phosphorylated Tyr527 (chicken c-Src numbering system) in the C-terminus tail of the catalytic domain interacts with the SH2 domain, resulting in an autoinhibition of kinase activity14, 15. Dephosphorylation of Tyr527, or the binding of external ligands to the SH2 or the SH3 domains, results in the activation of the kinase16–19. Phosphorylation of Tyr416, a residue located in the activation loop (A-loop) near the active site in the catalytic domain, is necessary for full kinase activity. Up-regulation of the kinase can be induced by auto-phosphorylation of Tyr41620, 21. An active Src kinase implies that the required conformational changes have taken place such that the KD has become catalytically competent. The down- and up-regulation of Src kinase activity are allosterically linked to the assembly/disassembly process of the tertiary structure of the protein.
Crystallographic structures of the auto-inhibited c-Src and Hck show that the SH3 and SH2 domains are assembled on the backside of the catalytic domain, with the SH2 domain binding to phosphorylated Tyr527 and the SH3 domain binding to the PxxP region of the linker connecting the SH2 and the KD22–25. The auto-inhibited conformation of the enzyme is stabilized by the SH2-SH3 domains, which appear to act as a “clamp” that locks the KD in the inactive state. Disruption of either the SH2-pY527 or SH3-linker interaction destabilizes the auto-inhibitory conformation, leading eventually to the activation of the enzyme. Phosphorylation of Tyr416 in the activation (A−) loop locks the KD in a catalytically competent conformation26, 27. Furthermore, a crystal structure with neither Tyr527 nor Tyr416 phosphorylated showed an active-like state with the SH2/SH3 domains in a re-assembled conformation on the top of the KD instead of on the backside28. In the aforementioned structure, the KD is in a crucial state for trans-autophosphorylation; Tyr416 is not phosphorylated but the KD still adopts a catalytically competent conformation. Although the conformational state of the KD is important for initiating the bimolecular trans-autophosphorylation reaction, the importance of the re-assembled SH2/SH3 domains conformation is poorly understood.
A reasonable hypothesis suggested by the available data is that the activating conformational transition within the KD is allosterically modulated, somehow, by the configuration of the regulatory SH2 and SH3 domains. Nevertheless, basic questions about the nature of the allosteric coupling between the various structural elements and the molecular mechanisms underlying the regulation of c-Src remained unanswered. For example, how is the SH2-SH3 “clamp” inhibiting the conformational transition of the KD toward a catalytically competent conformation in the assembled state? Does it act mainly by increasing the energy barrier separating the inactive and active states (kinetic control), or by raising the overall free energy of the active conformation to make it less stable (population shift)? And how the stability of the catalytically competent conformation enhanced by the SH2-SH3 tandem in the re-assembled state?
By answering such questions using all-atom molecular dynamics (MD) simulations, our goal is to clarify the mechanism by which the SH2 and SH3 regulatory domains control the activation of the kinase. Several MD simulation studies have studied the regulation of SFKs and focused on SH2 and the C-terminal tail binding29, the role of the linker30–32, allosteric network33, and the conformational changes and phosphorylations of the isolated KD26, 34–45. Pathway methods, umbrella sampling (US) calculations, and Markov state model (MSM) constructed from aggregate data from unbiased trajectories have proved to be effective in understanding the conformational changes in SFKs. An incomplete, though cogent picture, has emerged from the above computational studies, providing a broad framework for understanding the sequence of events leading to full kinase activation. According to this picture, it is proposed that the catalytic domain with Tyr416 in the A-loop unphosphorylated makes transient short-lived visits to an active-like catalytically competent conformation. These transient visits increase the availability of Tyr416 to be phosphorylated during a bimolecular encounter, a process that ultimately locks the enzyme in its catalytically competent state26, 27.
In the present effort, a full-length kinase atomic model comprising the SH3, SH2, linker, and catalytic domain was employed to study the role of the SH3-SH2 tandem on the inactive-to-active conformational transition taking place within the KD. We used a combination of targeted molecular dynamics (TMD), swarms-of-trajectories string method39, 46, and umbrella sampling (US) simulations to investigate the inactive-to-active transition pathway within the catalytic domain, and the associated free energy landscape for two different configurations of the regulatory SH3 and SH2 domains. To the best of our knowledge, this is the first computational study that investigates conformational changes in a full-length c-Src kinase. Our hope is that the computations will provide significant new insights to understand the function and regulation of tyrosine kinases.
The inactive assembled state and the active re-assembled state were modeled from two X-ray crystal structures, with the PDB codes 2src24 and 1y5728, respectively (see Figure 1A and 1B in Ref. 26). These two systems were prepared with the CHARMM-GUI47. A truncated octahedron box of TIP3P water molecules48 with a buffer of 10 Å from the protein to the edge was used to solvate the systems, using 0.15 M NaCl in addition to neutralizing ions. An ATP molecule and two magnesium ions coordinating the ATP were added based on a high-resolution structure of protein kinase A (PDB id: 1ATP49). Y416 in the A-loop is not phosphorylated in any of the construct, while Y527 is phosphorylated only in the assembled construct. The assembled, and re-assembled constructs comprise of approximately 66,000 and 148,000 atoms respectively. The isolated KD, construct is smaller and comprises approximately 38,000 atoms. All additional simulations were performed with the NAMD package50 version 2.10, using the all-atom CHARMM27 force field51. The solvent was minimized while holding the protein fixed for 1000 steepest descent steps, and followed up with another 1000 steps for the entire system. The minimized system was then brought to a temperature of 300K by initializing the velocities at 100K and increasing the temperature by 20K every picosecond (ps). Finally, a short equilibration run of 5ns was performed. All MD simulations, including the swarms-of-trajectories calculations, were performed in the isobaric-isothermal (NPT) ensemble. The pressure and temperature were controlled by Langevin piston method52 and Langevin dynamics (using a coupling constant of 1.0 ps−1) and kept at 1 atmosphere pressure and 300 K, respectively. Periodic boundary conditions were applied and the Particle Mesh Ewald (PME)53 was used to treat long-range non-bonded interactions. The short-range non-bonded interactions were truncated at 12 Å, with a switching function turned on from 10 to 12 Å. The non-bonded list updates were truncated at 14 Å. Covalent bonds involving hydrogen atoms were constrained at their equilibrium distances so that a 2 fs time step was used throughout all simulations.
The string method formulated by Vanden-Eijnden represents the pathway between two endpoints as an ordered sequence of M evenly distributed discrete conformations called “images” in a space of collective variables, [Z(1),Z (2), …, Z(M)]46, 54, 55. Starting from an initial path, we use the mean drifts of each image <ΔZ> calculated from swarms of unbiased MD trajectories to iteratively refine the string39, 46. At each iteration, the string is re-parametrized to impose an equal Euclidean distance between adjacent images in the space of collective variables Z in order to maintain a well-resolved path over the entire transition54. The swarms-of-trajectories string method was recently implemented into NAMD using the multiple copies interface56, allowing for extremely scalable computations. Here, a total of 64 images and 32 individual swarms per image were used to calculate the mean drift, for a combined 2048 simultaneous copies. These were distributed over the Blue Gene/P and Blue Gene/Q supercomputers at the Argonne Leadership Computing Facility. The re-parametrization phase used a smoothing constant of 0.1 and a staged harmonic restraint that increased the force constant on the CVs from 0 kcal/(mol·Å2) to 50 kcal/(mol·Å2) in 5 steps. The number of MD steps spent in the re-parameterization phase was set equal to the length of swarms. The progress of the swarms-of-trajectories iterations was monitored with the discrete Fréchet distance57 of the CVs between all pairs of strings. The Fréchet distance respects the ordering of the string as opposed to the Hausdorff distance, and emphasizes the maximal difference between two strings rather than an average like in the root mean squared distance (RMSD). By virtue of the thermal noise added from the swarms-of-trajectories, the algorithm samples the ensemble of plausible (high probability) paths between the two endpoints. For this reason, the last 20 iterations for each system were taken as a converged ensemble, so each of the 64 images along the string had 32 × 20 structures from which projections onto other degrees of freedom could be made. The projections were made using VMD58 and plotted using the Python Matplotlib package59. The structural images were rendered using the Tachyon ray tracing library or the PyMOL software package60.
During the activation process, the KD also undergoes important conformational changes involving primarily the αC-helix and the A-loop (where Y416 is located). This inactive-to-active conformational transition in the KD enables the catalytic function and has been the subject of our previous studies. To study the role of the regulatory tandem (i.e. SH3-SH2) on the inactive-to-active conformational transition in the KD, we constructed three strings in this investigation, with one in the absence of the regulatory tandem and with the tandem in the assembled and re-assembled states respectively. The initial guess for the inactive-to-active transition pathway for c-Src kinase was taken from our previous work on the KD.38 The same 50 Cartesian collective variables (CVs) from Gan et al.39 were used to describe the string, and represent the Cα position and a single atom in the amino acid side-chain (details in the Supplementary Material of Ref 39). The assembled inactive system from PDB 2src was then pulled along the string from the inactive to the active state by TMD, using a pulling rate of 10 Å/ns and a harmonic restraint of 10 kcal/mol on the CVs. A total of 64 equidistant structures were saved along the pathway and represent the initial assembled state string. The re-assembled state string was constructed in a similar manner using the string from our previous work on the KD, except the TMD was carried out in the reverse direction, going from the active (X-ray structure) to the inactive conformation.
Two sets of two-dimensional (2D−) US calculations were performed to characterize the free energy associated with the inactive-to-active conformational change of the KD when the SH3-SH2 regulatory tandem is in the assembled and the re-assembled state respectively. The order parameters utilized to monitor the inactive-to-active conformational transition were the same ones previously used to study the role of Y416 phosphorylation: the average distance of the three i : i+4 hydrogen bonds in the short α-helix of the A-loop (D413:O to T417:N, N414:O to A418:N, and E415:O to R419:N; d3) is used to reflect the opening of the A-loop. The difference between the distances of the E310 to R409 salt-bridge (d1) and E310 to K295 salt-bridge (d2) is used to characterize the rotation of the αC-helix, either facing the A-loop at negative values or the N-lobe for positive values. Graphical representations of the two order parameters and the conformational change subjected to our umbrella sampling calculations can be found in Figure 2 of Meng et al.26
The self-learning adaptive strategy61 was also employed in both sets of US calculations. To further reduce the computational cost and focus on the regions surrounding the minimum free energy pathway, the refined transition pathways for the assembled and re-assembled systems were used to seed the self-learning adaptive US calculations. The starting structures in the first cycle of the adaptive strategy were all provided by the converged strings, i.e. the last 20 iterations of the swarms-of-trajectories string method calculation. These converged strings were first projected onto the 2D- order parameter space. Umbrella sampling windows whose centers were within 1.5 Å to the projected strings are used in the first cycle, and the nearest image to an umbrella center was served as the starting structure of that particular window. The free energy threshold was adjusted on-the-fly, and a free energy threshold of 9.0 kcal/mol was finally utilized in the self-learning adaptive procedure.
A harmonic restraining potential with a force constant of 10 kcal/(mol·Å2) and a uniform spacing of 0.5 Å were applied on each order parameter. A total of 555 and 338 umbrella windows were used for the re-assembled and the assembled states, respectively. A total of 40 and 30 ns of MD propagation were carried out per US window for the re-assembled and the assembled states, respectively. The shorter simulation time per umbrella window for the assembled state mainly results from a faster convergence rate, compared with the umbrella sampling calculations of the re-assembled state (i.e., the SH2-SH3-linker tandem fluctuates in a greater extent in the re-assembled construct as shown in Fig. 3B, necessitating a longer simulation time to improve the sampling). In total, approximately 32 µs of aggregated sampling time was spent in our US calculations. The weighted histogram analysis method (WHAM)62 was used to combine the time-series of the order parameters from each set of 2D-US calculations and to yield the potential of the mean force (PMF).
An additional set of 2D-US calculations were carried out to construct the free energy landscape associated with the inactive-to-active transition of the KD in a construct consisting of the linker and the KD. The linker is restrained to adopt the same conformation as that in the assembled state using a harmonic potential with a force constant of 10 kcal/(mol·Å2) applied to its non-hydrogen atoms. Essentially, the purpose of this set of US calculations is to evaluate the role of the linker in the auto-inhibition of the KD. If the free energy landscape of activation from this set of US calculation is the same as that from the SH3-SH2-KD construct when the SH3-SH2 adopts an assembled conformation, the SH2-pY527 docking and the SH3-linker binding do not contribute directly to the auto-inhibition of the KD. All the 338 umbrella windows from the assembled construct were utilized. The SH3 and SH2 domains were removed. The same harmonic potentials were applied to the two order parameters as described above. Each umbrella window was propagated for 11 ns (3.7 µs of aggregated sampling time), while the last 10 ns were utilized for WHAM post-processing.
The activation pathway in the KD was determined the isolated KD, included here for the sake of comparison, and for two configurations of the SH2 and SH3 regulatory domains in the full-length kinase. Those configurations are the assembled state, which is illustrated by the X-ray structure with PDB code 2src24, and the so-called re-assembled state, which is illustrated by the X-ray structure with PDB code and 1y5728. The assembled state (2src) is believed to represent an autoinhibited inactive form of the kinase. In contrast, the status of the re-assembled state (1y57) is less clear. While Tyr416 in the A-loop is unphosphorylated, as would be expected for a kinase locked in a constitutively active state, the key residues in the active site are properly configured to allow catalysis. Our view, based on previous studies, is that the configuration of the KD in the re-assembled state represents a catalytically competent active-like conformation that can exist transiently prior being locked via the phosphorylation of Tyr416 in the A-loop.
As a first step, the four endpoint conformational states in the two strings were simulated without any bias and restraints for 50 ns to ascertain their stability before being used in the pathway refinement. Each string has one endpoint state that corresponds to a conformation observed by X-ray crystallography, namely, the assembled inactive kinase, and the re-assembled active-like kinase. In addition, each string has an artificial or virtual endpoint that has not been observed by crystallography, namely the re-assembled inactive kinase and the assembled active kinase. To validate the stability of the virtual re-assembled inactive and the assembled active endpoint states more extensively, they were further simulated for 507 and 514 ns, respectively. The stability of the conformations was monitored by time-series of the two CVs used in the umbrella sampling, shown in the supplementary information Fig. S1. For comparison, the 50 ns MD trajectories for the (crystallographically-observed) re-assembled active state and the assembled inactive state are also shown in Fig. S1. These trajectories remain centered on the X-ray crystal structure values and show a moderate amount of fluctuation. In contrast, the artificially created endpoints display larger fluctuations and adopt slightly different active and inactive states. They do remain in either the re-assembled inactive or the assembled active state, though, suggesting that at least on the 500 ns timescale these kinase conformations are stable, indicating that the re-assembled inactive state and the assembled active state can be used as endpoints in their corresponding strings.
It is known that the length of the short MD trajectories in the string method with swarms-of-trajectories can strongly affect the outcome, causing the pathway to either become stuck in local minima when the trajectories are too short, or depart considerably from the minimum free energy path (MFEP) when the trajectories are too long63. These observations were made for small systems such as the 2D Mueller potential or the blocked alanine dipeptide, and tests on larger systems have not yet been performed. The KD-only swarms-of-trajectories iterations performed by Gan et al. had utilized a swarm trajectory length of 1 ps39. To ascertain the sensitivity of the results to the length of the trajectories, the smaller, assembled full-length construct was selected and three independent iterations were run with swarm trajectories lengths of 1, 5 and 10 ps. The discrete Fréchet distance57 of each iteration to the initial string was used to monitor whether the string was becoming stuck in a local minima at the lower trajectory lengths. The result of this analysis is shown in Fig. 1a. The 1 ps swarm trajectory length started to reach a plateau well before the 5 and 10 ps swarm trajectory lengths, and we chose to stop the iterations early. In the case of the re-assembled state, the 5 ps trajectory length reached a plateau before the 10 ps iterations. The final strings from the 10 ps trajectory length iterations were taken to represent the most accurate transition pathway for each system. In light of this decision, the string for the isolated KD was re-optimized with a 10 ps trajectory length for the sake of consistency. The convergence of the string was monitored with the discrete Fréchet distance of each iteration to the final string, shown in Fig. 1b. The last third of both the 5 and 10 ps swarm trajectories for the assembled full-length construct shows a plateau at low Fréchet distances, indicating that the string has stopped evolving. In the case of the 10 ps re-assembled full-length and isolated KDs, there remains a gradual downward slope near the end of the iterations. The slope was deemed small enough to consider the strings converged. One can also study the Fréchet distance between every pair of string iterations, although more complicated than the general divergence from a single structure, and this is shown in the Fig. S2. Contiguous regions of low Fréchet distance show local minima of the strings and each of the three states show an extended region of low Fréchet distances for at least the last 20 iterations. Thus, the strings for the final 20 iterations of each state were treated as an ensemble describing the local minimum free energy path of the inactive to active transition.
The converged inactive-to-active transition string pathways for the assembled and re-assembled states and the isolated KD are broadly very similar. This is supported by monitoring the evolution of a set of representative order parameters following the inactive-to-active transition shown in Fig. S3, and the 2D the projection shown in Fig. S4. Generally, the differences in the paths are related to the differences at the endpoints. For example, the re-assembled active A-loop has a different W260 conformation, and the α-helix turn in the inactive A-loop between residues 413 to 419 (d3 in Fig. 2) is a little more structured. According to Fig. S3, there are also some differences regarding the timing of specific events along the string, although the overall ordering is generally preserved. For example, the distance d1 breaks around image index 44 for the assembled state, while it breaks around image index 50 for the re-assembled state, or the distance d2 forms around image index 48 for the assembled state while it forms around image index 58 for the re-assembled state. In both paths, the order of the microscopic event remains the same (d1 breaks and then d2 forms), although there is a slight shift of the image indices between the assembled and re-assembled state. The slight shift in image index is not directly prevented by the reparametrization procedure, which aims at keeping adjacent images equidistant. This is the main reason why we used the Fréchet distance in assessing the convergence of the string to try and avoid the shifting of image indices. Some difference involving the charged residues forming a network of electrostatic interactions are noted, and will be analyzed below.
Free energy landscapes determined from (US) calculations26, 31 and from Markov state models (MSM)38 can provide extremely valuable information to understand the conformational transitions underlying the activation of protein kinases. To investigate the allosteric role of the SH3-SH2 tandem on the auto-inhibition/activation of the KD, we calculated the potential of mean force (PMF) along two order parameters associated with the motion of the A-loop and the αC helix using US simulations for the assembled and the re-assembled full-length constructs. The results are displayed in Fig. 2a, 2b, and 2d. For comparison, the PMF of the isolated KD previously reported in Meng et al.31 is also reproduced in Fig. 2b. In all the 2D-PMFs shown in Fig. 2, the inactive KD at the bottom left and the active-like at the top right of the states consistently correspond to local minima, indicating that the endpoints represent meaningful and stable conformations. Nonetheless, the relative population of these endpoint states depends strongly on the context presented by the different configurations of the SH3-SH2 tandem. One can also observe that an intermediate state with the A-loop open and the αC-helix oriented outward (bottom right of the 2D-PMF), is only observed in the 2D-PMF of the isolated KD (Fig. 2b). The results, obtained from an isolated KD, can serve a model for the free energy landscape in a full-length kinase in which the SH3-SH2 tandem would be disengaged and decoupled completely from the KD.
The calculations demonstrate that the presence of the SH3-SH2 tandem in the assembled conformation imposes a significant thermodynamic penalty on the active conformation of the KD, as revealed by a comparison of Fig. 2a and 2b. After integrating the PMF of the isolated KD (Fig. 2b), the free energy of activation (G(activation) G(active-like) − G(inactive)) is 3.6 kcal/mol. However, the same quantity is ~8 kcal/mol in the presence of the SH3-SH2 tandem in the assembled conformation. The transition from the inactive conformation to the active-like one is essentially prohibited when the SH2-SH3 clamp is bound to the backside of the KD. This result reveals why the assembled configuration is truly autoinhibited. According to the crystallographic structures of the assembled c-Src23, 24, 64, the tandem does not block access to the catalytic site of the KD to neither ATP+Mg2+ nor the substrate, i.e. the KD can catalyze the transfer of the γ-phosphate from ATP to a substrate when it adopts an active conformation even though the regulatory tandem is assembled. The autoinhibition is thus believed to be caused by the inability of the KD to adopt the catalytically competent active conformation. The calculated free energy landscape for the assembled full-length kinase is in accord with this explanation. The molecular determinants of the mechanism of auto-inhibition are investigated further in the following section.
The most prominent result of moving the SH2-SH3 tandem to the re-assembled, configuration is a shift of the free energy minimum from the inactive toward the active-like conformational state. The calculated free energy G(activation) is −1.0 kcal/mol, even though Tyr416 is unphosphorylated. The re-assembled full-length kinase is, thus, expected to be intrinsically very active. However, the calculated free energy landscape indicates that the inactive conformation remains thermodynamically accessible, with an estimated population of ~16%. This means that a significant fraction of the structural ensemble is still in an inactive conformation, which explains why the phosphorylation A-loop is required to lock the kinase in the active state to achieve maximum activity26. In addition, we also observe a shift of the position of the free energy minimum corresponding to the active state to higher values of d3 relative to the assembled full-length kinase construct, consistent with the observations from in the unbiased MD simulations of the endpoints as shown in Fig. S1.
It is of interest to examine the converged string pathways, optimized in the multi-dimensional space of collective variables, projected onto the reduced 2D subspace of the PMF. The projection, displayed in Fig. S4, provides a combined view of the transition pathway and the underlying free energy landscape. From Fig. S4, it appears that all three strings roughly follow the minimum free energy path on this 2D free energy surface. It should be noted that the images of the string are no longer equidistant when projected onto a space of lower dimensionality. For instance, the clustering of images near the inactive, intermediate, and active states suggests that the pathway in this region is oversimplified by the 2D projection. The images are re-parameterized during the swarms-of-trajectories iterations to remain equidistant along the entire length of the string in the space of collective variables (Fig. S5), therefore, this clustering in 2D is due to movement along collective variables that are orthogonal to the two order parameters used to calculate the 2D-PMF. Furthermore, the 10 ps string of the isolated KD demonstrates that a small shift in d1 – d2 also occurs (Fig. S4a), and is investigated further below in the section entitled “Electrostatic Network”.
The linker region connecting the SH2 through the N-terminal end of the KD has long been identified as one of the important factors controlling the auto-inhibition of c-Src kinases. The impact of the linker region has been studied by site-specific mutagenesis experiments65–68, e.g. the L255V65 and W260A68 mutations in c-Src resulted in a highly elevated kinase activity. To identify the molecular determinants that inhibit the opening of the A-loop and rotation of the αC-helix, two events associated with the inactive-to-active conformational transition, we initially searched for residues within 5–6 Å from the αC-helix or the A-loop from the SH2 and SH3 domains. However, this search uncovered only D117 in the SH3 domain within a distance of 6 Å at which non-bonded interactions are negligible. The search was then extended to include the linker region, leading to the identification of Q253, L255, and W260. Interestingly, both L255 and W260 have been recognized to be crucial for auto-inhibition of the kinase. This simple analysis suggests that whereas the SH2 and SH3 domains, themselves, do not couple strongly to the KD, the linker region clearly plays a more critical role. This observation leads us to hypothesize that the role of the SH2 and SH3 domains is essentially to stabilize the N-terminal linker region in an “inhibiting conformation”, but it is the linker region itself that is actually responsible for the direct inhibition of the catalytic domain.
To test this hypothesis, we carried out an additional set of 2D-US calculations in which the SH3/SH2 tandem was removed while the conformation of the linker was restrained to that in the auto-inhibitory conformation. The resulting 2D-PMF, which is displayed in Fig. 2c, bears a striking resemblance (almost quantitative) to that of the assembled full-length construct (Fig. 2a). This similarity reveals that auto-inhibition of the KD is indeed completely attributed to the conformation of the linker N-terminal region. Consequently, we posit that the importance of the interactions between the SH2 domain and the pY527in the C-terminal tail, as well as between the SH3 domain and the PxxP motif in the linker in the assembled state is essentially to stabilize the N-terminal of the KD in its “inhibiting conformation”. To further validate this hypothesis, we also examined the flexibility of the N-terminal linker region in the 2D-US calculations of both full-length constructs. The 2D-PMFs of αC-helix rotation and the RMSD of the linker with respect to the corresponding X-ray structure are constructed and are displayed in Fig. 3 (note that this result was obtained with no additional US simulations). In the assembled full-length construct (Fig. 3a), it is observed that the deviations of the linker from the X-ray conformation are quite small in the assembled full-length construct, which is reflected by small values and the narrow range of the RMSD. In contrast, the linker is much more flexible in the re-assembled full-length construct (Fig. 3b), regardless of the conformation of the KD. This behavior confirms that SH2-pY527 docking and SH3-linker binding restrain the motion of the linker to assure the auto-inhibition is effective. Additionally, the linker demonstrates a greater flexibility when the KD adopts an active conformation in the re-assembled full-length construct than that when the KD adopts an inactive conformation in the same construct. This behavior further shows that the inactive conformation of the KD has an intrinsic propensity to restrain the motion of the linker.
A conserved tryptophan residue (W260 in c-Src) at the interface between the linker and the KD has long been identified to be critical in regulating the kinase activity of tyrosine kinases68, 69. For example, in Src-family tyrosine kinases such as c-Src and Hck, mutating this tryptophan to alanine elevated kinase activity, whereas the corresponding mutation in Tec-family tyrosine kinases such as Btk and Itk show an opposite effect70, 71. A comparison of the X-ray structures for Src kinases (see Fig. 4 and Table S1) shows that in the inactive state of both Hck and c-Src the W260 residue is buried in a hydrophobic cavity between the αC-helix and N-lobe. The active state of both c-Src and Lck shows the W260 rotated out and the cavity collapsed as the αC-helix moves closer into the N-lobe. The linker also changes its conformation, moving from a type-II β-turn between residues Ala256-Ala259 in the inactive state to a type-I β-turn between residues Asp258-Glu261 in the active state, which has drastically changed where the SH2/SH3 domains are with respect to the KD (Table S2).
A truncated linker and KD (residues 253–523) has been studied with unbiased molecular dynamics and umbrella sampling30, 31. The umbrella sampling of the truncated construct by Meng et al.31 confirms that the Ala256-Ala259 β-turn is stable when the kinase is in the active state, and adopts the Asp258-Glu261 β-turn when the kinase is in the inactive state. Furthermore, the free MD of the truncated constructs show that the backbone (, ψ) distributions for W260 are not affected by changing the linker orientation from inactive to active (reproduced as Fig. S6). The inactive state has a single, well defined (χ1, χ2) for W260 seen in both Hck and c-Src crystal structures24, 25, as well as the MD simulation. The active state has several possible conformations, one of which is (−75, 90).
The inactive-to-active conformational activation pathway from the string method allows us to examine the structural behavior of the N-terminal linker region in the context of the assembled or re-assembled state. Throughout the transition, the linker remains in either the Ala256-Ala259 β-turn, in the case of the assembled state, or the Asp258-Glu261 β-turn, in the case of the re-assembled state (Fig. S7). The backbone (, ψ) also remain constant throughout the transition, closely resembling the results of the unrestrained MD by Meng et al. (Fig. S8). Strikingly, it is only the conformational activation pathway for the re-assembled state that exhibits a rotation of the W260 side-chain (Fig. S9). A comparison of the snapshots along the conformational activation pathway for the assembled state in Fig. 4 shows that the W260 has been shifted out of the hydrophobic pocket to allow the αC-helix to rotate and move closer to the N-lobe. This is is expected in the active state, but here this was achieved by pushing the β1 strand out of position. This is presumably caused by the competing tension of the N-terminal linker being locked into the assembled state and the αC-helix rotation. The re-assembled orientation of the N-terminal linker is more flexible, though, and a rotation of the W260 side-chain is sufficient to move from the inactive hydrophobic pocket (red in Fig. 4c–d) to the active state (magenta in Fig. 4c–d).
A network of electrostatic interactions between charged residues that are altered upon activation of the kinase has previously been identified44, 45. These include the salt bridge between E310 and R409 or K295 (d1 and d2) interactions used in this study to monitor the rotation of the αC-helix. The progress of these interactions along the three different conformational activation pathways from the string method is shown in Fig. 5. The salt bridge between E310 and R409 occasionally broke during the equilibration of the inactive endpoint (Fig. S10), although this conformation is stable for tens of nanoseconds. The kinase-only inactive endpoint equilibration finished with the broken E310 to R409 interaction, and it did not reform for the endpoint during the swarms-of-trajectories iterations (Fig. S11). The re-assembled state inactive endpoint began with the E310-R409 interaction, but this broke after approximately 75 iterations without reforming. Regardless of whether the E310-R409 salt-bridge was formed at the inactive endpoint, in both the assembled and re-assembled states the salt-bridge was present during the early parts of the inactive-to-active transition, indicating that it is important during the opening of the A-loop.
While the pathways for the assembled and re-assembled states are broadly very similar (see Figs. S3 and S4), one notable difference concerns the E310-K295 interaction. The unbiased equilibration MD for the inactive endpoint shows little difference with both the assembled and re-assembled states having these residues separated by approximately 12 Å. However, the E310-K295 distance starts to decrease much earlier in the case of the re-assembled state than the assembled state, passing through a few intermediate states as opposed to the single switch seen in both the kinase-only and assembled states (Fig. 5b). There are small variations of the HRD-motif along the pathway for the assembled and re-assembled states (Fig. 5c–d). Of particular interest is the R385-Y416 interaction, which appears to be strengthened at the re-assembled active endpoint relative to the assembled and kinase-only systems. This is seen by the shift of the active endpoint equilibrium distribution (Fig. S10d), and the contact being fully formed at the activated KD endpoint for the string in the re-assembled state (Fig. 5d). After the phosphorylation of Y416, the R385-pY416 interaction is part of a triad of arginine residues that stabilize the active-like endpoint26.
In order to ultimately understand the activating conformational transition with the KD in atomic detail, we analyzed the progress over a set of representative order parameters along the string activation pathway. We focus the present analysis on the pathway in the case of the re-assembled state for the sake of simplicity. The pathway for the two other cases are fairly similar (Fig. S3), Each order parameter was normalized to show the end-point closest to the inactive state in blue, and the end-point closest to the active state in red. The resulting “progress plots” were then sorted manually and displayed in Fig. 6, allowing inference of the ordering of microscopic events along the activating conformational transition. As is to be expected from previous work39, the first step in the inactive-to-active transition is the breaking of the two short α-helices in the A-loop as it begins to open. This is followed by the breaking of the Arg419 interaction with the backbone carbonyl of Ala434 in the αEF-helix in the C-lobe, allowing the A-loop to completely unfold and form the first intermediate state. The side-chain of Trp260 then rotates out of the hydrophobic pocket involving the αC-helix, although the latter starts to move only later. The second intermediate state is formed, as Phe424 flips away from the αEF-helix and associates with the hydrophobic elements of the αG-helix. The final changes leading to the formation of the Lys295-Glu310 salt bridge typical of the catalytically competent conformation occur almost simultaneously: Glu310 breaks away from Arg409 and unlocks the interaction that anchors the αC-helix to the A-loop, pulling the αC-helix toward the N-lobe as it associates with Lys295. The released Arg409 flips away from the αC-helix and associates with Tyr416, as one of the triad of arginine residues (Arg385, Arg409, and Arg419) that will eventually stabilize the phosphotyrosine of the fully activated kinase. However only Arg385 and Arg409 are closely associated with the unphosphorylated Tyr416 in the active-like state observed in the re-assembled conformation. The hydrophobic regulatory spine consisting of M314, L325, H384, and F405 also locks into place72.
To understand the allosteric control of the SH2 and SH3 regulatory domains on the inactivation of the catalytic KD, we used a combination of computational methods to determine the inactive-to-active conformational transition pathway and the underlying free energy landscape under various assembly states of the multi-domain kinase. The impact of the regulatory tandem on activation is demonstrated by the results from umbrella sampling calculations: compared with the free energy landscape of the isolated KD from our previous study, the free energy landscape of the assembled full-length construct displays that the active conformation of the KD is further disfavored by more than 4 kcal/mol, essentially achieving an auto-inhibition of the KD. In contrast, the active-like state of the KD in the re-assembled full-length construct dominates the structural ensemble. Furthermore, our results also reveal that the linker region connecting SH2 and KD domains plays a critical role in auto-inhibition, and the SH2 and SH3 domains contribute to stabilizing the linker conformation.
This work is supported by the National Cancer Institute (NCI) of the National Institutes of Health (NIH) through grant R01-CAO93577. The computations are supported in part by NIH through resources provided by the Computation Institute and the Biological Sciences Division of the University of Chicago and Argonne National Laboratory under grant S10 RR029030-01, by the Extreme Science and Engineering Discovery Environment (XSEDE) grant number OCI-1053575, and by the University of Chicago Research Computing Center. The authors want to thank Dr. Matthew Pond and Dr. Avisek Das for many insightful discussions.
Table S1 lists the orientation of W260 in several Src-family kinase crystal structures. The dihedral angles of the N-terminal linker region β-turns in those crystal structures are shown in Table S2. Figure S1 displays the projection of the inactive and active endpoint equilibration simulations onto the CVs used in umbrella sampling simulations. Figure S2 shows the convergence of the swarm-of-trajectories string method calculations. The progress of twelve representative order parameters along the optimized string pathways is shown in Figure S3. The projection of the converged string onto the corresponding PMF is plotted in Figure S4. Figure S5 illustrates the root-mean-squared deviation (RMSD) between any pair of images in the final iteration. The backbone and side-chain dihedral angles for W260 from 100 ns of MD of the wild-type c-Src kinase with the truncated linker in either the inactive or active conformation is shown in Figure S6 (reproduction of Supplemental Figure S9 from Meng et al.31). Cα-to-Cα distance between the i : i+3 residues in the β-turns of the N-terminal end of the KD is plotted in Figure S7. Figure S7 plots the compiled histograms of the backbone dihedral angles for W260 in the two SH3/SH2 states along the converged transition ensemble. Figure S9 shows the compiled histograms of the side-chain dihedral angles for W260 in the two SH3/SH2 states along the converged transition ensemble. Histogram of the distances that consist of the electrostatic network during the equilibration runs are displayed in Figure S10. Glu310:CD to Arg409:CZ distance (d1) of the inactive endpoint is plotted against the index of string method iterations in Figure S11. The coordinates of the converged strings (the last 10 iterations) from the assembled state and the re-assembled state are also given in the supporting information. This material is available free of charge via the Internet at http://pubs.acs.org.
The authors declare no conflict of interest.