Molecular Dynamics Flexible Fitting (MDFF) is an established technique for fitting all-atom structures of molecules into corresponding cryo-electron microscopy (cryo-EM) densities. The practical application of MDFF is simple but requires a user to be aware of and take measures against a variety of possible challenges presented by each individual case. Some of these challenges arise from the complexity of a molecular structure or the limited quality of available structural models and densities to be interpreted, while others stem from the intricacies of MDFF itself. The current article serves as an overview of the strategies that have been developed since MDFF's inception to overcome common challenges and successfully perform MDFF simulations.
Cryo-electron microscopy (cryo-EM) has been playing an increasing role in structure determination in recent years. While the most widely used method for acquiring structures of biomolecules is still X-ray crystallography, crystallization of large biomolecules, macromolecular complexes, and membrane proteins is challenging and limits the method. Cryo-EM single-particle reconstruction does not require the difficult crystallization step and allows structures to be imaged in physiological conditions, though only after rapid freezing of the samples.
X-ray crystallography and cryo-EM often reveal different levels of macromolecular structure. Generally, X-ray crystallography produces atomic-resolution structures (<~3 Å), while cryo-EM densities are resolved typically only at lower resolution (~5-30 Å). However, through recent developments in cryo-EM techniques, high-resolution (~2-5 Å) densities have been obtained [1, 2, 3, 4, 5]. Computational methods that combine information from crystallography and cryo-EM can bridge the resolution gap and hold the promise of generating physiologically accurate, atomic-resolution structures of biomolecular complexes.
Various methods that combine X-ray crystallography structures and cryo-EM densities for structure determination have been developed in recent years. Some of these methods use rigid-fragment fitting [6, 7, 8], while others such as DireX, Flex-EM, Rosetta, and FRODA perform flexible fitting, allowing conformational changes to better shape the structure to the data. Some approaches include the use of low-frequency normal modes, deformable elastic networks, and cross correlation or least-squares difference between experimental and simulated maps to drive the structure into the cryo-EM density. Some fitting methods use a Monte Carlo-based approach, while others such as the present authors’ Molecular Dynamics Flexible Fitting (MDFF) [16, 17] use molecular dynamics (MD) to match structures to multi-modal (crystallography and electron microscopy) data.
The present article concerns MDFF, which has proven to be successful in the hands of its developers as evidenced by applications to solving structural models for the ribosome [18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30], photosynthetic proteins [31, 32], myosin, chaperonins, the bacterial chemosensory array, and virus capsids [35, 36, 37], including the first all-atom structure of the HIV capsid. MDFF has played an even greater role in structural modeling efforts outside of its developer group, namely for the ribosome and its substrates [38, 39, 40, 41, 42, 43, 44, 45], the actin-myosin interface, the Mot1-TBP complex, the 26S proteasome [48, 49], and the HIV-1 virus [50, 51].
The main advantage of MDFF over other fitting methods is that MDFF produces models with improved structural geometries, due to its use of the most advanced MD force fields. Compared to other fitting methods, MDFF has also been shown to produce models with better stereochemistry, owing to the combination of these force fields with the stereochemical restraints applied during fitting. However, the emergence of high-resolution cryo-EM maps poses a challenge to MDFF, which was initially developed for low-resolution densities in the range of 10-25 Å. De novo fitting methods, such as the one employed by Rosetta, have been shown to perform better than MDFF at higher resolutions (< 5 Å). New methodologies are being developed to allow MDFF to overcome the challenges posed by high-resolution maps, as discussed in Section 6.
In this article, we will review the MDFF method and how it has been developed and applied in the years since its creation. A basic protocol will be outlined and employed to highlight specific problems and their solutions. An earlier guide covers the basic MDFF workflow; the present article reviews new features and advances introduced since the earlier publication. Details of the application of the features discussed here can also be found in several tutorials listed and discussed in Section 7.
The essence of MDFF is, given an initial all-atom structure (available through crystallography or modeling) and a corresponding cryo-EM density, to match the structure to the density by means of an MD simulation. For this purpose, the structure is first rigidly docked into the density; flexible fitting is then performed by applying to the structure, in addition to the intrinsic molecular dynamics force field potential V_MD, an external potential V_EM(r). This potential is obtained from the cryo-EM density map Φ(r) by inverting it, so that regions of high density become energy minima, clamping all values below a threshold Φ_thr to a constant to prevent fitting to noise, and scaling the result by a user-defined factor ζ that controls the strength of V_EM relative to V_MD. Specifically,
\[ V_{\mathrm{EM}}(\mathbf{r}) = \begin{cases} \zeta \left[ 1 - \dfrac{\Phi(\mathbf{r}) - \Phi_{\mathrm{thr}}}{\Phi_{\max} - \Phi_{\mathrm{thr}}} \right] & \text{if } \Phi(\mathbf{r}) \ge \Phi_{\mathrm{thr}}, \\ \zeta & \text{if } \Phi(\mathbf{r}) < \Phi_{\mathrm{thr}}, \end{cases} \]
where Φ_max is the maximum of Φ(r). In addition, an atom-dependent weight w_i, where i indexes atoms in the structure, is applied to vary the coupling from atom to atom, such that the potential experienced by atom i at position r_i is w_i V_EM(r_i). In practice, w_i is commonly set to the atom's atomic mass so that the acceleration due to MDFF forces is uniform amongst all the coupled atoms.
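The mapping from density values to the fitting potential can be illustrated with a minimal Python sketch. It assumes the map is available as a flat list of density values; the function name mdff_potential and the toy numbers are illustrative only and are not part of NAMD or VMD.

```python
def mdff_potential(phi, phi_thr, zeta=0.3):
    """Convert density values into fitting-potential values.

    phi     : flat list of density values (illustrative input format)
    phi_thr : threshold below which density is treated as noise
    zeta    : user-defined scaling factor
    """
    phi_max = max(phi)
    if phi_max <= phi_thr:
        raise ValueError("threshold must lie below the map maximum")
    span = phi_max - phi_thr
    potential = []
    for p in phi:
        if p >= phi_thr:
            # Inverted and rescaled: the highest density gives the lowest energy.
            potential.append(zeta * (1.0 - (p - phi_thr) / span))
        else:
            # Sub-threshold voxels are flattened to a constant, so they exert no force.
            potential.append(zeta)
    return potential


# Toy map: two noise-level voxels, one intermediate voxel, one density peak.
v_em = mdff_potential([0.0, 0.2, 0.6, 1.0], phi_thr=0.2, zeta=0.3)
```

In this toy case the density peak maps to a potential of zero, while both voxels at or below the threshold map to the plateau value ζ. An atom with weight w_i then experiences w_i times these values.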
MDFF produces an atomic-resolution structure in the conformation captured by the cryo-EM density by combining all-atom structural information, encapsulated in the initial structure and obtained from structure prediction algorithms or experiments like X-ray crystallography, with EM density information. The MD-based nature of MDFF allows for flexibility and sampling, while maintaining a realistic structural geometry through incorporation of the most advanced force fields in the simulation that define the potential V_MD introduced above.
At its simplest, MDFF requires, as initial data, a complete all-atom structure and the corresponding cryo-EM density map. In preparation for the MDFF simulation, the MDFF potential V_EM is calculated from the cryo-EM map, and several MDFF-specific parameters are set. These parameters include the subset of atoms within the structure to be coupled to the MDFF potential, the scaling factor ζ, and the per-atom weights w_i. The user can leave these parameters at their default settings, which couple all non-hydrogen protein or nucleic acid atoms to the potential, set the scaling factor to ζ = 0.3, and set the per-atom weights w_i equal to the atoms’ masses.
The potential energy function is defined on a 3-D grid and incorporated into the MD simulation using the gridForces feature of NAMD [55, 56]. Forces are computed from the added potential and applied (in addition to the intrinsic MD forces) to each atom depending on its position on the grid using an interpolation scheme. The computed V_EM-derived forces drive the atoms into regions of high density, producing an atomic-resolution structure in the conformation captured by the cryo-EM density. Restraints imposed during the simulation help preserve the secondary structure, stereochemical correctness, and symmetry of the protein investigated. The MDFF plugin and graphical user interface (GUI) (see Section 2.4) in VMD allow one to perform the aforementioned tasks in a user-friendly manner to set up an MDFF simulation in NAMD.
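How a grid-based potential gives rise to a force at an arbitrary atomic position can be sketched as follows. The sketch substitutes plain trilinear interpolation and a central-difference gradient for the interpolation scheme actually used by gridForces, which is not reproduced here; the grid layout (unit spacing, nested lists) and all function names are assumptions made for illustration.

```python
def trilinear(grid, x, y, z):
    """Interpolate grid[i][j][k] (unit spacing) at a fractional position."""
    i, j, k = int(x), int(y), int(z)
    fx, fy, fz = x - i, y - j, z - k
    value = 0.0
    # Weighted sum over the eight corners of the enclosing grid cell.
    for di, wx in ((0, 1.0 - fx), (1, fx)):
        for dj, wy in ((0, 1.0 - fy), (1, fy)):
            for dk, wz in ((0, 1.0 - fz), (1, fz)):
                value += wx * wy * wz * grid[i + di][j + dj][k + dk]
    return value


def grid_force(grid, pos, weight=1.0, h=1e-4):
    """Force -w * grad(V) at pos, from central differences of the interpolated potential."""
    force = []
    for axis in range(3):
        plus, minus = list(pos), list(pos)
        plus[axis] += h
        minus[axis] -= h
        dv = trilinear(grid, *plus) - trilinear(grid, *minus)
        force.append(-weight * dv / (2.0 * h))
    return force


# Toy potential that decreases along x, i.e. density increases along x.
grid = [[[0.3 - 0.1 * i for _ in range(3)] for _ in range(3)] for i in range(3)]
force = grid_force(grid, (0.5, 0.5, 0.5), weight=12.011)  # weight set to a carbon mass
```

The resulting force points along +x, toward lower potential and therefore toward higher density, and is proportional to the per-atom weight.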
A thorough description of the basic MDFF technique is given in [16, 17]. Since its inception, MDFF has undergone further developments to address challenging cases. Application of MDFF in such cases is the main focus of the present publication. The aim is to help MDFF users recognize difficulties that may arise in their own demanding structure analysis tasks; available solutions to these difficulties are outlined.
An important step in any modeling workflow, including MDFF, is checking the stereochemistry of the structure for errors. This step is important both at the beginning of modeling, before MDFF is used, to correct errors before they have the potential to propagate, and after the use of MDFF, to ensure the validity of the final fitted structure. One class of stereochemical errors involves the two enantiomer forms (D-/L-) of amino acids around the chiral centers of their Cα (except for glycine) and Cβ (for threonine and isoleucine only) atoms. Most naturally occurring amino acids are found in the L- configuration; however, some D-amino acids exist, e.g., in bacterial cell walls [59, 60, 61]. Another class of stereochemical errors involves the conformation of the peptide bond between two amino acids. The dihedral angle ω defined by the atoms Cα(n), C(n), N(n+1), and Cα(n+1) of consecutive residues n and n+1 distinguishes between the cis (ω ≈ 0°) and trans (ω ≈ 180°) isomers. The trans isomer is energetically more stable and, therefore, is the one more commonly found in nature, except in some cases, e.g., peptide bonds before a proline residue [63, 64].
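The cis/trans distinction can be made concrete by computing ω directly from the four backbone atom positions. The following Python sketch is a generic dihedral calculation and not the implementation of the cispeptide plugin; the 30° tolerance used for classification is an arbitrary assumption chosen for illustration.

```python
import math


def _sub(a, b):
    return [a[i] - b[i] for i in range(3)]


def _dot(a, b):
    return sum(a[i] * b[i] for i in range(3))


def _cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def dihedral(p0, p1, p2, p3):
    """Dihedral angle in degrees defined by four points."""
    b0, b1, b2 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b0, b1), _cross(b1, b2)        # normals of the two planes
    norm_b1 = math.sqrt(_dot(b1, b1))
    m1 = _cross(n1, [c / norm_b1 for c in b1])
    return math.degrees(math.atan2(_dot(m1, n2), _dot(n1, n2)))


def classify_peptide_bond(ca_n, c_n, n_next, ca_next, tolerance=30.0):
    """Label a peptide bond from |omega|; the tolerance is an illustrative choice."""
    omega = abs(dihedral(ca_n, c_n, n_next, ca_next))
    if omega <= tolerance:
        return "cis"
    if omega >= 180.0 - tolerance:
        return "trans"
    return "distorted"


# Idealized planar geometries: the two C-alpha atoms on opposite or the same side.
trans = classify_peptide_bond([-1.0, 1.0, 0.0], [0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.0, -1.0, 0.0])
cis = classify_peptide_bond([-1.0, 1.0, 0.0], [0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.0, 1.0, 0.0])
```

Only the magnitude of ω is used for classification, so the result does not depend on the sign convention of the dihedral.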
Assignment of the incorrect form for the chiral center or peptide bond configuration results in stereochemical errors. Unfortunately, there has been an increasing occurrence of such errors, particularly non-proline cis-peptide bonds, found in X-ray crystal structures (especially ones solved at low resolution). Not only can unchecked errors in the initial structure propagate during MDFF, but they can also hide worse errors, such as an incorrect fit to the density, which are not always found through other measures such as Ramachandran outliers.
VMD provides plugins, namely cispeptide and chirality, which can be used to detect, visualize, and correct stereochemical errors. A new plugin released in VMD 1.9.3, TorsionPlot (Fig. 1), can similarly be used for marginal and outlier Ramachandran angles. These plugins allow users to easily find errors in their structures and quickly fix them, using short MD simulations to properly equilibrate the structure and improve the geometry, or in more difficult cases, using interactive MD (Section 4) to correct the conformation of the errant regions. Once the errors have been corrected, restraints can be defined for the peptide bond and chirality configurations, as discussed in Section 2.2. Since VMD version 1.9.3, the cispeptide, chirality, and TorsionPlot plugins can work with the newly developed MDFF Graphical User Interface, discussed in Section 2.4.
During MDFF, various restraints can be applied to the system, primarily to avoid overfitting the structure to noise in the density map. The most common restraints are applied to a set of internal coordinates relevant to the secondary structure of the macromolecule in its initial conformation, using the ssrestraints plugin of VMD [16, 17]. For proteins, restraints are applied to the ϕ and ψ dihedral angles and hydrogen bonds involving backbone atoms of amino acid residues in helices and β-sheets. For nucleic acids, dihedral angles and interatomic distances between base pairs are restrained. The strength of these restraints can be adjusted during the fitting process. For example, the strength can be decreased as the simulation progresses and large-scale motions have stopped, making way for local refinement of the structure.
A change in isomerism or chirality, as discussed in Section 2.1, during equilibrium simulations at room temperature is unlikely due to a high energy barrier (e.g., 21 kcal/mol in CHARMM22). However, the fitting forces derived from the potential V_EM that are applied during MDFF, especially during initial structure optimization, or other refinement schemes involving high-temperature simulated annealing, may be strong enough to introduce stereochemical errors into a structure. To avoid such errors, harmonic restraints can be applied to enforce the original isomerism of the peptide bond between residues or the chiral center located at the Cα and Cβ of certain residues using the cispeptide and chirality plugins, respectively, of VMD.
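The effect of a harmonic restraint on a peptide-bond dihedral can be sketched in a few lines of Python. The functional form (one half of the force constant times the squared deviation) and the force constant are generic assumptions for illustration and do not reproduce the restraint files written by the cispeptide plugin. The one detail worth showing is that the deviation must be evaluated on the periodic angle, so that ω = −175° is treated as 5° away from a 180° reference rather than 355° away.

```python
def wrapped_difference(angle, reference):
    """Smallest signed difference between two angles, in degrees."""
    return (angle - reference + 180.0) % 360.0 - 180.0


def dihedral_restraint_energy(omega, omega_ref, k):
    """Harmonic penalty 0.5 * k * delta**2; k is in energy per degree squared (assumed form)."""
    delta = wrapped_difference(omega, omega_ref)
    return 0.5 * k * delta ** 2


# A trans bond that has drifted slightly, and one that has flipped to cis.
small_drift = dihedral_restraint_energy(-175.0, 180.0, k=0.1)
flipped = dihedral_restraint_energy(0.0, 180.0, k=0.1)
```

The penalty grows quadratically with the deviation, so a bond that has flipped from trans to cis is penalized far more strongly than one that fluctuates around its reference value.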
While some form of secondary structure, chirality, and cis-peptide restraints should be used for most MDFF simulations, there are other restraints available which are application-dependent. Domain restraints can be used to maintain non-overlapping, rigid, user-defined domains of a system during MDFF simulations with the Targeted MD (TMD) [67, 68] feature of NAMD. In TMD, a subset of atoms in the simulation is guided towards a final ‘target’ structure by means of steering forces. At each timestep, the root mean-square (RMS) distance between the current coordinates and the target structure is computed (after first aligning the target structure to the current coordinates). The force on each atom is given by the gradient of the potential derived from the RMS distance, such that the atoms are driven towards the selected target. For use as domain restraints, the target coordinates are set to the initial coordinates, thus keeping the domain rigid. Although the restraints keep each domain in the system rigid, they allow for flexibility between domains. Domain restraints are useful when fitting structures that undergo large-scale conformational changes with a ‘hinge-bending’ or ratcheting (as in the case of ribosome translocation) motion where the local structure remains otherwise unchanged (Fig. 2).
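A rough Python sketch of an RMS-distance-based restraint of this kind is given below. It assumes a harmonic potential on the RMS distance with prefactor k/(2N), and it omits the alignment step by assuming the target has already been superimposed on the current coordinates; both are simplifications made for illustration, and the function name is hypothetical.

```python
import math


def tmd_energy_and_forces(coords, target, k, rms_target=0.0):
    """Energy and per-atom forces for U = (k / 2N) * (RMS - rms_target)**2.

    coords, target : lists of [x, y, z]; target is assumed to be pre-aligned.
    rms_target = 0 corresponds to holding a domain at its reference geometry.
    """
    n = len(coords)
    diffs = [[c[d] - t[d] for d in range(3)] for c, t in zip(coords, target)]
    rms = math.sqrt(sum(x * x for diff in diffs for x in diff) / n)
    energy = 0.5 * (k / n) * (rms - rms_target) ** 2
    if rms == 0.0:
        # Already on target: no restoring force.
        return energy, [[0.0, 0.0, 0.0] for _ in coords]
    # dRMS/dr_i = (r_i - t_i) / (N * RMS); the force is minus the gradient of U.
    scale = -(k / n) * (rms - rms_target) / (n * rms)
    forces = [[scale * x for x in diff] for diff in diffs]
    return energy, forces


# One atom displaced by 1 unit from its reference position, one atom in place.
energy, forces = tmd_energy_and_forces(
    [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]],
    [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]],
    k=200.0,
)
```

With the target set to the initial coordinates of a domain, only internal distortions of that domain are penalized once alignment is included, which is why the domains remain free to move relative to each other.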
The quality of the final model in any fitting method, including MDFF, is highly dependent on the quality of the initial model and the cryo-EM density map. The lower the resolution of a map, the less information it contains and, thus, the more degrees of freedom are left unconstrained in the fitting procedure. Incorporating any additional knowledge of the system into the fitting procedure can reduce the search degrees of freedom and provide better results. One such structural trait to exploit, which is common in many biological systems, is non-crystallographic symmetry. Many large biological macromolecules have inherent structural symmetry, being composed of a few distinct subunits, repeated in a symmetric array. Examples are the poliovirus exhibiting icosahedral symmetry or potassium channels exhibiting 4-fold symmetry.
In MDFF, symmetry restraints are available to make use of symmetry information. These restraints are calculated by overlapping selected symmetric subunits, calculating an average structure from them, and applying harmonic forces on the atoms restraining them to that average structure. Similar to the other restraints, the strength of the forces can be adjusted and either be set constant or made to increase linearly with time. Increasing the strength only gradually allows for greater flexibility to explore the conformational space of the density map initially before converging to a symmetric structure. In addition to improving the quality of fit and agreement between subunits in low-resolution maps, such as in the case of the bacterial chemosensory array, symmetry restraints can also be used to prevent edge-distortion effects (Fig. 2). Edge-distortion effects occur when fitting a smaller number of subunits into a much larger EM density. Because the density represents more subunits than are actually being simulated, some subunits will be adjacent to empty density belonging to a subunit not represented in the limited simulation. Such subunits at the edges of the system can be pulled into the adjacent empty density and become distorted if the neighboring subunits, actually present in the real system, are not represented through symmetry restraints. The problem of missing subunits beyond the limited simulation geometry could be solved by cutting the density map to only the regions around the simulated structure; however, such a cutting procedure is prone to bias. Another solution is to simulate the entire system, but this solution can be computationally costly and unnecessary, as a small number of subunits could be sufficient to model the desired properties. For example, in a nitrilase from R. rhodochrous J1 [72, 57], a two-turn helix model is sufficient for determining how interdimer interactions give rise to the stability of a spiral structure. Therefore, only two turns of the helix model needed to be simulated while using symmetry restraints to prevent the edge subunits from distortion.
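The overlap, average, and restrain steps of the symmetry restraints can be sketched in Python for the simple case of an n-fold rotation about the z axis, such as the 4-fold symmetry of a potassium channel. The sketch assumes that the symmetry operations are known exactly and that all subunits list their atoms in the same order; it is not the implementation used in NAMD, and all names and numbers are illustrative.

```python
import math


def rotate_z(p, angle_deg):
    """Rotate point p about the z axis by angle_deg degrees."""
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    return [c * p[0] - s * p[1], s * p[0] + c * p[1], p[2]]


def ramped_k(step, total_steps, k_max):
    """Force constant that increases linearly with time up to k_max."""
    return k_max * min(1.0, step / total_steps)


def symmetry_forces(subunits, fold, k):
    """Harmonic forces pulling each subunit toward the symmetrized average."""
    angle = 360.0 / fold
    n_sub, n_atoms = len(subunits), len(subunits[0])
    # 1. Overlap: bring every subunit into the frame of subunit 0.
    overlapped = [[rotate_z(atom, -angle * s) for atom in sub]
                  for s, sub in enumerate(subunits)]
    # 2. Average the overlapped copies atom by atom.
    average = [[sum(ov[a][d] for ov in overlapped) / n_sub for d in range(3)]
               for a in range(n_atoms)]
    # 3. Map the average back to each subunit and apply F = -k (r - r_ideal).
    forces = []
    for s, sub in enumerate(subunits):
        ideal = [rotate_z(atom, angle * s) for atom in average]
        forces.append([[-k * (sub[a][d] - ideal[a][d]) for d in range(3)]
                       for a in range(n_atoms)])
    return forces


# Four one-atom "subunits"; the first has been pulled outward from radius 1.0 to 1.4.
subunits = [[[1.4, 0.0, 0.0]], [[0.0, 1.0, 0.0]], [[-1.0, 0.0, 0.0]], [[0.0, -1.0, 0.0]]]
forces = symmetry_forces(subunits, fold=4, k=ramped_k(step=500, total_steps=1000, k_max=20.0))
```

In this example the distorted subunit is pulled back inward, while the other three are pulled slightly outward toward the common average radius of 1.1. Because the force constant is ramped, the same deviation is penalized only half as strongly midway through the run as at the end.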
Due to the aqueous environment of the cell in which many biomolecules are found, properly describing the solvent effect on the solutes is an important step in any MD simulation, including MDFF. In the gold standard for accuracy, the explicit solvent method, individual water molecules are represented by atomic models such as TIP3P. While accurate, inclusion of explicit solvent in an MD simulation is computationally demanding because the interactions for all of the solvent molecules must be calculated in addition to those computed for the solute (e.g., proteins, nucleic acids, and ligands). Furthermore, explicit solvent introduces viscous drag, which slows the movement of the solute in the system, causing MDFF simulations to require more simulation time to produce a final fitted structure. Therefore, MDFF simulations are often performed in vacuum, i.e., without any solvent.
While simulations in vacuum are much faster than those in explicit solvent, they are less accurate, especially for solvent-accessible regions on the exterior of a structure. However, the constraints imposed by the cryo-EM-derived potential and the additional restraints (e.g., secondary structure) prevent the structure from becoming excessively deformed, so many initial MDFF simulations can still be run in a vacuum environment.
One compromise between the accuracy of the explicit solvent model and the speed of vacuum simulations is the use of an implicit solvent model. MDFF can use the generalized Born implicit solvent (GBIS) model of NAMD, a fast approximation for calculating the electrostatic interaction between atoms in a dielectric environment described by the Poisson-Boltzmann equation. Not only is this approximation faster to compute than explicit water interactions, but the GBIS algorithms in NAMD are also parallelized for CPUs and GPUs for improved performance. While not as fast as vacuum simulations, MDFF with GBIS produces more accurate final models in approximately the same simulation time due to the absence of viscous drag effects (implicit solvents form a continuum through which model components move as easily as through vacuum). In general, it is advisable to use at least the implicit solvent model, especially in cases where significant interactions with solvent may affect the shape of the macromolecule, as in the case of the 26S proteasome.
MDFF simulations are set up and analyzed using many different tools in VMD. Many of these tools are plugins (e.g., mdff, cispeptide, and chirality) accessed through VMD's command line interface, the TkConsole. Since VMD version 1.9.2, a new graphical user interface (GUI) for MDFF (Fig. 3) is available, providing a unified gateway for quicker and easier access to all plugins used by MDFF. The GUI, found in the ‘Modeling’ section of VMD's ‘Extensions’ menu, is divided into several sections pertaining to different aspects of MDFF. Both MDFF and xMDFF (Section 3.1) simulations can be set up with a variety of user-defined parameters and all required NAMD files (e.g., restraints and configuration files) are automatically generated. Additionally, the MDFF GUI can be used to set up, run, connect to, and analyze interactive MDFF and xMDFF (Section 3.1) simulations (Section 4). The ‘IMDFF Connect’ tab of the MDFF GUI (Fig. 3B) can be used to monitor the real-time quality of fit of a user-defined selection of the structure, using new efficient cross correlation algorithms (Section 5). This instant feedback on whether or not the structure is fitting well to the target density helps guide a user in manually applying interactive steering forces during the simulations. Since VMD version 1.9.3, the MDFF GUI works in conjunction with the cispeptide, chirality, and TorsionPlot structure checking plugins (Section 2.1). The structure checking plugins can be used to analyze the structure for errors, and then information on the selected residue can be sent directly to the MDFF GUI for automatically setting up a simulation to fix the error.
Successful structural modeling by MDFF depends on the quality of the initial model. If no structural model from experiment is available, or only an incomplete one, computational approaches must be employed to obtain and refine the missing structural information. Several options for building initial models are laid out in this section.
Investigating the structure of large biomolecular complexes poses a serious challenge to traditional crystallography techniques. Inherent flexibility of large systems and the presence of disordered solvent, lipids, or ligands often cause crystals to diffract at low resolution. Even overall high-resolution X-ray data may contain low-resolution regions that remain unresolved. When low-resolution X-ray data are available for the starting structure, a variant of MDFF for low-resolution X-ray crystallography (xMDFF) can be used to refine the structure and possibly fill in missing pieces. For use with low-resolution X-ray crystallography, the MDFF protocol is modified to work with model-phased densities from the X-ray crystallography program PHENIX [77, 78], which are computed using the phases ϕ from a tentative model and the amplitudes |F| from the X-ray diffraction data. Next, the tentative model is flexibly fitted into the electron density map using MDFF. The MDFF-fitted structure provides new phases which, together with the experimental diffraction amplitudes, are used to regenerate the electron density. The fitted structure is then used as an updated model to be driven into the new density map. This process is repeated iteratively until R_work and R_free reach a minimum or become lower than a predefined tolerance.
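The iterative structure of this protocol can be summarized in a short Python sketch. The three callables (make_density, flexible_fit, r_free) are hypothetical placeholders standing in for the PHENIX map calculation, the MDFF fitting run, and the R-factor evaluation; they do not correspond to real APIs. For brevity the sketch monitors only a single R value, whereas the protocol described above tracks both R_work and R_free.

```python
def xmdff_refine(model, amplitudes, make_density, flexible_fit, r_free,
                 tol=0.25, max_iter=20):
    """Iterate map generation and flexible fitting until R stops improving.

    make_density(model, amplitudes) -> density from model phases + experimental amplitudes
    flexible_fit(model, density)    -> model fitted into that density
    r_free(model, amplitudes)       -> agreement factor (lower is better)
    All three are user-supplied placeholders (hypothetical interfaces).
    """
    best_model = model
    best_r = r_free(model, amplitudes)
    history = [best_r]
    for _ in range(max_iter):
        if best_r <= tol:
            break                      # predefined tolerance reached
        density = make_density(best_model, amplitudes)
        candidate = flexible_fit(best_model, density)
        r = r_free(candidate, amplitudes)
        history.append(r)
        if r >= best_r:
            break                      # no further improvement: minimum reached
        best_model, best_r = candidate, r
    return best_model, history


# Toy stand-ins: the "model" is a single number and the data point is 1.0.
refined, history = xmdff_refine(
    model=0.0,
    amplitudes=1.0,
    make_density=lambda m, a: 0.5 * (m + a),   # map biased halfway toward the current model
    flexible_fit=lambda m, d: d,               # fitting moves the model onto the map
    r_free=lambda m, a: abs(a - m),
    tol=0.1,
)
```

The toy stand-ins mimic the key property of model-phased maps: each map is biased toward the current model, so the fitted structure approaches the data only gradually, one iteration at a time.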
xMDFF has been successfully applied to solve the structure of a voltage-sensing protein, Ci-VSP. In this case, xMDFF was used to resolve uncertainty in the low-resolution (3.6, 4, and 7 Å) data in regard to the placement of the important S4 helix involved in the conformational change required for the function of the protein. xMDFF has also been extended to small-molecule crystallography, a field not typically explored by macromolecular crystallography programs. Nevertheless, when small molecules come together in multi-molecular assemblies, their structural characteristics closely resemble those of biomolecules and, thus, can greatly benefit from macromolecular refinement techniques such as xMDFF. A small abiological molecule, cyanostar, exhibited whole-molecule disorder and pushed the limits of small-molecule crystallography. However, a hybrid xMDFF-PHENIX approach was able to successfully refine the cyanostar structure and identify multiple conformations contained within the crystal. Importantly, a key contribution to the success of the project was the development of accurate force field parameters using the force field toolkit (ffTK) plugin in VMD.
If no X-ray or NMR data are available, structure prediction methods can be used to generate the initial model. Homology modeling, which takes direct advantage of data stored in the PDB, is the most widely employed approach for structure prediction. The application of homology modeling techniques requires, as a prerequisite, the availability of at least one protein structure of similar amino acid sequence. Working under the generalization that structural homology is highly correlated with sequence homology, models are constructed based on alignment of the two sequences and mapping of the homologous template structure to fill the missing regions of the target structure. The quality of the models produced by these means is then evaluated based on specific structural and energetic criteria to eliminate erroneous results. Various programs suitable for comparative modeling are available, such as MODELLER, SWISS-MODEL, I-TASSER, and MUFOLD. In particular, MUFOLD was used to supply an initial model for refinement with xMDFF (Section 3.1) to obtain the structure of a voltage-sensing protein.
An alternative approach, which also uses knowledge from the PDB, involves building a local fragment library based on the target amino acid residue sequence. Implemented in Rosetta, this strategy uses a Monte Carlo method guided by a knowledge-based scoring function to exchange and place the fragments into a partial model. Many of the structures deposited in the PDB have missing coordinates for inserted and terminal regions, as they are usually highly flexible or disordered. It has been shown that MDFF can be used to improve sampling and overcome conformational traps that can befall Rosetta by combining Rosetta and MDFF in an iterative protocol.
In certain situations, automated protocols using MDFF fail to produce a final fitted model with a good quality of fit to the cryo-EM density. For example, an initial model may be required to undergo a large conformational change during the course of the fitting, which will take certain parts of the structure through regions of the density to which they do not belong. The forces computed from the cryo-EM-derived potential attract atoms to density indiscriminately, because the potential carries no contextual information about which part of the structure belongs to which region of the map. As a result, the structure will try to fit to those incorrect regions of the density and can become trapped there. To avoid this problem, MDFF can be run interactively.
The interactive feature allows a user to manipulate the target structure during the MDFF simulation by manually pulling it to the desired regions of density [90, 22]. Forces are applied to selected atoms or residues using a VMD session that is connected to the running MDFF simulation in NAMD. After selecting the atom or residue in the VMD display window, a user drags the cursor in the desired direction to move the selected component. A force acting on the selected component in the direction denoted by the user with a magnitude relative to the distance the cursor moved is then added to the MD simulation.
Density maps may also contain noisy, washed-out regions in which proper placement of the structure may be ambiguous. Interactive MDFF can be employed to integrate user expertise into the fitting process for ambiguous cases where automated MDFF may fail. For example, interactive MDFF has been used in modeling the ribosome to remove clashes amongst protein and RNA atoms. Particularly helpful in such cases is a new parallel implementation of cross-correlation analysis in VMD (Section 5) that provides a real-time quality-of-fit estimate during interactive simulations (Fig. 3B). A new MDFF graphical user interface released in VMD 1.9.2 (Section 2.4) makes setting up, running, and analyzing MDFF simulations for interactive usage easy.
An important step in any hybrid fitting method like MDFF is the evaluation of the final structure. The quality of the structure itself is commonly measured using MolProbity, which assesses structure quality based on various statistics such as Ramachandran and rotamer outliers and steric clashes. Another important factor in judging the quality of an MDFF-derived structure is its fit to the cryo-EM density. One of the most common scoring methods in this regard is the cross correlation coefficient between the experimental density map and a density map calculated from the fitted structure.
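As a minimal illustration, the cross correlation coefficient can be computed as the Pearson correlation between the voxel values of the two maps, either over the whole map or over the voxels assigned to a given part of the structure. The Python sketch below assumes both maps are flat lists sampled on the same grid and that the assignment of voxels to regions is already known; it is not the parallel implementation available in VMD, and the toy values are invented for illustration.

```python
import math


def cross_correlation(exp_map, sim_map):
    """Pearson correlation between two maps given as equal-length flat lists."""
    n = len(exp_map)
    mean_e = sum(exp_map) / n
    mean_s = sum(sim_map) / n
    cov = sum((e - mean_e) * (s - mean_s) for e, s in zip(exp_map, sim_map))
    var_e = sum((e - mean_e) ** 2 for e in exp_map)
    var_s = sum((s - mean_s) ** 2 for s in sim_map)
    return cov / math.sqrt(var_e * var_s)


def local_cross_correlation(exp_map, sim_map, regions):
    """Correlation restricted to the voxel indices listed for each named region."""
    return {
        name: cross_correlation([exp_map[i] for i in idx], [sim_map[i] for i in idx])
        for name, idx in regions.items()
    }


# Toy maps: the first four voxels agree perfectly, the last four are anti-correlated.
experimental = [0.0, 4.0, 8.0, 12.0, 5.0, 6.0, 5.0, 6.0]
simulated = [0.0, 4.0, 8.0, 12.0, 6.0, 5.0, 6.0, 5.0]
global_ccc = cross_correlation(experimental, simulated)
local_ccc = local_cross_correlation(
    experimental, simulated, {"helix": [0, 1, 2, 3], "loop": [4, 5, 6, 7]}
)
```

In this toy case the global coefficient is about 0.98 even though the second region is perfectly anti-correlated, which shows how a single global value can mask a locally poor fit.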
A single global correlation value for the entire protein can be calculated, giving a coarse overall evaluation of the model. However, global cross correlation analysis is prone to producing false-positive information and the result is inherently degenerate: two different structures fitted into the same density map can produce the same cross-correlation coefficient (ccc). Even though the ccc can be close to 1 (perfect fit), the fitted structure could be locally distorted. Instead of just using the global ccc, it is often useful to calculate the correlation at a finer decomposition, for example per residue, revealing which parts of the structure are fitted well and which parts might require additional fitting or investigation. Fast parallel CPU and GPU algorithms make the routine computation of local ccc feasible, even for large structures and long fitting trajectories. The Timeline analysis plugin of VMD (Fig. 4) can be used to quickly calculate and visualize local cross correlations of MDFF trajectories. In the default analysis scheme, the structure is split into contiguous sections of secondary structure which are used to calculate local cross correlations for each section independently. The heatmap-style 2-D matrix plot provided by Timeline displays the time dimension horizontally and the structure-component dimension (e.g., residues) vertically. The plot is zoomable and connected interactively to the VMD display window, so that when a specific residue at a specific frame is selected, VMD displays the neighborhood of the residue in this frame and permits any VMD-style viewing of the structure. The features described allow users to quickly inspect the correlation analysis to locate poorly fitted parts of the structure. Additionally, Fourier shell correlation (FSC), which is often used to calculate the resolution of a cryo-EM density, can also be used to judge the quality of a model-to-map fit. In the case of the gold standard method of FSC, which utilizes two independent half maps, a cross-validation protocol can be followed during model refinement [95, 96] in order to identify possible overfitting. This protocol involves fitting a model to one half map while calculating the real-space or Fourier shell correlations with respect to the other half map, in a manner that is similar to the R_free concept in crystallography.
There are many different strategies that can be employed to fix poorly fitting regions of a structure. Such strategies include varying simulation parameters such as lowering the scaling factor or increasing the temperature to improve sampling, using a different fitting methodology such as interactive MDFF or others discussed in the preceding sections, and selectively fitting regions of low local cross correlation while restraining other regions. However, the user is cautioned that poor fitting may be the result of missing or poorly resolved densities in the map, which limits the refinement of the structure.
Even though high-resolution cryo-EM data are becoming more readily obtainable, resolution is not always uniform throughout a map. Flexible regions of the structure still produce local resolutions lower than that of the overall map, as evaluated with tools such as ResMap. MDFF protocols can be adjusted to account for such local variations and better inform the process of model validation. For example, if ResMap analysis shows that certain residues reside in low-resolution regions of the density, the per-atom weighting factor applied to the forces derived from the density can be lowered. Adjusting to a lower weight reduces the strength of the coupling to the map, allowing for greater flexibility and improved conformational sampling of the low-resolution region of density. Local resolution analysis can be especially important for determining the parts of a high-resolution map that realistically contain side chain information and the parts that do not, preventing over-interpretation of the latter.
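One conceivable way to turn a local resolution estimate into per-atom weights is sketched below in Python. The scaling rule (mass multiplied by the ratio of the nominal map resolution to the local resolution, capped at one) is a hypothetical choice made purely for illustration; it is not a rule prescribed by MDFF or ResMap, and in practice the reduction would be chosen by the user.

```python
def resolution_scaled_weights(masses, local_resolution, map_resolution):
    """Down-weight atoms that sit in regions resolved worse than the map overall.

    masses           : default per-atom weights (atomic masses)
    local_resolution : per-atom local resolution in angstroms (larger = worse)
    map_resolution   : nominal resolution of the whole map in angstroms
    The scaling rule is a hypothetical example, not an MDFF default.
    """
    weights = []
    for mass, res in zip(masses, local_resolution):
        scale = min(1.0, map_resolution / res)   # 1.0 where local resolution is as good as nominal
        weights.append(mass * scale)
    return weights


# A well-resolved core atom and an atom in a flexible, poorly resolved loop.
weights = resolution_scaled_weights([12.0, 14.0], [3.5, 7.0], map_resolution=3.5)
```

Here the core atom keeps its full mass as weight, while the loop atom is coupled to the map at half strength, leaving it freer to sample conformations within the blurred density.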
MDFF is a simple and intuitive, yet powerful, method for fitting structures to EM maps while respecting inter-atomic interactions which determine local structure. In this article, we have laid out characteristics of structures and density maps that users must be mindful of in practical applications of MDFF. The corresponding strategies provided are by no means exhaustive and further development of MDFF will likely improve the accuracy and applicability of the method.
As a tool for combining structural and density information, MDFF can easily be adapted to address a variety of circumstances faced in hybrid structure analysis. A large part of MDFF's versatility is inherited from known MD techniques (restraints, solvent models, interactive MD) as well as traditional modelling approaches (homology, de novo, structure prediction). Thus, developments in MD and molecular modelling in general present opportunities to improve MDFF to suit the needs of the structural biology community.
Future development in MDFF will also proceed in tandem with advances in the cryo-EM field. In particular, the increasing resolution of EM maps presents a challenge to MDFF. At the time of MDFF's inception, cryo-EM maps were typically in the low-resolution range of 10-25 Å. Advances in the cryo-EM field over the years have led to dramatic increases in resolution. Most recently, high-resolution EM maps (< 5 Å) have emerged [1, 2, 3, 4, 5], placing cryo-EM technology alongside X-ray crystallography at the forefront of molecular structure determination. For example, cryo-EM maps of TRPV1 (3.4 Å) and β-galactosidase [2, 3] (3.2 and 2.2 Å) have been resolved and used to construct atomic structures de novo.
The emergence of high-resolution cryo-EM maps poses a challenge to MDFF. If maps are of sufficiently high resolution to enable the construction of accurate atomic structures de novo [99, 100, 101], then structure refinement techniques like MDFF would become unnecessary for those maps. However, producing high-resolution maps remains a difficult undertaking, and the resolutions of maps will, for the foreseeable future, continue to fall within a broad spectrum, only a fraction of which is amenable to direct interpretation in terms of atomic-resolution structures. Furthermore, EM maps, including high-resolution ones, typically do not have a uniform local resolution throughout the subject molecule, so low-resolution regions, such as flexible exterior segments, will still require refinement by means of MDFF or other methods.
An adaptation of MDFF for high-resolution EM maps is presently being developed. The protocol for fitting to high-resolution maps involves applying a low-pass or Gaussian filter to the cryo-EM density map to smooth the resulting potential energy function that acts in the MDFF simulation, as previously proposed. Such a filter removes the steep wells, commonly found in high-resolution densities, in which the structure can become trapped. A structure fit to a smoothed density can then be used in a subsequent MDFF simulation with the original, high-resolution map for further refinement. The ease of conceptualizing and implementing such adaptations is an important feature that underpins the usefulness of MDFF.
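The smoothing step can be illustrated with a small, dependency-free Python sketch of a separable 3D Gaussian filter. It assumes the density is stored as nested lists indexed `grid[x][y][z]`, that `sigma` is given in voxel units, and that values beyond the grid edge are taken from the nearest edge voxel. The function names are hypothetical, and a production workflow would normally use an established map-processing tool instead.

```python
import math


def gaussian_kernel(sigma, truncate=3.0):
    """Normalized 1D Gaussian kernel; sigma is in voxel units."""
    radius = max(1, int(truncate * sigma + 0.5))
    weights = [math.exp(-0.5 * (i / sigma) ** 2)
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth_1d(values, kernel):
    """Convolve one line of voxels, clamping indices at the edges."""
    radius = len(kernel) // 2
    n = len(values)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)
            acc += w * values[j]
        out.append(acc)
    return out


def smooth_density(grid, sigma):
    """Gaussian-smooth a 3D density by filtering along z, y, then x.

    The input grid is left unmodified; a new nested list is returned.
    """
    kernel = gaussian_kernel(sigma)
    nx, ny, nz = len(grid), len(grid[0]), len(grid[0][0])
    out = [[list(row) for row in plane] for plane in grid]
    for x in range(nx):                      # filter along z
        for y in range(ny):
            out[x][y] = smooth_1d(out[x][y], kernel)
    for x in range(nx):                      # filter along y
        for z in range(nz):
            line = smooth_1d([out[x][y][z] for y in range(ny)], kernel)
            for y in range(ny):
                out[x][y][z] = line[y]
    for y in range(ny):                      # filter along x
        for z in range(nz):
            line = smooth_1d([out[x][y][z] for x in range(nx)], kernel)
            for x in range(nx):
                out[x][y][z] = line[x]
    return out
```

In a staged protocol along the lines described above, a structure would first be fitted to the output of `smooth_density` and the result then refined against the original map. A wider `sigma` gives a smoother potential at the cost of detail.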
Instruction on the practical usage of MDFF and all of the features discussed in this article is available in a series of tutorials. The basic MDFF method (Section 2) in vacuum and in explicit solvent (Section 2.3), application of restraints (Section 2.2), the MDFF graphical user interface (Section 2.4), interactive MDFF (Section 4), xMDFF (Section 3.1), and Timeline analysis (Section 5) are covered in the tutorial found at http://www.ks.uiuc.edu/Training/Tutorials/science/mdff/tutorial_mdff-html/. The tutorial for structure checking (Section 2.1) can be found at http://www.ks.uiuc.edu/Training/Tutorials/science/structurecheck/tutorial_structurecheck-html/. The tutorial for the combined use of ROSETTA and MDFF (Section 3.2) to build complete initial models can be found at http://www.ks.uiuc.edu/Training/Tutorials/science/rosetta-mdff/rosetta-mdff-tutorial-html/. Additional tutorials covering more general topics including use of NAMD, VMD, and Timeline can be found at http://www.ks.uiuc.edu/Training/Tutorials/.
This work has been supported by grants NIH 9P41GM104601, NIH 5R01GM098243-02, and NIH U54GM087519 from the National Institutes of Health. The authors also acknowledge the Beckman Postdoctoral Fellowship program supporting A. Singharoy.