Methods for NMR structure determination typically rely on obtaining resonance assignments by establishing correlations between neighboring atoms, followed by measuring a series of restraints (e.g. distances, orientations) for each assigned site, which are then used in structure determination by simulated annealing. In recent years, methods for simultaneous assignment and structure refinement (SASR) have been developed for both solution NMR and solid-state NMR.
For solution NMR, several such methods rely on backbone residual dipolar couplings (RDCs) measured from weakly aligned samples, in combination with backbone chemical shifts, to define and connect structured fragments of a protein in a sequence-specific manner [1], to obtain backbone resonance assignments from a known protein structure [2], or to determine the three-dimensional arrangement of protein-protein complexes from the pre-determined structures of the individual components [3,4]. Alternatively, it is possible to generate low-resolution structures of globular proteins by fitting unassigned NMR data (e.g. chemical shifts, NOEs, RDCs) to computationally predicted structural models using a Monte Carlo procedure [5]. Finally, methods have been developed to compute realistic spatial proton distributions for proteins in solution, solely from experimental NOE data with minimal assignments [6-8].
For solid-state NMR, the direct correlation between protein structure and the orientation-dependent dipolar coupling (DC) and chemical shift anisotropy (CSA) frequencies, measured in samples with uniaxial order [9-11], provides a method for SASR based on minimizing the difference between the experimentally observed spectral frequencies and the frequencies back-calculated from a structural model. Because such solid-state NMR spectra display full, or near full, magnitudes of the DC and CSA, the order tensor is known a priori, and their interpretation is significantly facilitated. The SASR approach relieves the burden of having to obtain near-complete resonance assignments prior to structure determination: resonance assignments are obtained as a by-product of fitting a structural model to the NMR data, but are not a prerequisite for structure determination.
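One way to write the fitting criterion described above is as a normalized sum of squared deviations over the N observed peaks. This is only an illustrative sketch: the symbols π (a candidate assignment mapping site i to an observed peak), σ_DC and σ_CSA (weights that put the two frequency types on a common scale) and the exact functional form are assumptions introduced here, not the score used by any of the cited methods.

```latex
\chi^{2}(\pi) \;=\; \sum_{i=1}^{N}
  \left[
    \left( \frac{\nu^{\mathrm{DC}}_{\mathrm{obs},\,\pi(i)} - \nu^{\mathrm{DC}}_{\mathrm{calc},\,i}}{\sigma_{\mathrm{DC}}} \right)^{2}
    +
    \left( \frac{\delta^{\mathrm{CSA}}_{\mathrm{obs},\,\pi(i)} - \delta^{\mathrm{CSA}}_{\mathrm{calc},\,i}}{\sigma_{\mathrm{CSA}}} \right)^{2}
  \right]
```

Under this form, refining the structure changes the back-calculated terms, while choosing the assignment changes π; SASR seeks the combination of the two that minimizes the score.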
Uniaxial order can be achieved either by inducing sample alignment relative to the magnetic field (B₀), as in oriented sample (OS) solid-state NMR [12,13], or by exploiting the inherent uniaxial rotation of a protein relative to an internal principal axis in a non-aligned sample (e.g. [14-16]). Since the direction of order is fixed by the sample geometry, the resulting NMR frequencies provide not only precise internal restraints for structure determination, but also relative restraints that enable the structure to be positioned in the context of the alignment medium. This is particularly useful for membrane proteins in lipid bilayers where structure determination also yields the three-dimensional position of the protein within the membrane [17-21]. For membrane proteins embedded in phospholipid bilayer membranes, the direction of order is determined by the membrane preparation, which can consist of planar lipid bilayers supported on glass or aligned magnetically, or of spherical vesicles where the protein undergoes rotational diffusion around the lipid bilayer normal (n).
In the first applications of SASR to α-helical membrane proteins, the ¹H/¹⁵N separated local field (SLF) spectra obtained from combinations of selectively ¹⁵N-labeled (by residue type) and uniformly ¹⁵N-labeled (all residues) proteins were assigned by comparison with the spectra calculated from a structural model, and the assigned experimental frequencies were then used either to calculate backbone dihedral angles directly [18] or as orientation restraints in a simulated annealing protocol [22], to obtain a final membrane-oriented structure consistent with the data. Alternatively, an algorithm has been described that builds structural models from random assignments of SLF data and compares the data back-calculated from each structural model with the experimental data [23]. Furthermore, a method based on graph theory has been developed to simultaneously obtain structure and assignment of ¹H/¹⁵N SLF spectra [24]. These two approaches were developed specifically for α-helical proteins, although they should also be applicable to other regular secondary structures.
In a recent application of SASR to a β-barrel outer membrane protein, the ¹H/¹⁵N SLF spectrum of ¹⁵N-Phe-labeled OmpX in magnetically oriented lipid bilayers was assigned through an iterative approach where each of the possible peak assignment combinations was tested for its ability to provide ¹H/¹⁵N DC and ¹⁵N CSA orientation restraints, consistent with the proper spatial orientation of the crystal structure within the membrane and with its associated back-calculated spectrum [25]. Although powerful, this type of analysis can quickly evolve into a complicated problem when the number of assignment permutations to be tested is very large, since for n peaks there are n! assignment permutations. For example, there are 5040 (7!) ways to assign the 7 Phe peaks in the SLF spectrum of selectively ¹⁵N-Phe-labeled OmpX and, while the task can be alleviated by further subdividing the spectrum into separate sets of peaks according to their H/D exchange [25] or other properties, such simplifications are not always possible.
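The factorial growth, and the relief offered by subdividing the spectrum, can be checked with a few lines of Python. This is a minimal sketch: the function name `n_assignments` and the 4 + 3 split of the seven Phe peaks are hypothetical and chosen only for illustration, since the actual grouping by H/D exchange is not stated here.

```python
from math import factorial, prod


def n_assignments(group_sizes):
    """Count one-to-one peak/site assignments when peaks may only be
    permuted within their own group (e.g. groups defined by H/D exchange)."""
    return prod(factorial(size) for size in group_sizes)


# All 7 Phe peaks treated as a single group: 7! permutations.
print(n_assignments([7]))

# Hypothetical split into two groups of 4 and 3 peaks: 4! * 3! permutations.
print(n_assignments([4, 3]))
```

With all seven peaks in one group the count is 7! = 5040; in the hypothetical 4 + 3 split it falls to 4! × 3! = 144, which shows why any reliable subdivision of the peaks is valuable when it is available.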
Here we present a computer program, AssignFit, developed within the XPLOR-NIH package [26], that greatly facilitates the SASR process. Unlike the first applications of SASR to α-helical [18] or β-sheet [25] membrane proteins, where the potential assignment permutations were generated by hand and analyzed with the aid of home-developed FORTRAN code, AssignFit generates all permutations computationally and tests them for best fit to the data.
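The general idea of generating every permutation and keeping the best fit can be sketched in plain Python. This is not the AssignFit code or the XPLOR-NIH API: the function names `rmsd` and `best_assignment`, the use of a simple root-mean-square deviation as the score, and the three-peak data set are all assumptions made for illustration. The example also assumes that the DC and CSA values have already been scaled to comparable units.

```python
from itertools import permutations
from math import sqrt


def rmsd(observed, calculated):
    """Root-mean-square deviation between paired (DC, CSA) tuples.
    Assumes both frequency types were pre-scaled to comparable units."""
    total = 0.0
    for (dc_obs, csa_obs), (dc_calc, csa_calc) in zip(observed, calculated):
        total += (dc_obs - dc_calc) ** 2 + (csa_obs - csa_calc) ** 2
    return sqrt(total / len(observed))


def best_assignment(observed_peaks, calculated_peaks):
    """Exhaustively test every ordering of the observed peaks against the
    back-calculated sites; order[j] is the observed peak assigned to site j."""
    best_order, best_score = None, float("inf")
    for order in permutations(range(len(observed_peaks))):
        permuted = [observed_peaks[i] for i in order]
        score = rmsd(permuted, calculated_peaks)
        if score < best_score:
            best_order, best_score = order, score
    return best_order, best_score


# Hypothetical back-calculated (DC, CSA) values for three sites ...
calculated = [(3.0, 80.0), (1.5, 120.0), (4.2, 95.0)]
# ... and the same three peaks as observed, in unknown order.
observed = [(1.6, 118.0), (4.1, 96.0), (3.1, 81.0)]

order, score = best_assignment(observed, calculated)
print(order, round(score, 3))
```

Because the loop visits all n! orderings, this brute-force form is practical only for small sets of peaks, such as those obtained from selectively labeled samples.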