Roughly one-third of all proteins reside in biological membranes [1, 2]. Integral membrane proteins (IMPs), which can only be released from the membrane by disruption of the membrane, perform a host of vital cellular functions as receptors, transporters, channels, electrical and photo-transducers, and so forth [3-5]. It is therefore not surprising that mutations in membrane proteins are linked to many diseases and that IMPs represent well over 50% of the targets for existing drugs [6-9]. In spite of the importance of IMPs, the structural biology of this class of proteins remains underdeveloped. As of February 2009, only 1.7% of the structures deposited in the RCSB Protein Data Bank were IMPs, based on searches performed by the PDBTM (pdbtm.enzim.hu) and OPM (opm.phar.umich.edu) [10, 11]. IMPs can be classified based on the dominant secondary structures of their transmembrane domains [10, 11]; IMPs of known structure that utilize α-helical spanning elements outnumber β-barrel proteins by roughly 4:1 (http://pdbtm.enzim.hu and http://opm.phar.umich.edu). Currently deposited structures also show a clear bias regarding the source organism, with 70% from prokaryotic organisms and 30% from eukaryotic organisms (based upon the PDBTM holdings for non-redundant, experimentally determined structures containing at least one transmembrane element).
The pace of integral membrane protein structure determination accelerated during the decade leading up to 2009, thanks largely to the accomplishments of X-ray crystallography. However, in just the past three years, there has been remarkable progress in the application of solution NMR methods to IMPs. Moreover, there is reason to believe that this recent progress reflects only the beginning of a phase of exponential growth in the use of solution NMR methods to solve important problems in membrane protein structural biology. Here, we build upon previous reviews from this laboratory [12, 13] to highlight recent progress in the application of solution NMR to IMPs and to outline the sample preparative and spectroscopic advances that have led to this breakthrough. The authors also note with admiration that progress in solution NMR applications has been paralleled by truly remarkable advances in the application of solid state NMR methods to IMPs, as recently reviewed by others [14-19].
The aggregate molecular weight of an IMP-model membrane complex offers some insight into the challenge that a solution NMR structural effort will present under the most favorable circumstances. The inclusion of the necessary amphiphiles (usually detergent) to allow for solubilization and proper folding of a membrane protein in a polar aqueous environment will typically double or triple the effective size of the protein-model membrane aggregate. Thus, while a 200 residue globular protein may be a very tractable target for solution NMR, a helical membrane protein of the same size is a markedly greater challenge. Furthermore, the tendency of many membrane proteins to form oligomers can add to the effective size of the ensemble. Another significant factor to consider in assessing the feasibility of NMR studies for a given membrane protein is the spectral dispersion. Spectra from helical membrane proteins will typically span only a narrow proton frequency range, leading to difficulties with spectral resolution that are much more severe than for spectra from β-barrel membrane proteins of a comparable size. With these various considerations in mind, we can attempt to assess the percentage of membrane proteins that are likely to be tractable to characterize by solution NMR methods.
The α-helical and β-barrel IMP superfamilies [10, 11] each present their own unique challenges for structure determination; however, the focus of this review is on helical membrane proteins. The von Heijne and Rost groups conducted pan-genomic analyses of the topologies and properties of integral membrane proteins [1, 3, 20, 21]. In their analyses of the predicted IMPs from a variety of eubacterial, archaean, and eukaryotic organisms, the majority of genes predicted to encode membrane proteins involve only a limited number of predicted transmembrane (TM) segments (50% contain fewer than 4 TM elements). The number of IMPs goes down as the number of TM spanning elements goes up, with an exception being the prokaryotic bias in favor of 6 and 12 transmembrane-spanning proteins, many of which function as transporters. Within eukaryotic organisms, a similar preference for 6 and 12 TM proteins is not observed; however, a slight bias can be seen for 7 transmembrane proteins, which include the well-known G-Protein Coupled Receptor (GPCR) family. GPCRs and other membrane proteins of similar topological complexity typically have monomer molecular weights in the vicinity of 40 kDa. For such proteins, the aggregate molecular weight of a protein-detergent complex is expected to be in the range of 100 kDa. With current solution NMR technology, we suggest that membrane proteins of this size and topological complexity are within the limit of tractability for NMR-based backbone structural determination. If 7 TM helix proteins are considered to approximate the current upper limit for NMR-based structural determination of membrane proteins, this suggests that approximately 75% of all IMPs can be regarded as tenable targets for solution state NMR, assuming that the proteins can exist in a biologically relevant conformation as a monomer.
While the actual percentage will be considerably lower owing to protein oligomerization and other factors, a significant fraction of all membrane proteins should nevertheless remain as tractable targets for solution NMR-based structural analysis.
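The size argument above can be made concrete with a back-of-the-envelope calculation. The following Python sketch is illustrative only: the function name `complex_mass_kda` and the default multiplier of 2.5 (the midpoint of the "double or triple" range quoted earlier) are assumptions introduced here, not values taken from the cited analyses.

```python
def complex_mass_kda(protein_kda: float,
                     amphiphile_factor: float = 2.5,
                     oligomer_state: int = 1) -> float:
    """Rough aggregate mass of a protein/model-membrane complex.

    protein_kda       -- monomer molecular weight in kDa
    amphiphile_factor -- assumed multiplier for bound detergent/lipid
                         (the text quotes a typical range of 2 to 3)
    oligomer_state    -- number of protein chains in the complex
    """
    if protein_kda <= 0:
        raise ValueError("protein_kda must be positive")
    if amphiphile_factor < 1:
        raise ValueError("amphiphile_factor must be at least 1")
    if oligomer_state < 1:
        raise ValueError("oligomer_state must be at least 1")
    return protein_kda * oligomer_state * amphiphile_factor


if __name__ == "__main__":
    # A 7 TM protein of roughly 40 kDa, as discussed in the text.
    for factor in (2.0, 2.5, 3.0):
        mass = complex_mass_kda(40.0, amphiphile_factor=factor)
        print(f"factor {factor:.1f}: about {mass:.0f} kDa")
```

With the assumed multiplier of 2.5, a 40 kDa monomer gives an aggregate of about 100 kDa, in line with the estimate in the text; setting `oligomer_state` above 1 shows how quickly oligomerization pushes a target beyond that range.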
Of course, size is not the only factor that determines whether a given membrane protein is a suitable target for NMR. A prerequisite to any protein structural study is the need to prepare pure and properly folded protein at levels that are suitable for the technique of interest. For membrane proteins, this poses a formidable hurdle to the initiation of structural studies, and one that must be overcome on a heuristic, protein-by-protein basis, as outlined in the following sections [22-27].
An ideal recombinant IMP expression system for structural biology would yield tens or hundreds of milligrams of natively folded, fully functional protein that could be readily purified into model membranes. This is difficult to attain for most IMPs, especially those of eukaryotic origin. While high level preparation of IMPs from a native source offers the best chance to observe the protein in a properly folded state (as is feasible in the case of bovine rhodopsin), most native sources fail to produce quantities of membrane protein sufficient for NMR and may pose significant barriers to the incorporation of stable isotopes for an intensive NMR study. Thus obtaining recombinant IMPs using exogenous expression is often a necessity in order to obtain adequate levels of protein production for structural studies. A number of reviews of IMP expression and sample preparation are available [22, 26, 28-45]. In the following section, several of the common expression systems for producing viable solution NMR samples will be explored, which are summarized in Table 1.
Because of the increasingly sophisticated stable isotope labeling schemes being utilized in NMR spectroscopic studies of large and/or membrane proteins, E. coli is typically the expression host of choice [24-27, 30, 32, 46-50]. Other advantages of E. coli are the ease of genetic manipulation for incorporating and controlling protein expression, the rapid growth rates, and the scalability of culturing for mass production to meet the sample demands of an intensive structural study. A wide variety of E. coli expression vectors and specialized cell strains is available, designed to enhance expression of recombinant protein and to incorporate the stable isotopes necessary for NMR characterization. Many prokaryotic and some eukaryotic membrane proteins, including GPCRs, have been over-expressed in E. coli (see preceding citations). Nevertheless, there are several significant drawbacks to consider in the utilization of bacterial expression systems, which arise from differences in the pathways of membrane protein biogenesis between prokaryotes and the more complex eukaryotic organisms [31, 51]. These differences include preferences for codon usage, post-translational modifications, and membrane lipid compositions. The expression of eukaryotic targets in prokaryotic expression hosts frequently leads to the formation of inclusion bodies instead of targeting of the expressed protein to the membrane, necessitating additional steps to obtain functionally active protein. We have found that E. coli expression of IMPs can often be enhanced, in terms of both protein yields and membrane targeting, by inducing expression at low temperature, while others have focused on developing special media tailored for expression of difficult proteins.
An important recent innovation in E. coli expression has been the development of the “single protein production” (SPP) expression method by Inouye and co-workers. This method allows cells to be manipulated just before induction so that all mRNA sequences except that encoding the protein of interest are degraded. As a result, only the protein of interest is expressed, permitting the cell to focus all resources on producing the targeted protein. Under these conditions the cells, ideally, remain alive but are not dividing. This leads to one of the most appealing variations of this method from a labeling standpoint, namely the ability to pellet the living cells from unlabeled media and then resuspend the pellets in modest volumes (>5-fold reduction relative to the starting culture) of isotopically labeled media immediately prior to induction. This should dramatically lower the cost of both uniform and selective labeling of proteins by sharply reducing the amount of labeled medium required for a given biomass. An additional feature of SPP-based isotopic labeling is the possibility of producing a labeled target protein under conditions where all other cellular proteins are unlabeled, potentially enabling direct in vivo NMR studies of the protein of interest.
A second recent innovation is the emergence of the nisin-inducible, Gram positive lactococcal expression host as an alternative prokaryotic system for the production of membrane proteins. Because Gram positive organisms lack an outer membrane, the plasma membrane of such organisms may in some ways more closely resemble a eukaryotic membrane than does the typical plasma membrane of E. coli. A number of prokaryotic and eukaryotic membrane proteins have been expressed in Lactococcus lactis [40, 54], with a number of the proteins observed to maintain their specific transport or binding activities. This system is promising for the production of functional membrane proteins that elude high level functional expression in E. coli. While the system has not yet been used to produce NMR samples, many of the stable isotopic labeling protocols developed for E. coli should be readily adaptable to Lactococcus lactis.
Yeasts represent an attractive eukaryotic alternative to bacterial expression systems. Two primary yeast species have been exploited for heterologous recombinant protein expression, Saccharomyces cerevisiae and Pichia pastoris. Both offer significant benefits by allowing for some native-like post-translational modifications found in higher eukaryotes, the processing of eukaryotic secretion signals, and the handling of proteins that contain multiple disulfide bonds. There are several reports of the successful expression of functional mammalian receptors and other complex membrane proteins in yeast hosts [56-60]. A direct comparison of the expression of an array of 104 different receptors in E. coli versus P. pastoris host cells revealed that all but six of the 100 receptors tested could be expressed in the Pichia system at immune-detectable levels, whereas in E. coli only 46 out of the 101 receptors exhibited detectable levels of expression.
As with E. coli, methods for incorporating 15N and 13C into proteins expressed in yeast have been reported [61-67]. Perdeuteration of proteins is also possible using Pichia, although the application of selective isotopic labeling is more complex in methylotrophic yeast than in simpler bacterial systems. Continued exploration, in conjunction with further development of auxotrophic strains, may make selective labeling schemes possible for these eukaryotic systems.
Higher complexity eukaryotic hosts have also been used for producing protein samples for NMR characterization. These systems include baculovirus-infected insect cells and transfected mammalian cells. The major benefit of these systems is the presence of integral components that allow for proper folding and post-translational modification of eukaryotic membrane proteins. The baculovirus system also has the additional distinction of being the expression technique utilized for three of the recent G-protein coupled receptor structures solved by X-ray crystallography [70-72]. The recent commercial availability of isotopically labeled media for insect cells has greatly simplified the task of producing proteins labeled for NMR study; however, the low yield and high cost of obtaining labeled recombinant protein may prohibit the routine application of these systems. Additionally, the successful selective labeling of several amino acid types in insect cells has been reported [74, 75]. However, these cells have not yet been adapted to enable production of perdeuterated protein, which limits the use of insect cells to prepare optimally-labeled samples for NMR studies of large proteins and complexes.
Mammalian cells have potential as an expression host for isotopically labeled mammalian IMPs in their native state. There are several reports of the successful 15N and 13C labeling of bovine rhodopsin in CHO and HEK293 cells [76-80], and media for incorporating 15N and 13C into expressed proteins are commercially available. Recently, uniform 15N/13C labeling of bovine rhodopsin was carried out in HEK293 cells using a commercial medium, yielding slightly over 2 mg per liter of culture. This paved the way for initial solution NMR studies of its C-terminus in detergent micelles. In general, protein perdeuteration is not yet practical using mammalian cell expression systems.
Recently, cell-free systems have emerged as a promising alternative for preparing large quantities of isotopically labeled membrane proteins [25, 26, 30, 33, 81, 82]. The application of a cell-free system offers some advantages over conventional expression of membrane proteins: (i) this approach is independent of cellular integrity, (ii) many different conditions can be tested to optimize expression levels in a short period of time, (iii) labeled proteins in the reaction mixtures can be directly analyzed by NMR, and (iv) application of novel stable isotope labeling schemes can facilitate resonance assignments [83, 84]. Membrane proteins expressed by cell-free systems include 2-10 TM E. coli transporters and channels, the E. coli β-barrel nucleoside transporter Tsx, several 7 TM human GPCRs, ion transporters, and a 12 TM tetracycline pump [30, 36, 82, 85].
Several important factors can influence the effectiveness of a cell-free system for the large-scale production of membrane proteins. Because cell-free systems usually involve isolated transcription and translation systems, the accessory systems that facilitate protein folding and, in the case of membrane proteins, membrane integration are often not present. Moreover, expression of membrane proteins in a membrane-free environment frequently results in rapid precipitation, necessitating subsequent refolding. However, by including carefully-selected detergents in the reaction mixture, cell-free systems have been shown to be capable of producing substantial amounts of folded membrane proteins [26, 27, 30, 33, 36, 44, 82]. Additional work has shown that inclusion of liposomes prepared from bacterial membranes and other bilayer systems may sometimes offer an alternative to micelles for membrane protein insertion and folding [85, 86].
Since most cell-free systems are simply coupled RNA/protein biosynthesis systems lacking the amino acid biosynthetic pathways exploited in conventional labeling strategies, stable isotopes must be introduced in the form of labeled amino acids, which can add significantly to the cost of preparing uniformly labeled samples relative to conventional E. coli biosynthetic labeling. However, the recent commercial availability of labeled amino acid mixtures has drastically reduced the cost. Previous attempts at perdeuteration in cell-free systems have typically resulted in low, non-uniform levels of deuteration; however, 95% deuteration of the 800 kDa chaperonin GroEL was recently achieved through the preparation of a perdeuterated E. coli S30 extract, D-S30. The cell-free approach to preparing highly perdeuterated protein with specifically protonated amino acids also provides routes for the application of methyl-protonated labeling schemes and the use of stereo-array isotopic labeling (SAIL), albeit at an additional cost [83, 88, 89].
While cell-free systems involving bacterial extracts have been the most extensively used and tested, cell-free systems have also been developed using extracts prepared from alternative cell lines, such as wheat germ, insect cells, and rabbit reticulocytes. Though these eukaryotic-derived systems are generally less well-developed, the benefits of improved translation of eukaryotic genes and the potential for eukaryotic post-translational modifications through the addition of microsomes suggest that these systems are worthy of consideration and further developmental effort.
Once acceptable expression levels of a target membrane protein have been attained, the recombinant protein must be purified from the endogenously expressed proteins and solubilized in an NMR-compatible membrane mimetic. There are two general approaches, the choice of which can be strongly influenced by the localization of the expressed protein. The first is especially well-suited for membrane proteins that are expressed into inclusion bodies and involves initial solubilization with a harsh detergent, such as SDS, a concentrated chaotropic agent such as urea, or a combination of the two [22, 46, 91]. Solubilization is followed by purification, during which a switch is made to non-denaturing detergents or detergent-lipid mixtures, at which point the protein of interest refolds. The second approach is for IMPs that are properly inserted and folded into the membrane during protein expression. In this case the expressed IMP can be extracted and solubilized using a mild detergent, which is typically retained through all subsequent steps of purification. In this case, the challenge is to find a detergent that is capable of effectively extracting the protein from the membrane and yet is mild enough to preserve the native fold. Though alkyl glycosides such as dodecylmaltoside have long been favored to prepare samples for crystallography, in the case of NMR, lysophospholipids such as LMPC and LMPG may be superior because of their stronger solubilizing power and because of the favorable quality of the NMR spectra that are often obtained [13, 24, 91-93]. Lauryldimethylamine oxide (LDAO) and alkylphosphocholines such as DPC may also sometimes be used, though they are harsher than the lysophospholipids and should therefore be used judiciously.
For both of the above approaches, the use of His6-10-tagged recombinant protein offers a convenient route to on-resin immobilization for the purposes of purification as well as to facilitate exchange from the detergent solution used for extraction to the final model membranes for actual NMR experiments.
In a few cases, successful NMR studies of IMPs have been carried out in organic solvent mixtures [94-97]. However, this approach has general disadvantages, as we have previously reviewed [12, 13]. The most commonly used medium for solution NMR studies of membrane proteins is detergent micelles (Fig. 1). We and others have previously reviewed the practical aspects of choosing detergents for solution NMR studies [12, 13, 98] so that only recent results are summarized here. While optimization of sample conditions and choice of detergent still requires an empirical and protein-specific approach, it appears that there is a “short list” of detergents established through trial-and-error that seem most frequently to yield favorable conditions for NMR of IMPs, which can be gleaned from the detergent column of Table 1 and the third row for the IMP structures shown in Fig. 7. Recent biophysical studies of detergent-membrane protein complexes are also beginning to provide insight into why some detergents yield better NMR spectra and preserve membrane protein function better than others. Columbus and co-workers have demonstrated the importance of finding a good match between the transmembrane span of an IMP target and the diameter of the micelles being employed [34, 99]. Their work also demonstrated a powerful combined NMR/small-angle X-ray scattering (SAXS) approach that offers much promise as a route to obtaining definitive information regarding protein-detergent mixed micelle size, shape, and spectroscopic suitability.
A recent report from MacKenzie and co-workers has demonstrated that the addition of very modest amounts of phospholipids to micelles can result in dramatic enhancements of NMR spectral quality for some integral membrane proteins. This lipid dependence appears to reflect the requirement of some membrane proteins for semi-specific lipid-protein interactions, which cannot be satisfied by detergents only. A second contributing factor may be found in the ability of even modest amounts of lipids in lipid-detergent mixed micelles to alter the structural and dynamic properties of the amphiphilic assembly in a way which leads to spectroscopically-favorable dynamics and stability for the included IMP. A number of membrane proteins have previously been shown to require the presence of bona fide lipid in order to maintain functionality and native-like structure [101-104]. Shown in Fig. 2 are 1H-15N TROSY spectra from Poget and Girvin of the 4-TM span multi-drug resistance transporter, Smr, acquired in three different membrane mimetics: LPPG, decylmaltoside (DM), and DHPC/DMPC isotropic bicelles. The spectra in DM and in isotropic bicelles are similar and exhibit much higher spectral dispersion than the spectrum in LPPG. In terms of function, the bicellar sample exhibited optimal binding of a drug, tetraphenylphosphonium, with DM exhibiting reduced affinity for the drug, while LPPG did not support binding [101, 106].
The membranes of higher eukaryotes sometimes contain very high levels (up to 40 mol%) of the lipid cholesterol, and a number of IMPs require cholesterol to maintain full functionality [108, 109]. However, cholesterol is extremely difficult to solubilize in detergent micelles; even commercially available derivatives such as cholesterol hemisuccinate and cholesterol sulfate are difficult to co-solubilize with the detergents most commonly employed in NMR studies. Recently, a new compound, β-CHOBIMALT, has been introduced that is freely soluble in detergent micelles. β-CHOBIMALT is comprised of cholesterol that has been glycosylated at its hydroxyl head group with a tetrasaccharide. β-CHOBIMALT may prove useful as an additive to conventional micelles for use in studies of eukaryotic IMPs. This compound was recently employed in NMR studies of the critical transmembrane C-terminal domain of the amyloid precursor protein, which led to the proposal that this protein is a cholesterol binding protein and may serve as a cholesterol sensor to regulate cellular cholesterol biosynthesis and uptake.
In the continued pursuit of better membrane mimetics, a new series of detergents was recently introduced that explored the derivatization of the alkylphosphocholine class of detergents to include a polar spacer between the headgroup and the acyl chain. This modification led to detergents that resemble lyso-phosphatidylcholine, but lack the ester moiety that is a source of chemical instability in glycerol-ester-based lipids and detergents. In the study by Zhang et al., the compounds that yielded the best NMR spectra of OmpX were those that best resembled bona fide lyso-phosphatidylcholine by including a linker containing both H-bond donating and accepting moieties between the alkyl chain and the phosphocholine headgroup.
Bicelles have emerged as a common medium for use in NMR studies of IMPs. Bicelles are binary detergent-lipid mixtures that assemble into bilayered, water soluble assemblies (Fig. 1) [116, 117]. A number of bicelle systems have been developed and characterized for their unique liquid-crystalline phase behavior [111, 118-126]. Many of these systems represent adaptations of the originally-characterized bicelle systems, which are composed of mixtures of the lipid dimyristoylphosphatidylcholine (DMPC) and a detergent, either dihexanoylphosphatidylcholine (DHPC) or the zwitterionic bile salt derivative CHAPSO [118, 125, 127, 128]. Recent focus on the morphology of bicelles has revealed a greater complexity than originally proposed [117, 129-131]; nonetheless the original bilayered disc model appears to be applicable at the compositions typically employed in solution NMR studies of IMPs. For reasons that have not yet been fully explored, isotropic DHPC-DMPC bicelles typically produce much better solution NMR spectra of IMPs than CHAPSO-DMPC bicelles and nearly all published solution NMR studies involve the well characterized DHPC-DMPC system (or their ether-linked analogs).
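Bicelle composition can be described by the parameter q, the molar ratio between the lipid and the detergent above the critical micellar concentration (CMC):
$$q = \frac{[\text{lipid}]}{[\text{detergent}]_{\text{total}} - [\text{detergent}]_{\text{free}}}$$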
This revised definition of q reflects an extension of the earlier, simpler definition of q as the molar ratio between the lipid and the total detergent . The new definition takes into account the fact that when the overall amphiphile content of the sample is low (lipid + detergent < 5% w/v) then the concentration of free detergent in solution can be a significant fraction of the total detergent present. This new definition offers a more realistic description of the true detergent-to-lipid ratio in the bicellar assemblies themselves.
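The practical difference between the two definitions of q can be seen with a short calculation. The Python sketch below is a minimal illustration, not a protocol: the function names are introduced here, and the free detergent concentration of 7 mM is a hypothetical placeholder that should be replaced with a measured value for the detergent and conditions in use.

```python
def q_simple(lipid_mm: float, detergent_total_mm: float) -> float:
    """Earlier definition: lipid divided by total detergent (molar ratio)."""
    if detergent_total_mm <= 0:
        raise ValueError("total detergent must be positive")
    return lipid_mm / detergent_total_mm


def q_effective(lipid_mm: float,
                detergent_total_mm: float,
                detergent_free_mm: float) -> float:
    """Revised definition: lipid divided by detergent in the assemblies.

    Detergent in the assemblies is taken as total minus free (monomeric)
    detergent, following the description in the text.
    """
    bound = detergent_total_mm - detergent_free_mm
    if bound <= 0:
        raise ValueError("no detergent left in assemblies; check inputs")
    return lipid_mm / bound


# Hypothetical free-detergent concentration (mM); an assumption for
# illustration only, not a value taken from the cited studies.
FREE_DETERGENT_MM = 7.0

if __name__ == "__main__":
    # Same nominal lipid:detergent ratio at high and low total amphiphile.
    for lipid, detergent in ((100.0, 200.0), (10.0, 20.0)):
        simple = q_simple(lipid, detergent)
        effective = q_effective(lipid, detergent, FREE_DETERGENT_MM)
        print(f"lipid {lipid:.0f} mM, detergent {detergent:.0f} mM: "
              f"simple q = {simple:.2f}, effective q = {effective:.2f}")
```

Both samples have the same nominal ratio, but the dilute sample has a noticeably higher effective q because free detergent makes up a larger share of the total, which is the situation the revised definition is meant to capture.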
DHPC-DMPC bicelles used in solution NMR studies of IMPs are typically lipid-poor and detergent-rich (“low q” conditions), with q in the range of 0.25-0.5. Above q = 0.5, assemblies are expected to be too large to yield well-resolved spectra from IMPs, while at q below 0.25, the distinction between true bicellar morphology (i.e., bilayered discs) and conventional lipid-detergent mixed micelles becomes blurred. Low to moderate q mixtures have been subjected to considerable characterization [106, 132-136].
NMR studies have typically been carried out at temperatures in the range of 30-45°C. In addition to the choice of q and temperature, an important parameter to consider when exploring bicelles for solution NMR studies is the bicelle:protein ratio. To avoid non-native IMP interactions and/or aggregation, it is desirable to have no more than one protein solubilized within a single bicelle, a principle long known to apply for NMR studies involving conventional micelles.
Isotropic bicelles have recently been employed as the medium for NMR-based structural studies of a number of IMPs [138-144], including a tetraspan helical IMP and an OMP [101, 106]. These eye-opening studies suggest that one should not be deterred by the larger size of bicelles relative to most micelles.
Bicelles have been used to functionally reconstitute a variety of membrane proteins [101, 123, 145] and also to avoid micelle curvature-induced structural perturbations of IMPs [101, 123, 146-148]. The recent determination of the structure of the heterodimeric transmembrane domain of the αIIbβ3 platelet integrin in bicelles provides an elegant example of using this medium to solve an important structural biological problem that proved elusive when conventional micelles were used. The αIIb and β3 subunits contain a single transmembrane span each, which are believed to directly associate in the signaling-off state and dissociate in the signaling-on state, a model that suggests the binding energy driving association of these helices is modest. Early NMR and other biophysical studies of the transmembrane domain conducted in DPC micelles failed to detect any interaction between subunits. However, a moderately stable heterodimer forms in isotropic bicelles, leading to determination by the Ulmer lab of its long-sought structure. Apparently, DPC micelles destabilize the heterodimer to the point where interaction cannot be detected, while the environment provided by bicelles allows at least partial retention of native-like heterodimer avidity.
Finally, bicelles provide a medium that allows both solid state and solution NMR to be carried out in mixtures that are similar in composition, differing only in the q ratio used. The utility of “high q” bicelles for solid state NMR, X-ray crystallography, or solution NMR (typically limited to mobile membrane-associating polypeptides) is described elsewhere [105, 123, 142, 143, 145, 150-160].
We have previously reviewed the use of lipopeptides, reversed micelles, amphipols and fluorinated surfactants in NMR studies of membrane proteins (Fig. 3) [12, 161]. Here, we focus on recent progress made in the application of alternative model membrane systems.
About a decade ago, the Wand group introduced the use of reversed micelle systems as a means to overcome the size limitation in protein NMR [162-166]. This approach exploits the reduced correlation time for a protein/reversed micelle complex in a low viscosity organic solvent, which can be markedly shorter than for either the free protein or the micelle-encapsulated membrane protein under aqueous conditions (Fig. 3C). The use of reversed micelles was recently extended to a model membrane protein, gramicidin A, by Flynn and coworkers. In their work, dioctylsulfosuccinate (AOT)-based reverse micelles were solubilized in liquid pentane. Gramicidin A in reverse micelles yielded NMR spectra that are nearly identical to spectra of the protein in conventional SDS micelles. Moreover, interchain NOE contacts were preserved, suggesting retention of native homodimerization. Relative to conventional micelles, the low viscosity of pentane gave rise to narrow lines. In a recent application to homotetrameric KcsA, several types of reverse micelles were tested, with the optimum found to be a mixture of cetyltrimethylammonium bromide, dihexadecyldimethylammonium bromide, and hexanol in pentane. When solubilized in this mixture, KcsA has a correlation time of 10-15 ns, indicating much more rapid tumbling than previously reported for KcsA in DPC micelles (60 ns) or SDS micelles (40 ns) [169, 170]. In the reverse micelle system, T2 relaxation times for the transmembrane core were found to average 80 ms, significantly longer than observed for KcsA in classical micelles (20 ms) and suggestive of the potential to apply conventional triple-resonance experiments for resonance assignment without deuteration.
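The viscosity argument behind the reverse micelle approach can be illustrated with the Stokes-Einstein-Debye relation for a rigid sphere, τc = 4πηr³/(3k_BT). The Python sketch below is an idealization under stated assumptions: the 3 nm radius is a hypothetical particle size, the viscosities are approximate room-temperature values for water and n-pentane, and the radius is held fixed to isolate the solvent effect, whereas a real reverse micelle particle will generally differ in size from the corresponding aqueous micelle.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K (exact SI value)


def rotational_correlation_time_ns(radius_nm: float,
                                   viscosity_mpa_s: float,
                                   temperature_k: float = 298.15) -> float:
    """Stokes-Einstein-Debye estimate of tau_c for a rigid sphere, in ns.

    radius_nm       -- hydrodynamic radius of the tumbling particle (nm)
    viscosity_mpa_s -- solvent viscosity (mPa*s, numerically equal to cP)
    temperature_k   -- absolute temperature (K)
    """
    if radius_nm <= 0 or viscosity_mpa_s <= 0 or temperature_k <= 0:
        raise ValueError("all inputs must be positive")
    radius_m = radius_nm * 1e-9
    viscosity_pa_s = viscosity_mpa_s * 1e-3
    tau_s = (4.0 * math.pi * viscosity_pa_s * radius_m ** 3
             / (3.0 * K_B * temperature_k))
    return tau_s * 1e9


# Approximate solvent viscosities near 25 C (assumed illustration values).
VISCOSITY_MPA_S = {"water": 0.89, "n-pentane": 0.22}

if __name__ == "__main__":
    radius_nm = 3.0  # hypothetical particle radius
    for solvent, eta in VISCOSITY_MPA_S.items():
        tau = rotational_correlation_time_ns(radius_nm, eta)
        print(f"{solvent}: tau_c of roughly {tau:.1f} ns")
```

Because τc is linear in viscosity in this model, a particle of fixed size would tumble about four times faster in pentane than in water; the cubic dependence on radius is why any growth of the particle on encapsulation can offset part of that gain.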
Another emerging tool in structural studies of membrane proteins is the use of amphipols in place of detergents (Fig. 3E). Amphipols are amphipathic polymers in which the backbone includes alternating hydrophilic and hydrophobic chains [161, 171, 172]. Amphipols have been shown to offer a route for purifying and stabilizing membrane proteins in the absence of conventional detergents or denaturants. Previously, a number of membrane proteins (bacteriorhodopsin, the bacterial photosynthetic reaction center, cytochrome b6f, OmpF, DAGK, and FomA) have been successfully incorporated into amphipols under conditions where functionality is often preserved [171, 173, 174]. Recently, a report from the Baneres laboratory demonstrated the ability to refold several Class A GPCRs back to a functional state using amphipols. Attractive NMR spectra were acquired for OmpA in amphipol A8-35, which suggests these mixtures may, in favorable cases, have potential for use in structure determination of IMPs. The use of amphipols in these preliminary NMR studies, as well as applications in cryo-EM and in refolding of IMPs [173, 175], suggests that the potential of amphipols as a tool in structural biology is just beginning to be explored.
For a number of years, the size limit for structural analysis by NMR was thought to be in the range of 30-40 kDa. This limit has now been exceeded, owing in large part to recognition of the TROSY phenomenon and the development of TROSY-based pulse sequences [46, 178-180], as we have previously reviewed. The application of experiments utilizing TROSY elements has led to assignable spectra even for proteins and complexes with molecular weights of 100 kDa or higher. An impressive example of the use of TROSY to enable structure determination is provided by the studies of the Kay group on the 82 kDa monomeric soluble enzyme malate synthase G. The spectral improvement observed for amide 1H-15N pairs in TROSY-based experiments is well known to depend on field strength, owing to the field dependence of the CSA relaxation mechanism; the effect is optimal at a proton frequency near 1 GHz. Fig. 4 compares 1H-15N HSQC and TROSY-HSQC spectra of a 70 kDa IMP/micelle complex across a range of spectrometer frequencies and clearly illustrates the improvement typically observed for the TROSY effect even in the absence of perdeuteration. The protein used for this set of spectra was the homodimeric 99-residue C-terminal transmembrane domain of the amyloid precursor protein (C99). At a moderate field (600 MHz), only modest differences can be seen between the HSQC (Fig. 4A) and the TROSY (Fig. 4B) experiments. As the magnetic field is increased to 900 MHz, there is dramatic improvement in the TROSY spectrum. Table 2 presents a collection of membrane proteins for which structures or at least backbone resonance assignments have been obtained. For many of the proteins listed, the overall correlation time of the IMP/micelle complex is in the range of 20-40 ns, corresponding to aggregate protein-detergent molecular weights up to 120 kDa.
For larger IMPs, the use of TROSY-based pulse sequences and magnetic fields of 700 MHz or higher has proven to be absolutely essential. It should be noted, however, that TROSY cannot eliminate line broadening resulting from conformational exchange on an intermediate time scale. The presence of such undesired dynamics should be suspected if the use of TROSY at 700 MHz or higher does not lead to a significant improvement in spectral quality for membrane protein/detergent complexes in excess of 30 kDa.
The labs of Kay and Bax have more recently demonstrated pulse sequences that select for the narrowest components of multiplets involving methyl or methylene protons coupled to aliphatic 13C [183-185]. The physical basis for the differential line widths seen in these multiplets differs from that giving rise to the classical TROSY phenomenon. Most notable among these pulse sequences is the 2-D “methyl-TROSY” experiment, which, together with its higher-dimensional analogs, has been powerfully combined with special isotopic labeling patterns, usually ILV-labeling (see Section 4), to enable side chain assignments and measurement of long range NOEs even for very large proteins and complexes [183, 184, 186, 187].
While the initial development and application of TROSY-based NMR experiments utilized protonated samples, the potential improvement of the TROSY effect can be fully realized only if non-labile side chain and backbone protons in the protein are biosynthetically replaced with deuterons. The benefit of perdeuteration arises principally from the ability to extend transverse relaxation times by suppressing dipole-dipole interactions between remote protons and between protons and directly attached heteroatoms. Perdeuteration also benefits multidimensional experiments that utilize extended coherence transfer schemes passing through aliphatic carbons, such as the Cα in the CT-HNCA experiment. Perdeuteration not only offers improved 13C linewidths but also markedly extends 1H transverse relaxation times, which ultimately results in higher spectral signal-to-noise when the time required to execute a pulse sequence is of the same order as T2. This is often the case in higher dimensional experiments involving 13C, even when perdeuteration does not markedly improve the quality of the 2-D 1H-15N-TROSY spectrum, as is the case for DAGK in micelles. Because some expression hosts, especially higher eukaryotes, are not currently compatible with culturing in D2O, the need for perdeuteration as a prerequisite for advanced NMR studies of larger membrane proteins must be factored into the planning for any IMP NMR study.
There are two additional factors to consider when embarking on perdeuteration of a membrane protein. First, the assumption that extensive replacement of hydrogen with deuterium will have no effect on the structure and biological activity of proteins may not be correct, as indicated by recent X-ray crystallographic results [189-191]. There also exists a body of biophysical evidence that incorporation of deuterons gives rise to predictable changes in the physicochemical behavior of soluble proteins, including significantly decreased thermal stability and pKa shifts for charged amino acids [192-196]. An example of the effect of perdeuteration on proteins in the membrane environment is provided by VP1, a small amphipathic peptide. The protonated form of the peptide is capable of adopting a helical conformation at the bilayer surface and then penetrating the membrane, whereas the perdeuterated peptide is unable either to form an amphipathic helix at the lipid interface or to insert into the bilayer. In the case of proteins that are prone to instability, the potential for additional destabilization as a result of perdeuteration should be taken seriously.
A second consideration regarding perdeuteration is the need to back-exchange amide deuterons for protons to enable detection at the amide proton site. However, the stability of well-packed transmembrane segments can hinder D→H back-exchange [186, 198], as was observed for the second and third transmembrane segments of DAGK. As a result, perdeuterated DAGK had to be denatured to facilitate exchange in these segments, which in turn required an elaborate refolding procedure.
A number of options have been put forward to tailor deuteration patterns to facilitate NMR studies [188, 200-204]. For simple backbone assignments based on J-correlated experiments and for NOESY experiments focusing on exchangeable amide sites, perdeuteration will give the greatest benefit. When extending spectroscopy into the sidechains or attempting to correlate to non-exchangeable protons, a 50% random fractional deuteration scheme offers a favorable compromise between preserving the non-exchangeable protons and depleting the proton dipolar bath.
Finally, as shown in the recent determination of the VDAC1 structure by the Wagner lab, some NMR experiments will yield optimal results only when perdeuterated detergent is also used. Though this consideration applies generally to any multidimensional experiment involving a large membrane protein in direct contact with detergent, a pronounced improvement may be seen for NOESY-based experiments, where loss of magnetization through spin diffusion to the detergent may severely compromise the already marginal signal-to-noise of long range intra-protein NOEs.
In addition to deuteration, a number of more advanced labeling strategies have been proposed to reduce line broadening and simplify spectra, both of which ultimately facilitate the process of resonance assignment. Amino acid-selective 15N labeling has long been used to classify TROSY/HSQC peaks according to residue type. 15N-selective labeling strategies can be further applied as part of more sophisticated, interleaved combinatorial patterns to allow unambiguous assignment of all amino acid types; in this case only 5 samples that employ a recursive labeling pattern are required, rather than discrete labeling of all 20 amino acids [206, 207]. Another strategy involves the application of a combinatorial selective labeling method, in which interleaved, multiple selective labels are prepared using specific 15N/13C and 15N/14N labeling patterns to yield a large number of residue- and sequence-specific backbone assignments, as illustrated by the labeling pattern shown for five samples in Fig. 5A.
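The counting argument behind a five-sample scheme can be sketched as follows. With five samples, each residue type can in principle be given a distinct presence/absence pattern, because 2^5 - 1 = 31 non-empty patterns exceed the 20 amino acid types. The plain binary assignment below is a generic illustration under that assumption; it is not the labeling pattern used in the cited studies, and all names are hypothetical.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # one-letter codes for the 20 residue types


def build_scheme(residues: str = AMINO_ACIDS, n_samples: int = 5) -> dict:
    """Give each residue type a unique pattern of labeled (1) / unlabeled (0) samples."""
    # Skip the all-zero pattern: a residue labeled in no sample is never observed.
    patterns = [p for p in product((0, 1), repeat=n_samples) if any(p)]
    if len(patterns) < len(residues):
        raise ValueError("not enough samples to distinguish all residue types")
    return dict(zip(residues, patterns))


def residue_type_from_peak(observed, scheme: dict):
    """Return the residue type whose pattern matches a peak's presence across samples."""
    lookup = {pattern: aa for aa, pattern in scheme.items()}
    return lookup.get(tuple(observed))  # None if the pattern is not in the scheme


scheme = build_scheme()
print(residue_type_from_peak(scheme["L"], scheme))  # recovers the leucine pattern
```

In practice the choice of patterns would also have to account for isotope scrambling, precursor cost, and the fact that proline gives no amide peak, none of which is modeled here.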
Recently, an alternative approach, Stereo-Array Isotope Labeling (SAIL), has been shown to offer many of the benefits of perdeuteration by depleting protons through selective labeling, while retaining enough sidechain protonation sites to preserve the sidechain NOE contacts that are typically lost in conventional deuteration [49, 83, 89, 179]. Three representative amino acids utilized by the SAIL strategy are shown in Fig. 5B, illustrating the dilution of proton sites with deuterons to facilitate the spectroscopy.
Within highly deuterated samples, the selective re-introduction of limited protonation has shown significant promise for detecting long range NOE contacts [48, 201, 208, 209]. Most notable among these techniques is selective ILV methyl protonation, in which specific protonation is accomplished by spiking an otherwise perdeuterated medium with partially protonated precursor compounds used in the biosynthesis of Leu, Ile, and Val. Two of these common precursors, [3,3-2H2] α-ketobutyrate and [3-2H] α-ketoisovalerate, are shown with their respective methyl-protonated amino acid products in Fig. 5C. Selective ILV methyl protonation has already proved to be a viable source of additional structural restraints, as further explored in the 1H-1H NOE section of Restraints for Structural Determination (Section 4).
A final tool seeks to exploit the amino acid bias typically observed in transmembrane segments as an aid to reducing spectral complexity. Roughly 60% of the residues in transmembrane segments are represented by only six amino acids: Ala, Phe, Gly, Ile, Leu, and Val. Isotopic labeling of these six amino acids results in the labeling of a majority of residues in the transmembrane segments. This approach should allow for the preservation of useful segment connectivity information, while significantly reducing the spectral complexity. Additionally, this approach can easily be customized for a specific target protein by statistical analysis of its amino acid composition.
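The composition analysis mentioned above amounts to simple residue counting. The following minimal sketch computes the fraction of a sequence covered by a chosen set of labeled residue types and ranks the residue types by abundance. The example sequence is hypothetical and is not taken from any protein discussed here.

```python
from collections import Counter

TM_FAVORED = frozenset("AFGILV")  # Ala, Phe, Gly, Ile, Leu, Val


def labeled_fraction(sequence: str, labeled=TM_FAVORED) -> float:
    """Fraction of residues in `sequence` that belong to the labeled set."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence is empty")
    counts = Counter(seq)
    return sum(counts[aa] for aa in labeled) / len(seq)


def rank_residue_types(sequence: str) -> list:
    """Residue types sorted by abundance, as (one-letter code, fraction) pairs."""
    counts = Counter(sequence.upper())
    total = sum(counts.values())
    return [(aa, n / total) for aa, n in counts.most_common()]


example_tm_segment = "LIVGAFLLAVIGFWLTSAVLG"  # hypothetical 21-residue segment
print(f"covered by AFGILV labeling: {labeled_fraction(example_tm_segment):.0%}")
print("most abundant types:", rank_residue_types(example_tm_segment)[:3])
```

For a real target, the same ranking applied to its predicted transmembrane segments would suggest which residue types to label so that most transmembrane sites are covered.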
Sequential backbone resonance assignment is a prerequisite to fully extract and exploit the wealth of structural information embedded in NMR spectra. While smaller IMPs such as glycophorin A and Bnip3tm have been assigned using HSQC-based NMR pulse sequences [151, 211], resonance assignments for larger membrane proteins have benefited from TROSY and the continued adaptation of conventional multidimensional correlation experiments to include TROSY elements [178, 182, 212-215]. In making assignments for larger membrane proteins, it has often been found that 15N-selected NOESY experiments provide an invaluable route to resolving ambiguous correlations observed in higher dimensional experiments and establishing sequential residue connectivity.
Conventional backbone sequential assignment strategies may be enhanced by alternative labeling schemes and higher dimensionality experiments [46, 84, 199, 214, 217]. The application of non-linearly sampled spectral acquisitions and spectral reconstruction methods has also been used to enhance the application of multidimensional experiments to membrane proteins [205, 218, 219]. Even for very large α-helical proteins and complexes, such as the KcsA tetramer, the use of simple correlation strategies employing backbone amide-amide 1H-1H NOEs may continue to be a viable method to obtain sequential assignments or to complement the use of higher dimensional correlation spectroscopy.
In expanding resonance assignment from the backbone to side chain resonances, major complications can be encountered. Unfavorable T2 relaxation severely reduces the amount of magnetization available for detection at the end of TOCSY-based pulse sequences, often resulting in complete signal loss. For helical membrane proteins, side chain assignments have been limited to small proteins or to flexible termini or loops [46, 199, 217, 220]. While perdeuteration serves to improve the relaxation properties, it also limits the ability to observe and correlate side chain 1H resonances. One particularly promising way around this problem is to assign the methyl groups of Leu, Ile and Val with pulse sequence methods tailored for use with protein that is protonated only on certain methyl groups of these residues and on backbone amides [88, 181, 184, 188, 201, 209, 221-223]. The power of this ILV-selective-methyl-protonation approach is exemplified by studies of the β-barrel OmpX in DHPC micelles and of the monomeric 82 kDa water-soluble malate synthase G (MSG) [48, 181, 183, 222-226]. Sequence-specific resonance assignments of Val-γ(1,2), Leu-δ(1,2) and Ile-δ1 methyl groups in OmpX were reported using 3D (H)C(CC)-TOCSY-(CO)-[15N,1H]-TROSY and 3D H(C)(CC)-TOCSY-(CO)-[15N,1H]-TROSY experiments, which correlate chemical shifts of side-chain carbons and protons with the amide spins of the following residue. In assigning the spectrum of MSG, labeling and pulse schemes were specifically optimized for this larger protein system. In this modified approach, COSY-type relays replaced TOCSY to allow directed coherence transfer from the methyl group down the sidechain to the assigned backbone amide sites or relayed through the carbonyl carbon to avoid transfer losses through the amide.
However, it is sobering to note that in the case of the 40 kDa DAGK in a 100 kDa micellar complex, assignment of ILV methyl groups using this approach has met with only limited success. The main difficulty was obtaining unambiguous assignments through correlation of methyl groups with 13CO and 13Cα, owing to extensive 13C chemical shift degeneracy, a problem exacerbated by the low digital resolution of the indirect dimensions. The alternative route of correlating the methyl protons with the backbone amide 1H-15N was generally not possible because of the severe T2 relaxation-induced loss of magnetization that occurred during the COSY transfers from the methyl down to the amide position. The labor-intensive method of making methyl peak assignments by systematically mutating each Ile, Leu, and Val residue can offer an avenue to resolve ambiguity; however, in the attempt to apply this approach to DAGK, the single mutations often resulted in major rearrangements of peak patterns in the methyl-TROSY spectra due to structural perturbations. DAGK appears to be an especially difficult case, as methyl group assignments have now been completed for several other helical membrane proteins [144, 186, 187, 205, 227].
Another approach, enabled by the increasing availability of 13C-detection-optimized cryoprobes, involves the use of 13C-detection experiments. In terms of proton transverse relaxation rates, large diamagnetic membrane proteins are analogous to paramagnetic proteins. Bertini and his colleagues originally developed direct 13C-detection multidimensional experiments for the assignment of paramagnetic metalloproteins [228-232]. The Dötsch and Pervushin groups have also independently reported 13C-detection experiments for large proteins [233-236]. Although these methods have not been widely used because of their inherently low sensitivity and lengthy recycle delays (a consequence of very long 13C T1 times), the Girvin group successfully applied 13C-detect methods to the membrane transporter Smr. Using 2D CACO, trHNCA, and trHNCO, over 80% of CACO correlations were resolvable, suggesting that this approach has significant potential as an alternative route to resonance assignments.
The notion that typical transmembrane helices are straight, nearly ideal helices which arrange tightly into three dimensional structures by ‘knobs-into-holes geometries’ has a long history. However, as the number of structures deposited in the PDB steadily increases, it has become clear that reality is typically more complex. A variety of membrane protein substructures such as kinks, highly curved helices, and re-entrant loops can be seen in the transmembrane domains of many structures. Here, we review NMR methods for mapping the membrane topology of membrane proteins, which include the use of paramagnetic probes, magnetization transfer-based methods, and amide/water H-H and H-D exchange.
A common route to mapping membrane protein-lipid interfaces is through detection of intermolecular detergent-protein NOEs. For larger protein-micelle complexes, spin diffusion is often present and lowers the structural resolution at which detergent-protein NOEs can be interpreted. For example, the study of KcsA in SDS micelles revealed that the transmembrane segments were delineated by intermolecular SDS-amide proton NOEs. However, a well-defined periodicity is not seen in the NOE patterns for TM segments, suggesting the presence of spin diffusion from amide sites on the detergent-exposed faces of the helices to neighboring amide sites not in direct contact with detergent.
The introduction of water soluble and lipophilic paramagnetic probes into membrane protein samples can also be harnessed to probe topology. HSQC/TROSY spectra are typically monitored as the paramagnetic agent is titrated into a micellar U-15N-IMP sample [220, 238, 239]. Peaks for sites that are fully exposed to the probe will exhibit extensive line broadening, while sites that are inaccessible will yield NMR peaks that are much less perturbed. Water soluble paramagnetic probes originally developed for applications in imaging, such as Gd-DOTA or Gd-DTPA (Fig. 6A and B), are typically employed. The inclusion of a highly polar, cage-like chelate around the paramagnetic ion serves to enhance the solubility of the contrast agent and minimizes the potential for exposed ligand sites on the paramagnetic ion to bind Asp or Glu side chains, a possible complication that can be even further suppressed by including a modest concentration of acetate in the sample buffer. Because the observed line broadening is based on a stochastically diffusing ensemble of water soluble probes, the depth of penetration of the observable PRE into the micelle can be tuned by varying the concentration of the probe [91, 239, 241, 242]. Hydrophobic reagents such as 16-doxyl stearic acid and 5-doxyl stearic acid (Fig. 6C and D) are used to obtain results that inversely complement those from water soluble contrast agents, selectively broadening resonances for sites buried within the membrane while producing minimal relaxation enhancement for segments exposed to solvent [91, 240, 241, 243].
Nitroxides, as well as Mn2+-, Cu2+- and Gd3+-chelates, have extended electron spin relaxation times and near-isotropic magnetic susceptibility tensors, such that their proximity to an NMR spin results primarily in enhancement of NMR relaxation rates (particularly 1/T2), with little induced change in chemical shift. In contrast, a second group of paramagnetic probes results in relatively little relaxation enhancement and instead perturbs NMR chemical shifts through the induction of “pseudo-contact shifts” (PCS). Both free and chelated Co2+, a number of lanthanide(III) ions, and molecular oxygen (O2) have been employed as probes of the PCS class, with O2 representing a hydrophobic probe. A potential complication in the application of O2 as a paramagnetic probe is the requirement for samples to be run at elevated pressures (20-30 atm) in order to maintain sufficient concentrations of dissolved oxygen [244-247]. PCS-induced shifts in resonance positions are sometimes so large that resonance assignments originally made using diamagnetic samples cannot be applied to the PCS-affected spectra, necessitating extensive resonance reassignment. When utilizing a probe for inducing PCS, one avenue for simplifying spectra is to incorporate 19F sites, either by expressing the protein in cell growth media containing fluorinated amino acids or by chemically modifying free Cys-SH positions using thiol-reactive fluorinated reagents [241, 244, 246, 248-252]. The sparse 19F NMR spectra are then recorded under both paramagnetic and diamagnetic conditions.
Another powerful approach for mapping membrane protein topology involves the use of cross-saturation methods, as originally developed by the Shimada lab to map the interfaces of protein-protein complexes [253-259]. These and related methods involve monitoring the transfer of saturation from unlabeled protein or detergent to the amide protons of a 2H,15N-labeled protein, leading to reductions in TROSY/HSQC peak intensities that indicate the location of the target protein interface with the other protein or with detergent, respectively. This method has been applied to membrane proteins to explore the protein/detergent boundary. This experiment should be conducted in 90% D2O buffer solution to suppress secondary saturation via water, while still allowing a detectable population of amide protons to remain. Methods of this genre can be readily applied to probe ligand binding to membrane proteins [258, 261-263] and specific lipid-protein interactions [264, 265].
A final class of topology-mapping methods involves monitoring water-amide H/D or H/H exchange. Due to the small size of water and its ability to chemically exchange with labile amide protons, water-amide exchange can provide information that is complementary to the previously described probes. Amide site resistance to solvent H/H or H/D exchange is a reflection both of amide site location with respect to protein-medium/protein-protein interfaces and of the local protein structure and dynamics surrounding that site. While stable transmembrane helices will be exchange-resistant due both to solvent exclusion from the micelle interior and to the stable hydrogen bonding network along the spine of the helix, more complex (and informative) exchange patterns are often seen at sites of helical deformation for segments that are partially exposed to solvent. For a number of membrane proteins, H→D exchange studies have been carried out, either to measure site-specific rates of exchange [268, 269], or to determine fractionation factors at a fixed incubation time and an array of D2O concentrations. These latter measurements reflect a combination of both kinetic protection and thermodynamic equilibration [150, 267]. While the simple kinetic characterization of H/D exchange can offer insight into the gross topology of a protein, protection arising from exclusion from solvent cannot be readily differentiated from protection arising from hydrogen bonding. When the analysis is extended to fractionation factors, the degree to which the equilibrium occupancy of a given amide site deviates from the relative concentrations of labile protons and deuterons in the bulk solvent strongly suggests the presence or absence of a hydrogen bond. Amide sites involved in a strong hydrogen bond will tend to retain protons or, if starting from a deuterated state, to collect them, against the concentration of the bulk solvent.
Sites not involved in hydrogen bonding or involved in only very weak interactions will typically show an accumulation of deuterons, which has been postulated to arise from a weakening of the in-line stretching modes and restriction of the bending modes [270, 271]. More practically, the use of lower concentrations of D2O, relative to the high concentrations typically used when observing kinetic H/D exchange, can often allow observation of subtle differences in the exchange properties of exposed segments, arising from the presence of structural elements or amphipathicity; such segments are prone to rapid exchange and may otherwise be overlooked.
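The link between fractionation factor and amide proton occupancy can be made concrete with a short sketch. It assumes the usual definition of the fractionation factor, phi, as the amide D/H ratio divided by the solvent D/H ratio at equilibrium, and a solvent described only by its D2O mole fraction. The function and argument names are illustrative.

```python
def proton_occupancy(d2o_fraction: float, phi: float) -> float:
    """Equilibrium fraction of an amide site occupied by 1H.

    phi = ([D]/[H])_amide / ([D]/[H])_solvent, so with solvent D2O mole
    fraction x the amide D/H ratio is phi * x / (1 - x), which gives
    occupancy = (1 - x) / ((1 - x) + phi * x).
    """
    if not 0.0 <= d2o_fraction <= 1.0:
        raise ValueError("D2O mole fraction must lie between 0 and 1")
    if phi <= 0.0:
        raise ValueError("fractionation factor must be positive")
    x = d2o_fraction
    return (1.0 - x) / ((1.0 - x) + phi * x)


# In 50% D2O, a site with phi < 1 holds more protons than the bulk solvent
# composition alone would predict, and a site with phi > 1 holds fewer.
for phi in (0.5, 1.0, 1.5):
    print(f"phi = {phi}: proton occupancy = {proton_occupancy(0.5, phi):.2f}")
```

Fitting measured peak intensities at several D2O fractions to this expression is one way such factors could be extracted, although the sketch makes no claim about the fitting procedures used in the cited work.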
In addition to H/D exchange measurements, amide site accessibility to water can also be monitored through the observation of 1H-1H NOE crosspeaks to water during NOESY experiments or using cross-saturation methods [253, 272-274].
The ability to observe and assess the dynamic states of membrane proteins across a broad range of timescales illustrates one of the most significant and unique strengths of NMR structural studies. Protein dynamics are intricately tied to important biological processes, such as catalysis and folding. In addition, details about protein motions provide information that is essential for the reliable application of residual dipolar couplings (RDCs) and paramagnetic relaxation enhancements (PREs) in structural studies. To date, most NMR-based dynamics studies of membrane proteins have focused on backbone motions. Due to the relative ease of observing all the backbone sites through simple 1H-15N-correlated spectra, backbone dynamics are typically assessed through steady-state 1H-15N NOE and 15N R1 and R2 measurements.
A number of membrane proteins have been successfully examined using the 1H/15N steady-state NOE experiment [275, 276], for which TROSY-based versions are available. Because 1H-15N heteronuclear NOEs are sensitive to the mobility of the individual N-H bond vectors on the pico-to-nanosecond timescale, the heteronuclear NOEs along the protein sequence clearly reflect changes in local backbone motions. Assessed globally, backbone dynamics can point to which segments are involved in ordered tertiary structure and therefore warrant greater scrutiny for long-range contact information. Clear differences in dynamics can also be observed for segments of the protein that interact with micelles, which typically show the highest positive NOE values, in contrast with weakly associated amphipathic domains and soluble domains, which typically have additional motions that result in lower values. N- and C-termini and large interhelical loops often show negative heteronuclear NOE values, indicating a high degree of flexibility [46, 150, 151, 186, 217, 279-281].
Both conventional and TROSY-based versions of R1 and R2 relaxation experiments have been successfully applied to membrane proteins [218, 276, 282]. The overall rotational correlation time may be estimated from the T1/T2 ratio averaged over the residues showing the highest 1H-15N heteronuclear NOE values [23, 46, 151, 199, 220, 283]. From this global correlation time (τc), the effective molecular weight of the complex can be estimated, allowing some insight into the possible oligomeric state of the IMP-detergent complex [46, 151, 199].
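One commonly quoted way to turn a 15N T1/T2 ratio into an overall correlation time is the slow-tumbling approximation tau_c = sqrt(6 T1/T2 - 7) / (4 pi nu_N), where nu_N is the 15N resonance frequency in Hz. The sketch below applies it. It assumes a rigid, isotropically tumbling molecule with no exchange contribution to T2, which is why only well-ordered residues should be averaged. The input values are hypothetical, and the cited studies may have used other procedures.

```python
import math


def tau_c_from_t1_t2(t1_s: float, t2_s: float, n15_freq_hz: float) -> float:
    """Overall rotational correlation time (s) from 15N T1 and T2 (s).

    Slow-tumbling approximation: tau_c = sqrt(6*T1/T2 - 7) / (4*pi*nu_N),
    with nu_N the 15N resonance frequency in Hz.
    """
    argument = 6.0 * t1_s / t2_s - 7.0
    if argument <= 0.0:
        raise ValueError("T1/T2 is too small for this approximation")
    return math.sqrt(argument) / (4.0 * math.pi * n15_freq_hz)


# Hypothetical averages over rigid residues, measured at 600 MHz (1H),
# where the 15N frequency is about 60.8 MHz.
tau_c = tau_c_from_t1_t2(t1_s=1.5, t2_s=0.030, n15_freq_hz=60.8e6)
print(f"estimated tau_c is about {tau_c * 1e9:.1f} ns")
```

Converting the resulting correlation time into an effective molecular weight requires a further hydrodynamic model or an empirical calibration, which is not attempted here.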
A popular and powerful method for analyzing 15N R1, 15N R2, and 1H-15N steady-state NOE relaxation data is to utilize the Lipari-Szabo or model-free formalism of the spectral density function [284, 285]. Mapping relaxation data to dynamics using a model-free approach links the amplitude of motion (S2) to its timescale (τe). S2 is the order parameter and reflects the amplitude of the motions of the 1H-15N bond vector. The order parameter varies from 0 to 1.0, with values approaching 1.0 indicative of rigidity and values nearing zero corresponding to highly dynamic regions. τe is the correlation time for local internal motion and provides the timescale of local amide bond vector fluctuations.
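For reference, the original model-free spectral density for isotropic tumbling can be written as J(w) = (2/5) [S2 tau_c / (1 + (w tau_c)^2) + (1 - S2) tau / (1 + (w tau)^2)], with 1/tau = 1/tau_c + 1/tau_e. The sketch below evaluates this form. The argument names are chosen for illustration, and extended or anisotropic variants of the model are not covered.

```python
def spectral_density(omega: float, s2: float, tau_c: float, tau_e: float) -> float:
    """Model-free spectral density J(omega) for isotropic overall tumbling.

    omega : angular frequency in rad/s
    s2    : generalized order parameter, between 0 and 1
    tau_c : overall rotational correlation time in s
    tau_e : effective internal correlation time in s
    """
    if not 0.0 <= s2 <= 1.0:
        raise ValueError("S2 must lie between 0 and 1")
    tau = 1.0 / (1.0 / tau_c + 1.0 / tau_e)  # combined time constant
    overall = s2 * tau_c / (1.0 + (omega * tau_c) ** 2)
    internal = (1.0 - s2) * tau / (1.0 + (omega * tau) ** 2)
    return 0.4 * (overall + internal)


# A rigid site (S2 near 1) keeps nearly all spectral density at low frequency,
# whereas a flexible site (low S2) loses much of it.
for s2 in (0.9, 0.4):
    j0 = spectral_density(0.0, s2, tau_c=20e-9, tau_e=50e-12)
    print(f"S2 = {s2}: J(0) = {j0:.2e} s")
```

Relaxation rates R1, R2, and the heteronuclear NOE are linear combinations of J evaluated at a handful of frequencies, so fitting S2 and tau_e amounts to matching those combinations to the measured values.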
R1, R2, and steady-state NOE experiments generally probe fast timescale events from picoseconds to nanoseconds. However, because chemical exchange contributes to T2 relaxation but not to T1 relaxation, one can estimate whether significant local conformational exchange on the microsecond to millisecond time scale is contributing to the T2 relaxation rate at a given amide bond vector. The simplest approach to inferring the presence of chemical exchange has been to determine the mean T1/T2 ratio and to assign residues as exhibiting chemical exchange if their T1/T2 value is larger than the mean T1/T2 ratio plus one standard deviation. More precise NMR methods are available to probe slower motional events; these methods include longitudinal magnetization exchange, line shape analysis, CPMG relaxation dispersion, and R1ρ relaxation dispersion [287, 288].
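The mean-plus-one-standard-deviation criterion is easy to express in code. The sketch below flags residues whose T1/T2 ratio exceeds that threshold. The use of the sample standard deviation is an assumption, since the criterion as stated does not specify it, and the residue numbers and ratios are hypothetical.

```python
from statistics import mean, stdev


def flag_exchange_candidates(t1_over_t2: dict) -> list:
    """Residues whose T1/T2 exceeds the mean by more than one standard deviation.

    `t1_over_t2` maps residue number to the measured T1/T2 ratio.
    """
    values = list(t1_over_t2.values())
    if len(values) < 2:
        raise ValueError("need at least two residues to compute a standard deviation")
    threshold = mean(values) + stdev(values)  # sample standard deviation (assumed)
    return sorted(res for res, ratio in t1_over_t2.items() if ratio > threshold)


# Hypothetical ratios: residue 15 stands out from the rest.
ratios = {10: 10.0, 11: 10.0, 12: 11.0, 13: 9.0, 14: 10.0, 15: 20.0}
print("possible exchange sites:", flag_exchange_candidates(ratios))
```

Because outliers inflate both the mean and the standard deviation, a screen of this kind is best treated as a first pass before the more precise methods are applied.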
The Carr-Purcell-Meiboom-Gill (CPMG) relaxation dispersion and spin-locking (R1ρ) relaxation dispersion methods are useful to probe differences between protein conformations on the micro-to-millisecond timescale [289, 290]. These approaches quantitate the relative equilibrium conformer populations, the conformational exchange rate constant (kex), and the difference in chemical shifts between the two conformations (Δω). In these methods, R2 is measured as a function of the effective field (ωe). As ωe is increased, the chemical exchange contribution to R2 is diminished, and an R2 dispersion curve can be generated. The relaxation dispersion curve is then fitted to extract information about chemical exchange [287, 291, 292]. R1ρ and CPMG experiments give insight into slightly different timescales, with the exchange phenomena probed by R1ρ experiments being about an order of magnitude faster than those probed by CPMG.
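The shape of a dispersion curve can be illustrated with the fast-exchange two-state expression, R2eff = R2_0 + (pA pB dw^2 / kex) [1 - (4 nu / kex) tanh(kex / (4 nu))], where nu is the CPMG frequency and dw is the chemical shift difference in rad/s. This form is valid only when kex is much larger than dw, and conventions for defining nu differ between papers; more general treatments are needed otherwise. The parameter values below are hypothetical.

```python
import math


def r2_eff_fast_exchange(nu_cpmg_hz: float, r2_0: float, p_a: float,
                         delta_omega: float, k_ex: float) -> float:
    """Effective R2 (1/s) for two-site exchange in the fast-exchange limit.

    nu_cpmg_hz  : CPMG frequency in Hz (must be positive)
    r2_0        : exchange-free transverse relaxation rate in 1/s
    p_a         : population of the major state (p_b = 1 - p_a)
    delta_omega : chemical shift difference between states in rad/s
    k_ex        : exchange rate constant in 1/s
    """
    if nu_cpmg_hz <= 0.0:
        raise ValueError("CPMG frequency must be positive")
    p_b = 1.0 - p_a
    r_ex = p_a * p_b * delta_omega ** 2 / k_ex  # exchange term at slow pulsing
    x = k_ex / (4.0 * nu_cpmg_hz)
    return r2_0 + r_ex * (1.0 - math.tanh(x) / x)


# Hypothetical site: 10% minor state, 100 Hz shift difference, kex = 5000 /s.
params = dict(r2_0=10.0, p_a=0.9, delta_omega=2 * math.pi * 100.0, k_ex=5000.0)
for nu in (50.0, 200.0, 1000.0):
    print(f"nu_CPMG = {nu:6.0f} Hz: R2eff = {r2_eff_fast_exchange(nu, **params):.2f} /s")
```

The decay of R2eff toward R2_0 with increasing pulsing rate is the dispersion that is fitted in practice. In the fast-exchange limit only the product pA pB dw^2 is determined, so populations and shift differences cannot be separated without further information.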
Studies of PagP and KcsA provide two excellent examples of applying relaxation experiments to membrane proteins in order to better understand function. PagP is a β-barrel enzyme that catalyzes palmitoylation of lipopolysaccharides. Kay and coworkers thoroughly characterized the structure and dynamics of PagP. The dynamics of the two conformational states of PagP were probed with Nzz and 15N CPMG relaxation dispersion [283, 293]. Motional differences between conformational states helped to explain how a phospholipid substrate likely enters the β-barrel to reach what appears to be the active site.
The dynamics of the bacterial potassium ion channel KcsA have been thoroughly investigated in SDS micelles by Bax and coworkers [170, 269, 282]. They introduced a 3D TROSY-HNCO-based method to measure 15N relaxation parameters, which allowed measurement of an impressively comprehensive set of R1, R2, heteronuclear-NOE, and cross-correlated relaxation values. These 3-dimensional experiments are especially useful when dealing with α-helical membrane proteins, which are notorious for limited spectral dispersion.
Use of NMR to provide insight into overall membrane protein tumbling can yield fundamental information, such as the oligomeric state, or can be used in sample optimization. An emerging NMR technique for approximating the effective rotational correlation time (τc) of large complexes is the “TROSY for rotational correlation times” (TRACT) experiment, which makes use of chemical shift anisotropy and dipole-dipole cross-correlated relaxation of amide groups to estimate τc. [15N, 1H]-TRACT experiments have been applied to OmpX and appear to be an efficient and straightforward method for elucidating the global correlation time. Similarly, translational diffusion coefficients can be determined using NMR and then interpreted using the Stokes-Einstein equation to give insights into aggregate size [295, 296].
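The Stokes-Einstein interpretation of a diffusion coefficient can be sketched in a few lines: for a sphere, the hydrodynamic radius is R_h = k_B T / (6 pi eta D). The example below assumes a spherical particle and the viscosity of pure water at 25 °C; real samples in D2O or concentrated detergent solutions will have different viscosities, and the diffusion coefficient used is hypothetical.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23


def hydrodynamic_radius_m(diffusion_m2_per_s: float, temperature_k: float,
                          viscosity_pa_s: float) -> float:
    """Stokes-Einstein hydrodynamic radius (m) of an equivalent sphere."""
    if diffusion_m2_per_s <= 0.0 or viscosity_pa_s <= 0.0:
        raise ValueError("diffusion coefficient and viscosity must be positive")
    return (BOLTZMANN_J_PER_K * temperature_k
            / (6.0 * math.pi * viscosity_pa_s * diffusion_m2_per_s))


# Hypothetical protein-detergent complex in water at 25 C (eta about 0.89 mPa s).
radius = hydrodynamic_radius_m(diffusion_m2_per_s=8.0e-11,
                               temperature_k=298.15,
                               viscosity_pa_s=0.89e-3)
print(f"hydrodynamic radius is about {radius * 1e9:.1f} nm")
```

Comparing the radius obtained for the protein-containing aggregate with that of empty micelles measured under the same conditions gives a qualitative check on aggregate size.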
Backbone and 13Cβ chemical shifts are usually interpreted to predict secondary structure through chemical shift index (CSI) analysis and/or to predict backbone torsional angles through database mining against fragments of known structure, such as employed by TALOS. In general, the extension of these prediction methods to IMPs is straightforward, with the main source of ambiguity arising from the difficulty of obtaining complete sets of measurements for IMPs. The Cβ shifts are often particularly challenging to determine in very large proteins and complexes, and perdeuteration can prevent measurement of the Hα shifts.
The combination of chemical shift data sets with powerful structural prediction methods such as ROSETTA is currently being explored as a possible route to reliable high resolution structure prediction and determination. Results for small water soluble proteins are very encouraging, and substantial efforts are underway to adapt these methodologies to IMP structure prediction [300, 301]. In addition to de novo structure prediction, development of methods for coupling sparse NMR restraints with structural prediction algorithms is underway. Preliminary examples of the use of hybrid NMR/modeling approaches to predict the structures of membrane protein complexes have appeared, with the results currently being subjected to follow-up experimental validation and refinement [302-304].
The classical paradigm for protein structure determination by solution state NMR spectroscopy is to extract and assign a dense network of 1H-1H NOEs in order to define the three-dimensional fold of a protein. Ideally the experimental component of this approach yields more than 7 NOE contacts per residue. In this paradigm “long-range” inter-residue NOEs, preferably across appreciable distances in the sequence, define the global fold. For β-barrel proteins, the dense amide-amide hydrogen bonding network between β-strands provides a reservoir of long-range NOEs that enable the successful application of traditional NOE-based structure determination [283, 305-307]. In helical membrane proteins the hydrogen bonding network only extends between turns of the helices, yielding backbone NOEs within the same helical element. While these contacts aid in defining the helices, they fail to provide long range information about the tertiary fold. For numerous water soluble proteins and for some smaller non-deuterated membrane proteins, valuable long-range NOE contacts would normally be measured between protons in the side chains [211, 220, 308, 309]. However, for larger helical IMPs such NOEs are often inaccessible due to sample perdeuteration or failure to complete side chain assignments. For such proteins the problem of observing side chain proton resonances/NOEs while maintaining the advantages of deuteration can be ameliorated by application of fractional deuteration schemes or selective ILV methyl protonation. This latter approach contributed to determination of the structures of the mostly-helical tetraspan IMP DsbB, OmpX, VDAC, the KpOmpA porin, and the platelet integrin transmembrane/cytosolic domain heterodimer, and to studies of KcsA interactions with a toxin.
The inclusion of even modest numbers of interhelical NOEs in conjunction with additional structural restraints, such as RDCs and PREs, offers the possibility of drastically improving both the quality and the throughput of membrane protein structure determination.
In light of challenges associated with measuring long-range NOEs for helical membrane proteins, a number of alternative strategies have been explored to provide complementary structural restraints, including the controlled re-introduction of anisotropic interactions: dipolar coupling and chemical shift anisotropy (CSA). Historically, solution state NMR has benefited from the spectral dominance of discrete isotropic resonances and manageable relaxation times, as intrinsic to rapidly tumbling molecules. Unfortunately, the effective averaging out of anisotropic spin tensors under isotropic conditions also eliminates the distance and orientational information associated with dipolar coupling and CSA. Solid-state NMR has developed a number of methodologies to harness these strong anisotropic interactions for use as structural restraints, either by employing uniformly-aligned samples or by magic angle sample spinning and RF-modulated recoupling. The measurement and applications of anisotropic interactions have resurfaced in solution state NMR through marginal sample alignment. Weak alignment can be introduced into an isotropic sample through either steric or electrostatic interactions with an alignment medium such as bicelles, filamentous phage, strained or charged polyacrylamide gel matrices, or by the attachment of a strong paramagnetic director to the protein, such as a lanthanide ion [150, 310-321]. Marginal alignment leads to the reintroduction of dipolar coupling into the NMR spectra, but at only 0.1-0.5% of the magnitude of the static anisotropic interactions. Dipolar couplings in the range of roughly −15 to +15 Hz are observed for highly proximal (usually directly-bonded) spin pairs, while longer range couplings are so small that only a modest degree of undesired resonance broadening occurs.
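The orientational information carried by a residual dipolar coupling can be illustrated with a minimal sketch. It assumes the commonly used parameterization in which the coupling depends on the polar angles (θ, φ) of the internuclear vector in the principal frame of the alignment tensor, an axial component Da, and a rhombicity R. The function name and the numerical values are hypothetical and are not taken from the studies discussed here.

```python
import math


def rdc_hz(theta_rad, phi_rad, d_axial_hz, rhombicity):
    """Residual dipolar coupling for a fixed-length internuclear vector.

    Assumed convention:
        D = Da * [(3 cos^2(theta) - 1) + 1.5 * R * sin^2(theta) * cos(2 phi)]
    where theta and phi are the polar angles of the vector in the
    principal axis frame of the alignment tensor.
    """
    cos2 = math.cos(theta_rad) ** 2
    sin2 = math.sin(theta_rad) ** 2
    return d_axial_hz * ((3.0 * cos2 - 1.0) + 1.5 * rhombicity * sin2 * math.cos(2.0 * phi_rad))


# Hypothetical axially symmetric alignment with Da = 7.5 Hz.
parallel = rdc_hz(0.0, 0.0, 7.5, 0.0)                 # vector along the tensor z axis
perpendicular = rdc_hz(math.pi / 2.0, 0.0, 7.5, 0.0)  # vector in the xy plane
magic = rdc_hz(math.acos(1.0 / math.sqrt(3.0)), 0.0, 7.5, 0.0)
print(parallel, perpendicular, magic)
```

Because several orientations give the same coupling, a single RDC does not fix a bond vector uniquely; in practice couplings for many vectors (and often more than one alignment medium) are combined during structure calculation.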
For membrane protein structural studies, strained polyacrylamide gels have most frequently been used to impose marginal alignment on IMP/micellar complexes, a method that avoids both direct association of IMPs with the matrix and disruption of the alignment matrix by detergents. For strained polyacrylamide gels, several methods have been proposed to transfer the protein/detergent solution into the gel matrix, including simple passive diffusion during a soaking process, co-polymerization, and electrophoretic migration [311, 322]. In practice, each method suffers from its own shortcomings. Passive diffusion of the protein from a bathing solution into a hydrated hydrogel often results in some degree of dilution due to the large water content of the hydrogel. This can be minimized either by soaking the hydrogel in a large sample volume or by starting with a desiccated hydrogel. However, there appears to be some sensitivity of acrylamide hydrogels to detergents, such that a desiccated hydrogel may fail to fully rehydrate back to its initial dimensions. Polymerization of the gel in the presence of the guest protein has in some cases been shown to be successful. However, the free-radical initiated polymerization may produce undesired side reactions that damage or even immobilize the protein, as appears to be the case for micellar DAGK (Sanders, unpublished). Electrophoresis offers a gentle way to migrate the protein/detergent mixture into the gel matrix that can be further tailored to the charged state of the protein/detergent complex, although this technique has not yet been extensively applied.
Once the protein/detergent solution is in a gel, the gel cavities are deformed either by stretching a radially thicker gel into a narrower diameter tube or by compressing a longer gel to a shorter length.
Recently, several promising alternatives to polyacrylamide gels have been reported that employ detergent-resistant DNA assemblies: DNA nanotubes formed by helical bundles and stacks of G-tetramers [323-325]. The widespread applicability of these methods seems currently to be limited only by the availability and expense of these nucleic acid-based reagents.
A number of methodologies have been developed to accurately measure residual dipolar couplings, depending on the spin system targeted. For membrane proteins such measurements are particularly challenging because of the broad resonances arising from large aggregate size and, sometimes, from exchange broadening. The majority of residual dipolar couplings measured to date for membrane proteins involve directly bonded spin pairs: HN-N, Hα-Cα, N-CO, and CO-Cα. The RDCs between the HN-N and the Hα-Cα benefit from their larger magnitudes, but the need for perdeuteration often limits the accessibility of the Hα-Cα for RDC measurement. The N-CO and CO-Cα couplings are harder to measure because they are often small compared to the broad line widths that are often associated with IMPs. Nevertheless, recent studies of KcsA, OmpA, and DsbB have demonstrated that these couplings can be measured even for sizeable complexes [186, 269, 313, 327].
An alternative to 1H-1H NOEs as a route to long-range distance information is found in the application of paramagnetic relaxation enhancement (PRE) measurements. This approach exploits the distance-dependent line broadening of NMR resonances caused by the interaction of an NMR-active nucleus with an unpaired electron. A number of reagents are now available for specifically derivatizing cysteine thiol groups in proteins with thiol-reactive nitroxide spin labels or with chelating agents to which a paramagnetic ion can be tightly complexed, several of which are shown in Fig. 6E-J [320, 328-333]. For IMPs that have multiple cysteine residues, this approach is generally preceded by mutagenesis to replace all reactive cysteine sites in the wild type protein sequence so that single accessible cysteine sites can be reintroduced to probe different areas of the protein. In cases where removal of cysteine sites is not a viable option, an alternative approach involves the use of paramagnetic metal ion-coordinating polypeptide sequences inserted at the termini of the target protein [319, 320, 330, 334-338].
As with any study involving mutagenesis and/or the introduction of a covalent probe into a protein, there exists a strong need to re-affirm the functional state of the protein after amino acid replacement and paramagnetic labeling. Once a probe has been successfully incorporated, the effects of the PRE are typically measured by comparing peak line widths and resonance intensities between spectra acquired under paramagnetic and diamagnetic conditions. For nitroxide spin labels, either sets of spectra are collected before and after quenching the nitroxide with ascorbic acid, or the spectrum from a spin-labeled sample is compared with the spectrum from a matched sample labeled with a non-paramagnetic analog of the nitroxide probe. For paramagnetic ion-chelate tags (Fig. 6H-J), a spectrum, typically a 1H-15N TROSY, for a sample containing a paramagnetic ion (Gd3+, Mn2+, or Cu2+) is compared to a spectrum from a matched sample with a diamagnetic ion having similar coordination chemistry, such as Ca2+ or La3+.
For nitroxide probes, the changes in T2 observed between the paramagnetic and diamagnetic spectra yield long-range distances in the range of 12-23 Å using the Solomon-Bloembergen equation. Peaks from nuclei closer than 12 Å to the spin label will usually be severely broadened, while peaks from nuclei greater than 23 Å from the probe will undergo only minor (difficult-to-quantify) reductions in intensity. The application of this PRE methodology has been demonstrated as a valuable source of structural data for several large membrane protein systems [186, 220, 302, 307, 330, 333]. Usually, the difference in resonance T2 between diamagnetic and paramagnetic samples is determined by measuring peak intensities in both states and the linewidth in the diamagnetic state for resonances which remain well-resolved under paramagnetic conditions. Recent results suggest that there may be an insensitivity of the observed PRE to applied window functions, such as Gaussian multiplication and shifted sine bell processing, typically applied to provide resolution enhancement (Van Horn, et al. submitted). This is an important observation given the limited spectral resolution which is inherent in the spectra of many IMPs, for which resolution enhancement via a window function would dramatically increase the number of sites for which PREs may be accurately measured.
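The conversion of a measured PRE into a distance can be sketched as follows. This is a minimal example that assumes a widely used simplified form of the Solomon-Bloembergen equation for a nitroxide-proton pair, r = [K (4τc + 3τc / (1 + ωH²τc²)) / Γ2]^(1/6), with K taken as 1.23 × 10⁻³² cm⁶ s⁻² and τc approximated by the global correlation time. The constant, the function names, and the example inputs are assumptions made for illustration and are not values reported in the work reviewed here.

```python
import math

# Assumed constant for a nitroxide electron / proton pair, in cm^6 s^-2.
K_NITROXIDE_1H = 1.23e-32


def _spectral_term(tau_c_s, proton_larmor_hz):
    """4*tau_c + 3*tau_c / (1 + (omega_H * tau_c)^2), in seconds."""
    omega_h = 2.0 * math.pi * proton_larmor_hz
    return 4.0 * tau_c_s + 3.0 * tau_c_s / (1.0 + (omega_h * tau_c_s) ** 2)


def pre_distance_angstrom(gamma2_per_s, tau_c_s, proton_larmor_hz):
    """Electron-proton distance (in Angstrom) from the paramagnetic R2 enhancement."""
    if gamma2_per_s <= 0:
        raise ValueError("the PRE rate must be positive")
    r_cm = (K_NITROXIDE_1H * _spectral_term(tau_c_s, proton_larmor_hz) / gamma2_per_s) ** (1.0 / 6.0)
    return r_cm * 1e8  # 1 cm = 1e8 Angstrom


def pre_rate_per_s(distance_angstrom, tau_c_s, proton_larmor_hz):
    """Inverse relation: paramagnetic R2 enhancement expected at a given distance."""
    r_cm = distance_angstrom * 1e-8
    return K_NITROXIDE_1H * _spectral_term(tau_c_s, proton_larmor_hz) / r_cm ** 6


# Hypothetical example: 20 s^-1 enhancement, tau_c = 20 ns, 600 MHz proton frequency.
example_distance = pre_distance_angstrom(20.0, 20e-9, 600e6)
print(f"estimated distance: {example_distance:.1f} Angstrom")
```

The sixth-root dependence makes the distance relatively tolerant of errors in the measured rate, and the steep r⁻⁶ falloff is consistent with the limited usable window (roughly 12-23 Å) noted in the text.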
In the application of PRE restraints, there are several additional considerations that should be made. The first involves the extent to which the targeted site on the protein has been paramagnetically tagged. Incomplete tagging will give rise to an attenuated PRE due to the underlying untagged protein. The persistence of resonances for the residue containing the paramagnetic tag or from closely neighboring sites is an indicator of incomplete tagging, indicating the need for additional purification or further reaction with the thiol-reactive reagent. Burial of the paramagnetic tag in the apolar environment of the micelle may complicate chemical reduction of the nitroxide spin label or exchange of the paramagnetic ion for a diamagnetic ion, in which case production of separate-but-matched paramagnetic and diamagnetic samples may offer a more practical approach.
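The effect of incomplete tagging can be illustrated with a simple two-population sketch. It assumes that the tagged and untagged protein give peaks at the same position whose heights add, so that the untagged fraction contributes full (diamagnetic) intensity. The function name and the numbers are hypothetical.

```python
def observed_intensity_ratio(true_ratio, tagged_fraction):
    """Apparent paramagnetic/diamagnetic peak-height ratio for a partly tagged sample.

    true_ratio      -- ratio that a fully tagged sample would give (0..1)
    tagged_fraction -- fraction of protein molecules carrying the tag (0..1)
    Assumes peak heights from tagged and untagged molecules are additive.
    """
    if not 0.0 <= tagged_fraction <= 1.0:
        raise ValueError("tagged_fraction must lie between 0 and 1")
    return tagged_fraction * true_ratio + (1.0 - tagged_fraction)


# Hypothetical site that would retain 20% of its intensity if fully tagged,
# observed in a sample where only 80% of the protein is tagged.
apparent = observed_intensity_ratio(0.20, 0.80)
print(f"apparent intensity ratio: {apparent:.2f}")
```

Under these assumptions the apparent ratio can never drop below the untagged fraction, which is why residual intensity at or near the tagged residue is a useful diagnostic of incomplete labeling.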
As noted earlier, paramagnetic ions with short electron relaxation times and highly anisotropic electronic magnetic susceptibility tensors generate relatively little PRE, but can induce sizable distance-dependent pseudo-contact changes in the NMR chemical shifts of nearby NMR-active nuclei. These probes include Co2+ and a number of the trivalent lanthanide ions. Pseudo-contact shifts can be employed in structure determination, although their interpretation is more complicated than for PREs because measurements are dependent not only on distance but also on the orientation of the non-isotropic electron spin tensor with respect to the target nucleus [340-344]. Because paramagnets with strongly anisotropic magnetic susceptibility tensors can also induce weak molecular alignment, care must be taken to account for potential RDC/CSA-induced perturbation of the observed PRE/PCS. Thiol-reactive chelating reagents are now available that are superior to previous reagents in that they lead to only a single stereoisomer of the metal ion complex instead of two or more isomers, each with its own susceptibility tensor (Fig. 6I and J) [314, 320].
The presence of local dynamics in a membrane protein can both help and hinder NMR spectroscopy and generally complicates the employment of restraints for structural determination. Spectral properties often benefit from the motion of dynamic segments, as is evident in the ease by which peaks are observed from the C-terminus of bovine rhodopsin relative to more rigid domains of the protein. However, sharp and intense resonances from mobile segments can obscure observation of the broader underlying resonances from the transmembrane and micelle-associated segments because of dynamic range issues. A second problem arises from motions that occur at frequencies that result in exchange-broadening, a source of peak broadening which is not obviated by use of TROSY methods. Because internal protein motions can be highly temperature dependent, variation of temperature is often the most effective tool for driving the rate of unfavorable dynamics into a more spectroscopically-tractable fast or slow rate regime. The intriguing possibility that the membrane-mimetic media used in solution NMR studies of IMPs can be tailored to manage internal protein motions has not been extensively explored but offers the promise of further extending the application of solution NMR for the study of IMPs.
A more vexing concern is that structural interpretation of RDCs and PRE restraints is complicated by local protein dynamics. In the case of residual dipolar couplings, internal protein motions will further attenuate the RDCs. For structural studies where only a limited number of RDCs have been collected, measurements from segments with local motions are not typically utilized as primary restraints; however, they do retain the potential to be applied as lower limit restraints during refinement. In systems where an extensive set of RDCs has been acquired, there exists the possibility to fit independent alignment frames to the RDCs from each dynamically-uniform segment and retain some of the structural context of the motionally averaged couplings. Additional insight is also provided by studies of denatured and intrinsically unstructured proteins, where statistical ensemble conformations generated by molecular dynamics calculations may facilitate the identification of transiently formed structural elements [346, 347].
Paramagnetic restraints are also very sensitive to local dynamics. The amount of PRE-induced line broadening that occurs when either the probe or the target nucleus is on a locally-mobile segment is not directly proportional to the averaged distance, but has an r−6 dependence that weights the distances of closest approach, even when such distances are infrequently sampled. Specifically, strong PREs may be observed between a spin label and NMR sites which have average distances too large for detection of a PRE but that transiently sample distances <12 Å as a result of large amplitude motions at rates that are rapid on the NMR time scale (specifically, when kex is much greater than the degree by which the transverse relaxation rate is enhanced by the paramagnetic probe). For this reason, the ideal placement of the paramagnetic center should be on a rigid segment of the protein to allow straightforward interpretation of the PREs.
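The bias toward distances of closest approach can be made concrete with a small sketch. It assumes fast exchange between discrete conformations, so that the observed PRE reflects the population-weighted average of r⁻⁶. The two-state populations and distances below are hypothetical.

```python
def pre_effective_distance(distances_angstrom, populations):
    """Apparent distance <r^-6>^(-1/6) for conformers in fast exchange."""
    if len(distances_angstrom) != len(populations):
        raise ValueError("need one population per distance")
    total = sum(populations)
    if total <= 0:
        raise ValueError("populations must sum to a positive value")
    mean_inv_r6 = sum(p * d ** -6 for d, p in zip(distances_angstrom, populations)) / total
    return mean_inv_r6 ** (-1.0 / 6.0)


# Hypothetical mobile segment: 95% of the time 30 Angstrom from the spin label,
# 5% of the time only 10 Angstrom away.
distances = [30.0, 10.0]
populations = [0.95, 0.05]
effective = pre_effective_distance(distances, populations)
arithmetic_mean = sum(d * p for d, p in zip(distances, populations)) / sum(populations)
print(f"r^-6 averaged distance: {effective:.1f} A; arithmetic mean: {arithmetic_mean:.1f} A")
```

In this example the sparsely populated close-approach state pulls the apparent distance well inside the range where PREs are readily measured, even though the arithmetic mean distance lies outside it.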
Closely related to the complications of dynamics in interpretation of PRE data is the question of how to treat the uncertainty and possible heterogeneity of the paramagnetic probe position during structural calculations, which reflects a lack of data defining the side chain conformation and dynamics at the probe-derivatized cysteine site. To address this point of concern, the reader is referred to previous discussions of this important problem in structure determinations that employ PREs or related EPR measurements [339, 348-350].
In the last few years, a burst of progress has established solution NMR as an increasingly routine method for studying the structures and interactions of multispan IMPs. While each new target protein typically produces a new set of challenges, we here survey noteworthy recent examples of applying solution NMR to IMPs. A gallery of multi-span IMP structures determined using solution state NMR methods as of April 2009 is shown in Fig. 7.
Many single-span transmembrane proteins and peptides have been studied in micelles using solution NMR [18, 302, 351-355]. In favorable cases, resonance assignments for small proteins can be obtained by simple 1H homonuclear spectroscopy, thus circumventing the need for additional isotopic labeling. The extension of early structural studies of monomeric single-span proteins to higher ordered oligomers usually requires more advanced 13C and 15N labeling, as exemplified by the NMR structural studies of the glycophorin A homodimer in DPC micelles. This study represented an important early accomplishment both for technical reasons and because of the insight that the glycophorin A structure provides into membrane protein folding and stability. Recent studies in our lab of the homodimeric single-span C-terminal domain of the amyloid precursor protein (100 residues plus a 30-residue tag, 70 kDa micellar complex) required not only 15N and 13C labeling, but also perdeuteration and the use of a 900 MHz instrument in order to complete backbone resonance assignments. This study provided both structural insight into a medically-important protein and evidence that the amyloid precursor protein forms a stoichiometric complex with cholesterol, an observation that may be closely related to the native function of this protein and to the etiology of Alzheimer's disease. Determination of the structure of the heterodimeric transmembrane domain of the alphaIIbbeta3 integrin in isotropic bicelles (82 residues total) required the use of ILV-methyl-selective protonation as a route to partial side chain assignments and key NOEs between subunits.
Several recent studies have focused on higher oligomers such as the tetrameric M2 proton channel of the influenza virus in DHPC micelles. Building on elements applied in earlier studies of the phospholamban pentamer [356, 357], the M2 channel study illustrates how a balance can be achieved between the need to preserve essential protein biochemical features while tailoring the system to optimize the spectroscopy. To improve spectral properties while retaining the ability to form native-like tetramers in micelles, the 97 residue M2 protein was minimized to a 43 residue construct (residues 18-60) containing a small unstructured N-terminus, a channel-forming transmembrane helix, a short interhelical loop, and a short C-terminal amphipathic helix. Inclusion of very high concentrations of a channel-blocking drug, rimantadine, improved spectral quality by stabilizing a segment postulated to contain a binding site for the drug, although there is controversy as to whether or not this binding site is pharmacologically relevant [358-360]. Backbone resonance assignments were completed using TROSY-based pulse sequences on an 85%-perdeuterated protein. Relaxation properties were sufficiently favorable to also allow side chain assignments to be made. The tetrameric pH 7.5 closed-state structure was determined using intra-chain NOEs (230), inter-subunit methyl NOEs (20), side chain dihedral restraints from 3J couplings (23), and 1H-15N residual dipolar couplings (27). The location of bound rimantadine was determined by measurement of 7 drug/channel NOEs. Shown in Fig. 8A is the structure that was determined for the tetrameric M2 channel from NMR data, which was constructed by the juxtaposition of the two helical segments, oriented with respect to each other within the monomeric unit by interpretation of the measured residual dipolar couplings.
The coil connecting the two helical elements was not defined by NMR-based structural restraints and is representative of a typical connecting structure, although the criteria used to spatially position the amphipathic segment with respect to the transmembrane segment were ambiguous. By using NMR to carefully examine changes in M2 protein dynamics that occur as the pH is lowered, a mechanism for acid-induced activation of the channel was proposed. This model is largely consistent with conclusions derived from crystal structures of M2 determined at acidic and neutral pH, which were illuminated by a subsequent molecular dynamics study [359, 361].
PagP is an 18 kDa monomeric enzyme that catalyzes the transfer of a palmitoyl chain from a phospholipid to lipid A in the outer membrane of Gram-negative bacteria. Studies of this protein by the Kay lab warrant distinction as the first β-barrel membrane protein for which an NMR structure was determined before an X-ray crystal structure was available. Moreover, PagP represents a case where the results of NMR studies were combined with complementary insight from a later X-ray crystal structure to generate a compelling hypothesis regarding the role of protein dynamics in the binding of lipid substrate to the enzyme. The initial NMR structure of PagP was determined in DPC micelles at 45°C, conditions in which the enzyme does not exhibit catalytic activity [283, 293]. Long range backbone amide-amide NOEs, backbone chemical shifts, and backbone J-couplings were employed for structural calculations. A crystal structure was subsequently determined for this enzyme in LDAO micelles that largely confirmed the NMR structure, but also showed a bound detergent in the central cavity of the β-barrel, suggesting that the active site of PagP is associated with this cavity. However, it was not clear from either of the original structures how lipid could enter the barrel from the surrounding membrane. This led to the hypothesis that PagP is inhibited under the conditions of the NMR experiment due to the occupancy of DPC in this cavity, which effectively acts as a competitive inhibitor of the enzyme. This prompted a search for a detergent that permits the catalytic activity of PagP to be maintained under NMR conditions, which led to the use of CYFOS-7 [293, 363]. This detergent has the same headgroup as DPC, but has a bulky cyclohexyl group in its tail that is apparently too large to enter the 8-stranded barrel of PagP. Gratifyingly, PagP was shown to be active in CYFOS-7.
Elegant magnetization transfer and CPMG-based relaxation dispersion experiments were then used to demonstrate that PagP in CYFOS-7 can populate two interchanging conformational states, a relatively dynamic “R” state favored at higher temperatures resembling the structure observed in DPC and a more rigid “T” state conformation populated at lower temperatures [293, 363]. The increased dynamics of the R state are not structurally uniform but are localized to a large extracellular loop thought to be involved in the active site and adjacent segments of the β-barrel, suggesting that this part of the barrel serves as a dynamic portal for lipid entry.
Recent structural studies of larger β-barrel proteins serve to illustrate the NMR-approachable size and complexity of membrane proteins in micelles. Each of these ca. 280 residue proteins exists as a monomer in its respective detergent micelles. Outer membrane protein G, OmpG, is a porin that contains 14 β-strands, the structure of which was determined by solution NMR in DPC. Using a suite of TROSY-based triple resonance experiments, backbone resonance assignments were completed for 234 of OmpG's 280 residues, with an additional 9 residues being partially assigned. Similar to most other β-barrel proteins, a large number of NOEs could be obtained between backbone 1HN-1HN sites (137 sequential, 46 medium-range, and 133 long-range), which served as the primary source of global structural restraints.
Two NMR structures were recently determined for the 19 β-strand human voltage-dependent anion channel VDAC-1 in LDAO micelles. The VDAC structure determined by the Wagner laboratory employed sophisticated partial deuteration strategies and utilized long range amide-amide (131), methyl-methyl (56), and amide-methyl (85) 1H-1H NOE contacts as the primary source of global structural restraints. Noteworthy in this study was the application of non-uniform sampling to facilitate the acquisition of two 4-dimensional NOESY experiments. In order to attain sufficient signal-to-noise to observe long-range NOE contacts, experiments were conducted using both perdeuterated protein and perdeuterated detergent. Resonance assignments for 80% of the backbone resonances were obtained through the combined use of TROSY-based triple resonance experiments and selective labeling schemes. The structure of this large eukaryotic β-barrel porin is very different from the more than 30 previous prokaryotic β-barrels of known structure in that VDAC-1 has an odd number of β-strands, with the N-terminus looping back through the pore to place both termini on the same side of the membrane. Binding events could be localized in the determined structure through the observation of chemical shift perturbations. Ligands titrated into VDAC samples included cholesterol, although the very limited solubility of cholesterol prevented a complete titration to determine whether spectral perturbations reflected a saturable binding event or non-specific interactions. The interactions of the anti-apoptotic protein Bcl-xL with VDAC were also characterized by chemical shift perturbations, transferred cross-saturation NMR methods, and non-NMR methods. This work serves to shed light on some of the protein-protein interactions occurring in the mitochondrial membranes and in association with apoptosis.
In a parallel structure determination of VDAC-1, a combined NMR/X-ray crystallography approach was utilized. VDAC-1 was incorporated in LDAO micelles and a set of TROSY-based triple resonance experiments (HNCA, HNCO, HNCOCA, and NOESY) were combined with amino-acid selective labeling to facilitate backbone resonance assignment for 192 of the 282 sites. To aid in the definition of the protein topology, a series of cysteine mutations were made. To each cysteine site, an MTSL tag was attached and chemical shift perturbations and PREs to surrounding residues were measured. Structure calculation employed NOEs (including 65 intra-strand NOEs), chemical shift-derived TALOS and SHIFTOR [298, 364] dihedral predictions, and PREs to generate an initial model that was refined against a 4 Å resolution X-ray data set acquired on VDAC-1 in the detergent Cymal-5. Refinement was carried out using BUSTER-TNT. To aid in the spatial definition of the N-terminus, additional distance restraints were derived from mutation-induced chemical shift perturbations and from PRE broadening of N-terminal resonances after introduction of spin labels at sequentially-distal sites on the β-barrel.
Fig. 8B (left and center) shows the ribbon diagrams of the two NMR-derived structures determined for VDAC in LDAO micelles. The recently determined 2.3 Å resolution crystal structure of mouse VDAC-1 in bicelles correlated well with the structures determined by solution NMR data, reaffirming the unique topological features of this eukaryotic porin. The overlay of the two NMR-based structures with the backbone trace of the mouse VDAC-1 X-ray structure is shown in Fig. 8B (right).
While NMR studies of β-barrel proteins have historically outpaced α-helical membrane protein structures, significant progress has recently been made to increase the size and topological complexity of helical membrane proteins that can be successfully studied by solution NMR [150, 220, 356]. A particularly noteworthy accomplishment is the determination by the Bushweller lab of the structure of the integral membrane enzyme DsbB in DPC micelles. DsbB is a monomeric α-helical membrane enzyme containing four transmembrane segments that is involved in the bacterial periplasmic disulfide formation system. A 3.7 Å crystal structure of this protein was previously determined in a complex with one of the other proteins in the system, DsbA, and with its ubiquinone co-factor. DsbB is responsible for mediating the formation of disulfide bonds in periplasmic proteins and then transferring the reducing potentials to ubiquinone in the membrane. The disulfide bond exchange cascade is believed to be mediated by four cysteines in DsbB: C41, C44, C104 and C130, proceeding through a concerted series of disulfide bond rearrangements. The evidence that the intermediate formation of a disulfide bond between C41 and C130 represents a significant step in the reaction cycle of DsbB, coupled with the dramatic spectral improvement observed for the DsbB[CSSC] mutant (C44S, C104S) relative to the spectrum from wild type, motivated a focus on this mutant form for a detailed structural study. Using a suite of TROSY-based triple resonance experiments and uniform-2H/15N/13C-labeling, 98% of the backbone resonance assignments were completed. Methyl resonance assignments of the Ile-δ1-[13CH3] and Val, Leu-[13CH3,13CH3] labeled DsbB were completed using a HMCM[CG]CBCA experiment and a 3D [13C-F1,13C-F2] edited NOESY.
The notable success of the DsbB study derives in part from the approach of leveraging structural restraints from a number of sources: traditional backbone 1HN-1HN NOEs, methyl NOEs derived from an ILV methyl-protonated sample, RDCs, and PREs. NOEs were recorded using a 3D 15N,13C-edited NOESY experiment, providing sequential (191), medium-range (216), and long-range (39) contacts. Residual dipolar couplings were collected in a compressed charged polyacrylamide gel using a TROSY-HNCO experiment for the one-bond HN (114), N-CO (109), and CO-Cα (114) couplings. PRE measurements were made using a series of cysteine mutants that were modified with a paramagnetic MTSL tag or its diamagnetic analogue in the presence of unmodified wild type residues C41 and C130. The nine spin-labeled samples yielded an additional 871 upper-bound and 273 lower-bound restraints. Comparison of the NMR structure of DsbB with the structure determined by X-ray crystallography revealed high similarity in overall structure (Fig. 8C (left and center)). However, the NMR structure filled in portions of the structure that were ill-defined in the crystal structure. Specifically, the NMR structure defined the conformations and packing of the N-terminus and the loop between the third and the fourth transmembrane helices, which failed to yield adequate electron density maps. The NMR studies of DsbB were also carried out in a mechanistically savvy manner, which led to considerable additional insight into the structural biophysical basis for the function of this redox shuttle.
The KcsA potassium channel represents a target for solution NMR studies that may well be near the limit of feasibility for total structural studies based on currently available magnet and pulse sequence technology. Each subunit of this 70 kDa tetrameric channel contains 160 residues, including two transmembrane helices connected by a segment that contains a “pore helix” that extends partway into the membrane and is followed by an extended strand adjacent to the 4-fold symmetry axis in the tetramer that loops back out of the membrane to compose the ion selectivity filter of the channel. Because of its high stability (even in harsh detergents) and the availability of both high level E. coli expression systems and crystal structures, the KcsA potassium channel has provided a venue for exploring the application of solution and solid-state NMR methodologies to large helical membrane proteins [19, 169, 170, 187, 258, 269, 282, 327, 368-370].
Early work focused on a truncated (residues 1-132) hexa-mutant form of KcsA that was engineered to bind charybdotoxin. Studies were carried out in DPC micelles. While the authors despaired of making backbone assignments, a variety of labeling schemes (including selective ILV methyl protonation) and brute force mutagenesis led to assignment of 73% of the I/L/V/A/M side chain methyl groups and 80% of the aromatic side chains for Trp and Tyr. This paved the way to measurement of a number of side chain/side chain NOEs and side chain/toxin NOEs, which were used to calculate structures for the channel tetramer-toxin complex, calculations that appear to have drawn somewhat on the crystal structure of KcsA to complement the use of the NMR restraints. In parallel work, the KcsA and agitoxin-2 interactions were studied in dodecylmaltoside micelles. Though no assignments of KcsA resonances were reported, the channel binding interface on the toxin was mapped via cross-saturation between unlabeled KcsA and amide protons located at the channel/toxin binding surface on the U-2H,15N-labeled toxin.
The early studies of KcsA in DPC alluded to above were extended by the Riek lab, who tackled a toxin-binding mutant form of the full length protein. At 37°C, the correlation time was 60 ns, which corresponds to an aggregate tetramer/micelle mass of 130 kDa. Using TROSY-based methods and perdeuterated samples, 85% of backbone resonance assignments were completed for the protein at both pH 4 and pH 7, believed to represent open and closed channel forms, respectively. Protein/detergent NOEs and NOESY exchange peaks were examined and relaxation measurements were carried out. These data allowed qualitative comparison of the conformations and dynamics of the two forms of the protein and characterization of exchange between the two states. Moreover, using site-specific labeling (including introduction of fluoro-tyrosine, which was then detected by 19F NMR) they were able to focus on specific residues in the selectivity filter and gate regions, leading to proposals regarding the structural and dynamic elements underlying channel gating and permeation. The Shimada lab has carried out a similar but less extensive mechanistic study of the pH dependency of KcsA(1-125) in dodecylmaltoside micelles at 50°C. In their studies, TROSY/HSQC resonance assignments were completed for Trp, Lys, and His residues using difference spectroscopy in conjunction with a series of single-site mutants.
The Bax lab has carried out extensive studies of KcsA(16-160) in SDS micelles at 50°C [170, 269, 282]. KcsA purified in SDS was shown to exist in a lipid-stabilized (and perhaps kinetically-trapped) tetrameric structural state that maintains the key topological elements of the native structure. However, when the protein is purified under harsher conditions and fully stripped of native lipid, the tetramer dissociates and then can exist as a stable monomer in SDS micelles. As a monomer, it was found that the protein is associated with about 60 SDS molecules (aggregate mass of 35 kDa), while the tetrameric complex contained 45 SDS per subunit (aggregate mass of ca. 115 kDa, correlation time = 40 ns). Remarkably, for both the tetramer at pH 6.0 and 8.0 and for the monomer at pH 6.0 and 4.2, 95% or more of the backbone resonances were assigned. This structural characterization by solution NMR validated an earlier EPR study, which had suggested the existence of one or more C-terminal helices in KcsA, which are not observed in the crystal structures, as well as revealing a previously undetected helix at the N-terminus. For the monomeric form of KcsA, numerous RDCs of various types were measured and it was found that the pH-dependent changes that occur for the tetrameric channel are also observed for the monomeric channel. For both the monomeric and tetrameric channel, elegant relaxation studies were carried out using a novel 3D TROSY-HNCO-based approach that is well-suited for application to large and particularly difficult spectroscopic targets.
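The aggregate masses quoted above can be checked with simple bookkeeping: protein mass plus bound detergent mass, multiplied by the number of subunits. The Python sketch below is a rough consistency check only. It assumes an average residue mass of 0.110 kDa (a common rule of thumb, not a value from the source, and one that ignores isotopic labeling), takes the construct length as residues 16-160, and uses the molar mass of SDS (about 288.4 g/mol). The function and constant names are hypothetical.

```python
SDS_KDA = 0.28838         # molar mass of sodium dodecyl sulfate, in kDa
MEAN_RESIDUE_KDA = 0.110  # ASSUMED average residue mass; not from the source


def aggregate_mass_kda(n_residues, n_subunits, detergent_per_subunit,
                       detergent_kda=SDS_KDA, residue_kda=MEAN_RESIDUE_KDA):
    """Estimate the mass of a protein/detergent complex in kDa."""
    protein = n_residues * residue_kda                # one subunit
    detergent = detergent_per_subunit * detergent_kda  # detergent on one subunit
    return n_subunits * (protein + detergent)


N_RES = 160 - 16 + 1  # KcsA(16-160) spans 145 residues

monomer = aggregate_mass_kda(N_RES, n_subunits=1, detergent_per_subunit=60)
tetramer = aggregate_mass_kda(N_RES, n_subunits=4, detergent_per_subunit=45)

print(f"monomer + 60 SDS:       {monomer:.1f} kDa")
print(f"tetramer + 45 SDS/unit: {tetramer:.1f} kDa")
```

Under these assumptions the estimates land close to the 35 kDa and ca. 115 kDa figures reported in the text, which suggests the quoted detergent counts and aggregate masses are mutually consistent.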
Studies of KcsA using solution NMR methods recently took a radically novel direction in the lab of Yan Xu. Building on the pre-existing structure of KcsA determined by X-ray crystallography, a series of water-soluble analogues of KcsA were designed by replacing lipid-exposed residues to enhance water solubility while preserving the native fold and oligomeric state. Of the three water-soluble KcsA mutants tested, a construct containing 33 mutations within the membrane-spanning segments and truncations at both N- and C-termini was observed to maintain KcsA in a state that retains its tetrameric structure, the selectivity filter, and affinity for known ligands of KcsA [327, 372, 373]. For this water-soluble KcsA (WSK-3), backbone resonance assignments were completed using conventional triple resonance experiments and the structure was determined, principally from NOEs: intraresidue (403), sequential (299), medium-range (204), long-range (346), and inter-subunit (69). The backbone dynamics of WSK-3 were also characterized in some detail and it was observed that the internal structural order of the water-soluble form of KcsA is much lower than for KcsA in micelles or membranes.
Though the detergent/lipid-free structure determined by NMR for the WSK-3 mutant of KcsA exhibits some notable differences from the structure determined by X-ray crystallography (Fig. 8D (left and center)), the overall similarity is remarkable. While creating soluble analogues for structural studies of membrane proteins may not be generally applicable, this work represents an impressive demonstration of its potential.
Diacylglycerol kinase is a veteran target of membrane protein structure determination [123, 199, 241, 374-378] for which no X-ray crystal structure is available. Because of its readily accessible enzymatic activity, DAGK has served as a proving ground for extensively screening potential membrane mimetics for use in NMR and biochemical studies such as organic solvents, detergents, bicelles, and amphipathic polymers [12, 123, 145, 146, 174, 379, 380]. DAGK has also served as a model system for exploring membrane protein folding and membrane enzyme catalysis [123, 380-386]. DAGK has been utilized as a vehicle for novel methods development [248, 387] and for mutagenesis-based enhancement of membrane protein stability [388-390]. Similar in topological complexity to DsbB, each subunit of DAGK contains five α-helical elements: three transmembrane and two amphipathic. However, structural determination for DAGK is significantly complicated by the fact that the protein is folded and active only as a 41 kDa homotrimer.
The structure of DAGK in a ca. 100 kDa complex with DPC micelles has been determined using solution NMR from chemical shifts, intra-residue and sequential 1H-1H NOEs, 1H-15N RDCs, and PREs. These measurements were also supplemented with a dozen inter-subunit distance restraints derived from disulfide mapping measurements. These latter measurements were found to be necessary for defining a precise arrangement of transmembrane helices in the quaternary structure. DAGK exhibits a domain-swapped structure that features a portico-like lipid substrate binding site that can accommodate diacylglycerol or phosphatidic acid, but that excludes non-substrate lipids with larger head groups. Mutations at many sites in or near the active site portico were observed to lead not only to loss of catalytic function, but also in many cases to severe misfolding, indicating that the key determinants for folding of DAGK are highly overlapped with the key determinants of catalysis. The unique architecture observed for DAGK reveals that this protein is structurally unlike any other known kinase.
In the last two years leading up to this review there has been remarkable progress in the determination of G protein-coupled receptor structures using X-ray crystallography [392, 393]. However, the vast majority of receptors, hundreds in the human genome alone, have yet to yield high resolution structures. Moreover, most of the GPCR structures determined to date represent the inactive state (signaling-off) of the receptor, such that there is still no high resolution knowledge of the structural changes that occur upon receptor activation. While the contributions of solution NMR to the structural understanding of GPCRs are not yet extensive, the fact that GPCRs are similar in size and structural complexity to DAGK and KcsA suggests a bright future for the role of solution NMR in studies of GPCRs.
While the solution NMR results for bona fide GPCRs are relatively modest (see below), a recent study has reported 98% backbone assignments, secondary structural and dynamic analysis of a bacterial phototaxis sensory rhodopsin, pSRII in DHPC micelles. Though pSRII is in the category of microbial rhodopsins that are thought to be evolutionarily unrelated to mammalian GPCRs, with 241 residues and seven TM helices, it is possible to regard pSRII as a GPCR structural analogue. Studies of pSRII extend pioneering early attempts to characterize bacteriorhodopsin using solution NMR [217, 395-397].
The ability to observe the flash-photolytic properties of pSRII confirmed that the DHPC micelles used in this study maintained the protein in a native-like conformation. It is remarkable that pSRII produces a clearly resolved spectrum with sharp homogeneous resonances observable from all domains of the protein (Fig. 9A). There is little suggestion of conformational heterogeneity and/or exchange. The pSRII studies relied on TROSY-based experiments. NOESY measurements allowed for observation of the characteristic NOE patterns of the secondary structural elements, in strong agreement with the α-helices and the β-sheets previously observed in the crystal structure. A rotational correlation time of 21 ns, corresponding to an ensemble size of 50-70 kDa, was measured from the T1/T2 ratio. While the aggregate molecular weight is smaller than will be typically observed for a detergent-solubilized GPCR, the spectral results clearly demonstrate that when it is possible to prepare a conformationally-homogeneous sample of a micellar protein with the same topology as a GPCR, significant NMR spectroscopic characterization can be made.
The impressive results for pSRII complement the DAGK, DsbB, and KcsA success stories and also the completion of assignments for a functional 7 transmembrane helical transporter domain, TehA, and appear to offer a possible glimpse into the future of what the solution NMR studies of GPCRs may hold.
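As an illustration of how a rotational correlation time can be obtained from a 15N T1/T2 ratio, the sketch below implements a widely used approximation for slowly tumbling, rigid macromolecules: τc ≈ sqrt(6·T1/T2 − 7) / (4π·νN), where νN is the 15N resonance frequency in Hz. This is not necessarily the procedure used in the pSRII study, which the text does not specify. The approximation neglects internal motion and chemical exchange, and the T1, T2, and field values in the example are hypothetical inputs chosen only to exercise the function.

```python
import math


def tau_c_from_t1_t2(t1_s, t2_s, nu_n_hz):
    """Estimate the rotational correlation time (s) from 15N T1 and T2.

    Uses the slow-tumbling approximation
        tau_c ~ sqrt(6 * T1/T2 - 7) / (4 * pi * nu_N),
    which assumes a rigid amide with no exchange contribution to T2.
    """
    ratio = t1_s / t2_s
    radicand = 6.0 * ratio - 7.0
    if radicand <= 0.0:
        raise ValueError("T1/T2 is too small for the slow-tumbling approximation")
    return math.sqrt(radicand) / (4.0 * math.pi * nu_n_hz)


# HYPOTHETICAL inputs: T1 = 2.0 s, T2 = 30 ms, 15N frequency of 60.8 MHz
# (roughly a 600 MHz 1H spectrometer). These are not values from the source.
tau = tau_c_from_t1_t2(t1_s=2.0, t2_s=0.030, nu_n_hz=60.8e6)
print(f"estimated tau_c = {tau * 1e9:.1f} ns")
```

In practice the ratio would be averaged over residues in well-ordered regions, since mobile termini and exchange-broadened sites bias the estimate.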
Previous solution NMR-based studies of actual G protein-coupled receptors have most often involved fragments designed to mimic the conformation of the corresponding segments within the intact receptor [95, 398-403]. The tendency of some receptor sub-domains to adopt stable secondary or even tertiary structure can make characterization of these domains and their ligand binding properties a feasible alternative to studies of the intact receptor [400, 404-406].
For intact receptors, solution NMR has been used to study ligand binding [262, 407-411]. Changes in transferred-NOE patterns of small molecule agonists and antagonists induced by conformational re-arrangement upon binding and also receptor-to-ligand saturation transfer difference spectroscopy have been used to assess bound ligand conformations and to map the receptor-ligand binding interface.
Preliminary 1H NMR studies of the CC chemokine receptor 5 (CCR5) and the thromboxane A2 receptor have been reported under conditions in which these receptors are functional [412-414]. The receptors were expressed in eukaryotic systems and shown to be functional after purification. Lack of isotopic labeling limited NMR spectroscopic observation of these receptors to 1H-only NMR.
In conjunction with mutagenesis and chemical modification, rhodopsin has been double-site-specifically labeled with 19F, followed by 19F-19F NOE measurements to determine if pairs of fluorine probes were proximal in various signaling states.
In all likelihood, the greatest potential for future characterization of GPCRs by NMR will draw on the ability to utilize or improve existing NMR techniques so as to be able to simultaneously observe and correlate the vast majority of sites in uniformly labeled receptor samples. To date, NMR studies of uniformly-labeled GPCRs have been characterized by spectra that exhibit only a modest fraction of the expected number of resonances. Shown in Fig. 9B are the resonances observed for a 380 residue GPCR, the human Kappa Opioid Receptor Type 1, in LMPC micelles showing only a sparse set of resonances arising principally from the C-terminal segment. In these cases, the observed resonances generally appear to arise from segments of the protein undergoing motions independent of the overall protein/detergent ensemble, such as termini and large, mobile interhelical loops [23, 79, 413, 416, 417]. Observation of backbone resonances from the transmembrane segments of GPCRs has proven to be a significant challenge for solution NMR due to extensive line broadening, complicated by the overlay of very intense signals from mobile segments.
The most extensive solution NMR study of a labeled GPCR to date is found in studies by Klein-Seetharaman and co-workers of rhodopsin in dodecylmaltoside (DDM) micelles, following biosynthetic labeling and purification from HEK293S cells. DDM was chosen because it has previously been determined to be a detergent capable of maintaining native-like function and stability for solubilized rhodopsin [79, 413, 416, 417]. In an initial study focusing on lysine backbone-15N-labeled bovine rhodopsin, only a single, well-defined resonance from the 11 sites was observed, which arose from the final lysine near the C-terminus. Though two additional lysine sites are found at the N- and C-termini and all but one site are predicted to be at interfacial regions of the protein, the only spectroscopic evidence of these remaining sites was found in poorly-defined resonances observable at elevated temperatures and in the presence of SDS, conditions believed to result in partial unfolding of the protein. The absence of the other lysine backbone resonances was attributed to intermediate time scale conformational dynamics, resulting in extensive line broadening. A second study of rhodopsin in DDM micelles, utilizing selective 15N-labeling of tryptophan, permitted observation of all five indole side chains despite the fact that four of the sites are located in transmembrane segments. The ability to observe these sites was interpreted to indicate that the indole side chains all adopt single well-ordered conformations. In contrast to the side chain sites, the backbone amides of tryptophan residues produced more resonances than discrete sites in the protein, suggesting the presence of multiple backbone conformations and backbone motions which are more extensive than for the corresponding side chain sites.
A third study by Klein-Seetharaman et al. involved rhodopsin labeled using two labeling schemes to encompass 49% of rhodopsin's residues: 15N-GKLQSTVW and 15N/13C-GKLQSTV(W). The HSQC spectrum shows a very intense and sharp set of peaks superimposed on a mass of broad and largely unresolved resonances. The sharp peaks were partially assigned and found to arise from the C-terminus, which was confirmed by relaxation measurements to be very mobile. Because the authors found that the use of TROSY relative to HSQC failed to generate an improvement in the appearance of the poorly resolved spectrum arising from most domains of the receptor, the very broad line widths most likely do not reflect the high overall molecular weight. It is more likely that the peaks are broadened by intermediate time scale conformational motions and/or heterogeneity, phenomena which cannot be obviated by TROSY. We suggest that the fine work of Klein-Seetharaman and co-workers points to a key issue confronting future solution NMR studies of GPCRs: is the extensive line-broadening seen in spectra from micellar GPCRs a reflection of non-native properties that are a consequence of purifying receptors out of their native membrane environment and into micelles or do the undesirable spectral properties reflect native-like receptor dynamics, which may be very different from the dynamics observed for most other functional proteins? In either case, how can such dynamics/heterogeneity be avoided and/or managed to enable useful solution NMR characterization?
The application of solution NMR techniques to α-helical membrane proteins is progressing at an accelerating rate. As the methods for preparing adequate quantities of membrane proteins continue to improve along with methodologies for solubilizing IMPs in membrane mimetics that preserve their native folds, solution NMR methods are beginning to chalk up impressive accomplishments in the structural biology of these proteins. These accomplishments have leveraged the availability of very high field NMR magnets, the development of TROSY-class pulse sequences, and sophisticated isotopic labeling protocols. The effective size limit of a system that can be structurally characterized by NMR is now above 100 kDa. As of mid-2009, nearly complete resonance assignments for a monomeric helical membrane protein with seven transmembrane segments have been attained and completion of the associated structure seems inevitable. This suggests a favorable projection for the contributions that solution NMR can be expected to make over the next few years to membrane protein structural biology, a point reinforced by the fact that a majority (75%) of all membrane protein-encoding ORFs have 7 transmembrane segments or fewer.
This work was supported by NIH grants RO1 GM47485, GM81816, and DC007416 (CRS), KOPRI PE08060 and PE09070 (HJK), the K-Mep Research Program T28031 (YHJ), and the 21C Frontier Microbial Genomics and Applications Center M102KK010013-01310 (YHJ).