We have determined the structure of the human integrin α1I domain bound to a triple-helical collagen peptide. The structure of the α1I-peptide complex was investigated using data from NMR, small angle x-ray scattering, and size exclusion chromatography that were used to generate and validate a model of the complex using the data-driven docking program, HADDOCK (High Ambiguity Driven Biomolecular Docking). The structure revealed that the α1I domain undergoes a major conformational change upon binding of the collagen peptide. This involves a large movement in the C-terminal helix of the αI domain that has been suggested to be the mechanism by which signals are propagated in the intact integrin receptor. The structure suggests a basis for the different binding selectivity observed for the α1I and α2I domains. Mutational data identify residues that contribute to the conformational change observed. Furthermore, small angle x-ray scattering data suggest that at low collagen peptide concentrations the complex exists in equilibrium between a 1:1 and 2:1 α1I-peptide complex.
Integrins comprise a family of non-covalently associated heterodimeric cell surface receptors containing an α subunit and a β subunit. They mediate interactions between individual cells as well as interactions between cells and the extracellular matrix (ECM). These receptors act as mediators that transmit bidirectional signals across the cell membrane. In outside-in signaling, signals are triggered by the binding of ECM ligands to the extracellular domain of the integrin and are propagated through the receptor to reach the intracellular domain and generate cellular responses. In contrast, inside-out signaling results from the binding of cytosolic molecules to the cytoplasmic domain of the integrin receptor, which can regulate the binding affinity of the receptors for ligands in the ECM (1).
Four of the 24 integrins are characterized by their ability to bind to collagen and are categorized as the collagen-binding integrins. All four collagen-binding integrins contain a β1 subunit that is in complex with an α1, α2, α10, or α11 subunit to form the α1β1, α2β1, α10β1, and α11β1 receptors. These four collagen-binding integrins vary in terms of their tissue distribution and cellular signaling pathways, but they share a common role of mechanically supporting cell adhesion to collagen in the ECM and maintaining tissue integrity (2). Of these, α1β1 and α2β1 are the best characterized and have been reported to play important roles in cellular processes such as angiogenesis, the regulation of collagen expression by fibroblasts, T-cell activation, and collagen-induced platelet aggregation (2–4). They have also been implicated in disease processes including inflammation as well as metastasis in certain cancers, making them attractive therapeutic targets (4–7).
All four collagen-binding integrins contain an inserted domain called the αI domain (αI), comprising ~200 residues, which is located in the extracellular region of the α subunit. The αI domains contain the major binding site for ECM ligands of these receptors, and recombinant isolated αI domains recapitulate many of the ligand binding properties of the intact integrins (8, 9). This makes the αI domains suitable models for the study of specific interactions between integrins and ECM ligands. All αI domains assume a Rossmann fold, which is composed of five parallel β-strands and one antiparallel β-strand surrounded by six α-helices. The αI domain coordinates a divalent metal ion at a site termed the metal ion-dependent adhesion site (MIDAS), which is part of the collagen-binding site. The metal-coordinating residues of the MIDAS are located in three loops on one surface of the αI domain (10, 11).
The four collagen-binding integrins bind differentially to different types of collagen. For example, α1β1 has a higher binding affinity for collagen type IV than for collagen type I, but α2β1 preferentially binds collagen type I over collagen type IV (12–14). Synthetic peptides that mimic specific collagen sequences and contain repeats of the amino acid motif GXX have been used widely in the study of integrin-collagen interactions. These peptides spontaneously self-assemble to adopt the triple-helical conformation that is found in native collagen. As these peptides form homotrimers, they do not recapitulate the heterotrimeric structure observed in some collagens. Nonetheless, they have proved to be extremely valuable tools to probe the binding specificity and structure of collagen-binding proteins (15–18). Using peptides such as these, several specific recognition motifs have been identified for α1β1 and α2β1 integrins (18–21). The recognition motifs are generally six-residue sequences flanked on either side by several repeats of GPO (glycine-proline-hydroxyproline). They include GFOGER (15), which is found in collagen type I, as well as GLOGEN and GROGER from collagen type III (18, 21).
A crystal structure has been reported previously for the α2I domain (α2I) in complex with a triple-helical peptide containing the GFOGER motif (22). Comparison of the liganded and unliganded α2I structures revealed that significant conformational changes occur on binding (22). This structure provided the first insights into the activation mechanism of collagen-binding αI domains. The major structural changes observed on ligand binding include 1) a rearrangement at the MIDAS that was caused by a change in the metal ion coordination involving a glutamate residue from the collagen peptide (GFOGER); 2) a 10-Å downward shift of the C-terminal helix (helix 7), which was proposed to form an interaction with the β1 subunit of the receptor and propagate the signal toward the cytoplasmic domain (22–26); and 3) an uncoiling of a one-turn-and-a-half helix (C helix), which is located above the MIDAS in the unliganded state. The uncoiling was proposed to open up the MIDAS to allow the collagen-mimetic peptide to bind. Ligand binding to α2I was accompanied by the breakage of a salt bridge between Arg288, which is located in the C helix, and residue Glu318 from helix 7. Loss of this salt bridge was proposed to be partly responsible for the uncoiling of the C helix that accompanies peptide binding. A single point mutation at Glu318 (E318W) showed increased binding affinity for collagen, suggesting the involvement of this conformational change in α2I activation (27, 28).
Because of their sequence and structural homology and their shared ability to bind to collagen, α1I and α2I domains have been proposed to use a similar activation mechanism to transmit signals. However, there have been no structures of any α1I-peptide complexes to confirm this hypothesis. The structure of the unliganded form of α1I reveals a salt bridge between residues Arg287 and Glu317, and the structure of α1I containing the mutation E317A was reported in 2011 (29). The E317A mutant showed enhanced binding affinity for collagen as had previously been observed following mutation of the similar residue in α2I (E318W), although the structure was different from that of α2I in complex with GFOGER. The C helix in α1I E317A uncoiled as observed in the α2I-GFOGER complex, but the displacement of helix 7 did not occur, and the metal ion at the MIDAS of α1I E317A was pentacoordinated, whereas the MIDAS in α2I-GFOGER was hexacoordinated. More recently, analysis of the solution structure of E317A using NMR (30) showed evidence of significant conformational change in the mutant. In addition, the E317A mutation led to increased binding to collagen and increased activation of the intact α1β1 receptor. In contrast, the mutation R287A produced less pronounced effects on collagen binding and receptor activation, whereas a charge-reversed mutant, R287E/E317R, displayed a phenotype more similar to E317A. This led the authors to propose that 1) the role of Glu317 in stabilizing the low affinity conformation of α1I is largely independent of the salt bridge with Arg287 and may result from interactions with helix dipoles in the structure and 2) mutation of Glu317 may be sufficient to result in displacement of helix 7 in contrast to the crystal structure (29). However, the NMR data demonstrated that the protein containing the E317A mutation was conformationally flexible, and it is possible that the crystal structure represents one state of E317A.
Here, we report the solution structure of human α1I domain in complex with a collagen-mimetic peptide. The triple-helical peptide contained a GLOGEN recognition sequence, which is a high affinity ligand of α1I (18). The binding orientation of the peptide is different from that observed for GFOGER binding to α2I, and a close comparison of the interfaces reveals differences that may account for their collagen binding preferences. Despite the difference in ligand binding orientations, the signaling mechanisms appear to be conserved. GLOGEN binding to α1I resulted in significant conformation changes including the uncoiling of the C helix, displacement of helix 7, and breakage of the Arg287-Glu317 salt bridge, consistent with what was observed in α2I domain upon binding. The structure of the complex revealed the formation of a new salt bridge between Glu317 and Arg171 from helix 1. However, R171A mutation had little effect on the structure of either the unliganded or the peptide-bound state from which we infer that the Glu317-Arg171 salt bridge is not crucial for the binding of collagen or α1I activation. In contrast, mutation R287A or E317A produced significant conformational change in the α1I domain. Analysis of the NMR data provided strong evidence that helix 7 was displaced in both of these mutants, suggesting that the salt bridge observed in the unliganded structure is important in maintaining the low affinity conformation. However, the two mutants showed different patterns of chemical shift perturbations in NMR spectra for residues at the collagen-binding site, suggesting that they have differential effects on the structure of binding site. These differences may explain the different effects of these mutations on the collagen binding affinity observed. During the course of the investigation, we also identified and characterized a dimeric complex of α1I-GLOGEN that supports the binding pose determined for the monomer. 
These data together provided a more complete structural profile of the collagen-bound α1I domain that advances our understanding of signaling and specificity in collagen-binding integrins.
Uniformly 15N-labeled, 15N,13C-labeled, and 2H,15N,13C-labeled human α1I was prepared as described previously (31). For expression of unlabeled α1I, the same protocol was followed except Luria broth (Sigma-Aldrich) was used instead of isotopically labeled minimal medium, and expression was initiated by induction with isopropyl 1-thio-β-D-galactopyranoside (Astral Scientific, Australia; 1 mM) at an A600 of 0.6.
A collagen peptide containing the sequence Ac-GPOGPOGLOGENGPOGPOGPO-NH2 was synthesized and purified as described previously (32) except that Fmoc (N-(9-fluorenyl)methoxycarbonyl) Rink-amide resin (Chem-Impex International) was used. The peptide is referred to hereafter as GLOGEN.
NMR experiments were conducted at 298 K on a Bruker Avance 800 spectrometer equipped with a cryoprobe. The sample contained 15N,13C-labeled α1I (1.2 mM) and unlabeled GLOGEN peptide (2.4 mM) in the NMR buffer (50 mM HEPES, pH 7.4, 50 mM NaCl, 5 mM MgCl2, and 10% 2H2O). The chemical shifts were referenced according to the method described by Wishart et al. (33) in which the 1H chemical shifts were referenced to the water peak, whereas the 15N and 13C chemical shifts were referenced by the 15N/1H and 13C/1H gyromagnetic ratios. For the backbone resonance assignments, two-dimensional 1H,15N HSQC, three-dimensional HNCA, three-dimensional HN(CO)CA, three-dimensional HNCACB, and three-dimensional HNCO spectra were acquired. Except where noted otherwise, spectra were processed using Topspin (version 3.0, Bruker-BioSpin™), and the analysis was carried out using SPARKY (34).
To observe GLOGEN binding to α1I, a series of two-dimensional 1H,15N SOFAST HMQC (35) spectra on uniformly 15N-labeled α1I was acquired with stepwise addition of peptide. The samples contained α1I (100 μM) in NMR buffer. GLOGEN was added from a stock solution (10 mM in water) to give final peptide concentrations of 22.5, 45, 90, 180, and 360 μM. Spectra were recorded at 293 K on a Bruker Avance 600-MHz NMR spectrometer equipped with a cryoprobe. Chemical shift perturbations (CSPs) caused by peptide binding were measured from two-dimensional 1H,15N TROSY spectra of uniformly 2H,13C,15N-labeled α1I (400 μM) in NMR buffer in the presence and absence of GLOGEN (800 μM). Spectra were recorded at 293 K on a Varian INOVA 600-MHz NMR spectrometer fitted with a cryogenically cooled probe. CSPs were derived for each assigned residue using the following equation.
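The equation is not reproduced in this text. As a minimal sketch, assuming a commonly used combined amide shift of the form Δδ = sqrt(ΔδH² + (α·ΔδN)²), the per-residue calculation could look like the following. The function name `combined_csp`, the 15N scaling factor `alpha = 0.14`, and the example shifts are assumptions for illustration and may differ from the definition applied in the study.

```python
import math


def combined_csp(d_h_ppm, d_n_ppm, alpha=0.14):
    """Combine 1H and 15N shift changes (ppm) into one perturbation value.

    alpha down-weights the 15N change; 0.14 is an assumed, commonly quoted
    value, not one taken from the study.
    """
    return math.sqrt(d_h_ppm ** 2 + (alpha * d_n_ppm) ** 2)


# Hypothetical shifts (ppm) for one residue without and with peptide.
free = {"H": 8.20, "N": 120.0}
bound = {"H": 8.50, "N": 122.0}
csp = combined_csp(bound["H"] - free["H"], bound["N"] - free["N"])
```

Under these assumed weights, a residue would be compared against a cutoff such as the 0.5 ppm threshold used later for restraint selection.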
1H,15N TROSY spectra on uniformly 15N-labeled R171A, R287A, and E317A α1I mutants were acquired. The samples contained 275 μM R171A, 200 μM R287A, or 100 μM E317A in NMR buffer. To observe the binding of GLOGEN, the peptide was added to a final concentration of 560 μM in the case of R171A or 200 μM in the case of E317A. Spectra were recorded at 293 K on a Bruker Avance 600-MHz NMR spectrometer equipped with a cryoprobe.
T1 and T2 relaxation data were acquired for WT α1I as described by Farrow et al. (36) at 293 K on a Varian INOVA 600-MHz NMR spectrometer equipped with a cryogenically cooled probe. The unliganded sample contained 15N-labeled α1I (400 μM). The peptide-bound sample contained 15N,13C-labeled α1I (1.2 mM) and GLOGEN (2.4 mM). Samples were prepared in the NMR buffer. The relaxation delay was sampled at 0.05, 0.1, 0.2, 0.4, 0.8, 1.2, 1.6, and 2.0 s for the longitudinal relaxation measurement and 0.01, 0.03, 0.05, 0.07, 0.09, 0.11, 0.13, and 0.15 s for the transverse relaxation measurements. A two-dimensional heteronuclear 1H,15N NOE spectrum was acquired with 3 s of weak irradiation at either the center of the amide proton frequency range to generate the heteronuclear NOE or 10,000-Hz off-resonance for the no-NOE control. Spectra were processed using NMRPipe (37), and the signal decay was analyzed and plotted using SPARKY (34). Theoretical T1/T2 ratios for different structural models were generated using the program HYDRONMR (38).
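As an illustration of how a relaxation time can be extracted from peak intensities at the delays listed above, the following minimal sketch fits a single-exponential decay, I(t) = I0·exp(−t/T), by linear least squares on ln(I). It is not the SPARKY fitting routine used in the study. The function names and the synthetic intensities (generated from assumed values T1 = 0.9 s and T2 = 0.06 s) are hypothetical.

```python
import math

# Relaxation delays (s) taken from the text.
T1_DELAYS = [0.05, 0.1, 0.2, 0.4, 0.8, 1.2, 1.6, 2.0]
T2_DELAYS = [0.01, 0.03, 0.05, 0.07, 0.09, 0.11, 0.13, 0.15]


def fit_decay_time(delays, intensities):
    """Fit I(t) = I0 * exp(-t / T) via least squares on ln(I); return T."""
    n = len(delays)
    logs = [math.log(i) for i in intensities]
    mean_t = sum(delays) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in delays)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(delays, logs))
    slope = sxy / sxx  # equals -1 / T for a pure exponential
    return -1.0 / slope


def synthetic_decay(delays, i0, t_relax):
    """Noise-free intensities for an assumed relaxation time (illustration only)."""
    return [i0 * math.exp(-d / t_relax) for d in delays]


t1 = fit_decay_time(T1_DELAYS, synthetic_decay(T1_DELAYS, 1.0e6, 0.9))
t2 = fit_decay_time(T2_DELAYS, synthetic_decay(T2_DELAYS, 1.0e6, 0.06))
t1_over_t2 = t1 / t2
```

With real data, noise at long delays makes a weighted or nonlinear fit preferable; the log-linear form is used here only because it needs no external libraries.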
For the side-chain assignments and generation of structural restraints, three-dimensional HBHA(CBCACO)NH was recorded on a Bruker Avance 500-MHz spectrometer fitted with a cryoprobe, three-dimensional 1H,1H NOESY 15N HSQC and three-dimensional 1H,1H NOESY 13C HSQC (aromatic) experiments were recorded on a Bruker Avance 800-MHz spectrometer fitted with a cryoprobe, and three-dimensional 1H,1H NOESY 13C HSQC (aliphatic) was recorded on a Bruker Avance 900-MHz spectrometer fitted with a cryoprobe. All spectra were recorded at 298 K. For the NOESY spectra, the mixing time was 60 ms, and the 13C-carrier frequency was 28 and 123.5 ppm for the aliphatic and aromatic 13C-edited NOESY spectra, respectively. CARA (Computer-aided Resonance Assignment) was used for spectral analysis. Automated structure calculation was performed using the software package UNIO-ATNOS/CANDID (39, 40) in combination with the torsion angle dynamics program CYANA 3.0 (41) following the protocol described by Serrano et al. (42). Structures were calculated initially in the absence of a Mg2+ ion. The 20 conformers with the lowest target function values were analyzed to determine the site of Mg2+ ion binding. Based on the CSP pattern of α1I on Mg2+ ion binding and the crystal structure of α2I-collagen peptide complex, the Mg2+ was assigned to coordinate with residues Ser152, Ser154, and Thr220 in our α1I structures. Subsequently, a Mg2+ ion was introduced to the structures using a “pseudo-link” consisting of 30 “pseudo-residues” extending from the C terminus. Structures were recalculated in the presence of the Mg2+ ion using UNIO-ATNOS/CANDID as described above with the addition of backbone torsion angle restraints generated by the program TALOS+ (43). In addition, hydrogen bond restraints (two per bond) were added in regions of canonical secondary structure where a unique hydrogen bond donor-acceptor pair was evident from structural convergence. 
The 20 structures having the lowest target function with Mg2+ bound were then refined using Cartesian dynamics in CNS (44) after removal of the pseudo-link to produce the final structures. The 20 lowest energy conformers with no NOE violations >0.25 Å, no bond violations >0.05 Å, and no improper or dihedral angle violations >5° were chosen to represent the solution structure of liganded α1I.
The 20 lowest energy structures of α1I in its liganded state were used as the protein input templates. Restraints that were used in the original structure calculation, i.e. NOEs, H-bonds, and dihedral angle restraints, were included in the docking process to constrain the protein in its liganded conformation. Regions of the protein with low T1/T2 ratios, low values of the heteronuclear NOE, and low angular order (S2) values predicted from chemical shifts were allowed to be fully flexible during docking. HADDOCK requires a set of ambiguous interaction restraints (AIRs) at the binding interface that are divided into “active” and “passive” categories where active residues are those directly implicated in binding from experimental data and passive residues are their near neighbors. Residues on α1I for which active AIRs were generated were selected based on their having NMR CSPs >0.5 ppm and a loss of NMR signal intensity >70% upon titration of α1I with GLOGEN. Exclusion criteria were residues in regions of the protein that showed fast time scale dynamics in the heteronuclear NOE experiment and residues that were not solvent-accessible.
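The selection of active residues on α1I can be sketched as a simple filter. This is a minimal illustration, not the HADDOCK input format: the `ResidueData` class, the `select_active` function, and the example residues are hypothetical. The text does not state unambiguously whether the CSP and intensity criteria must both be met, so the sketch exposes that choice as the `require_both` parameter.

```python
from dataclasses import dataclass


@dataclass
class ResidueData:
    name: str
    csp_ppm: float          # chemical shift perturbation on peptide binding
    intensity_loss: float   # fractional loss of signal intensity (0 to 1)
    fast_dynamics: bool     # flagged by the heteronuclear NOE experiment
    solvent_accessible: bool


def select_active(residues, csp_cutoff=0.5, loss_cutoff=0.70, require_both=False):
    """Return names of residues that pass the inclusion and exclusion criteria."""
    selected = []
    for r in residues:
        shifted = r.csp_ppm > csp_cutoff
        broadened = r.intensity_loss > loss_cutoff
        hit = (shifted and broadened) if require_both else (shifted or broadened)
        # Exclusion criteria: fast time scale dynamics or not solvent-accessible.
        if hit and r.solvent_accessible and not r.fast_dynamics:
            selected.append(r.name)
    return selected


# Hypothetical residues, not data from the study.
example = [
    ResidueData("res_a", 0.8, 0.90, False, True),   # passes both criteria
    ResidueData("res_b", 0.7, 0.20, True, True),    # excluded: fast dynamics
    ResidueData("res_c", 0.1, 0.85, False, False),  # excluded: not accessible
    ResidueData("res_d", 0.6, 0.10, False, True),   # shifted only
]
active_either = select_active(example)
active_both = select_active(example, require_both=True)
```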
For the peptide input template, the initial model was obtained by computationally modifying an available crystal structure of a triple-helical collagen peptide (Protein Data Bank code 1Q7D) (47). The peptide in the crystal structure contains a GFOGER recognition motif, which was converted to GLOGEN using the mutagenesis function in PyMOL (The PyMOL Molecular Graphics System, Version 1.3, Schrödinger, LLC). Although the peptide is a homotrimer, each of the strands is in a unique chemical environment, referred to as the “leading,” “middle,” and “trailing” strands, as viewed from their N termini. To make sure that the triple-helical structure was maintained during the docking, hydrogen bonds present in the crystal structure were included as input restraints. AIRs were also generated for the peptide based on previous studies that showed the essentiality of the glutamate residue in the GLOGEN motif. It was known that one of the three glutamate residues is responsible for coordinating with the metal ion at the α1I, but it was unclear whether any one of the glutamate residues in the trimeric peptide model is preferred over the other two. To predict whether α1I has a preference for coordinating to any one of the three strands, a preliminary docking was conducted with an ambiguous distance restraint set such that any of the three glutamate residues of the triple-helical peptide may coordinate with the magnesium ion. Residues within 8 Å of a glutamate were defined as active, and residues between 8 and 12 Å from the glutamate were defined as passive. This length matches the diameter of the binding interface of α1I and is likely to cover all potential residues on the peptide that could make direct interactions with α1I. Based on the lowest energy cluster arising from the preliminary dock, a unique peptide interface was defined, and seven active and eight passive residues were selected as the “optimized” peptide AIRs. 
The active residues comprised Hyp10, Gly11, and Asn13 from the leading strand and Hyp110, Gly111, Asn113, and Gly114 from the middle strand. The passive residues comprised Gly8 and Leu9 from the leading strand; Gly108, Leu109, Pro115, and Hyp116 from the middle strand; and Pro215 and Hyp216 from the trailing strand. The leading strand glutamate-magnesium ion interaction was assigned as an unambiguous distance restraint as previous studies have confirmed its essentiality for binding (10). The docking process included a rigid body energy minimization step, which produced 1000 structures. The best 200 structures were subjected to a semiflexible simulated annealing step and then a final low temperature flexible refinement in explicit waters.
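The distance-based definition used for the preliminary peptide restraints (residues within 8 Å of a glutamate as active, those between 8 and 12 Å as passive) can be sketched as follows. This is a minimal illustration that assumes one representative coordinate per residue, because the text does not state which atoms were used for the distance measurement. The function name and the toy coordinates are hypothetical.

```python
import math


def classify_by_distance(glu_xyz, residue_xyz, active_cutoff=8.0, passive_cutoff=12.0):
    """Split residues into active and passive lists by distance (Å) from a glutamate.

    residue_xyz maps a residue name to an (x, y, z) tuple in Å.
    """
    active, passive = [], []
    for name, xyz in residue_xyz.items():
        d = math.dist(glu_xyz, xyz)
        if d <= active_cutoff:
            active.append(name)
        elif d <= passive_cutoff:
            passive.append(name)
        # Residues beyond the passive cutoff are left out of the restraints.
    return active, passive


# Toy coordinates (Å), not taken from any structure.
glu = (0.0, 0.0, 0.0)
toy_residues = {
    "near": (3.0, 4.0, 0.0),   # 5 Å away
    "mid": (0.0, 10.0, 0.0),   # 10 Å away
    "far": (0.0, 0.0, 15.0),   # 15 Å away
}
active, passive = classify_by_distance(glu, toy_residues)
```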
Mutagenesis was performed using the QuikChange II mutagenesis kit (Agilent Technologies) according to the manufacturer's instructions. The pET28a plasmid containing the WT α1I insert was used as the template with the following primers used to generate the R171A and E317A mutations: R171A: forward, 5′-AGCTTTTTTAAATGACCTTCTTGAAGCAATGGATATTGGTCCTAAACAGACA-3′; reverse, 5′-TGTCTGTTTAGGACCAATATCCATTGCTTCAAGAAGGTCATTTAAAAAAGCT-3′; E317A: forward, 5′-AAGCATTTCTTCAATGTCTCTGATGCCTTGGCTCTAGTCACCATTGTTAAA-3′; reverse, 5′-TTTAACAATGGTGACTAGAGCCAAGGCATCAGAGACATTGAAGAAATGCTT-3′. The presence of mutations was verified by DNA sequencing. For the R287A mutant, a synthetic gene encoding α1I with the mutation in the pET28a plasmid was ordered from DNA2.0. The mutants were expressed in 15N-labeled minimal medium following the same protocol used for the WT α1I integrin.
Analytical SEC was carried out using a Superdex 75 HR 10/30 column (GE Healthcare) with a bed volume of 23.6 ml on an ÄKTA™ purifier protein chromatography system. Samples (100 μl) containing α1I (10 μM) with or without GLOGEN (20 μM) were applied to the column equilibrated with 50 mM HEPES buffer, pH 7.4, 50 mM NaCl, and 5 mM MgCl2. A flow rate of 0.7 ml/min was maintained, and the elution was monitored by continuous measurement of UV absorbance at 280 nm (A280). The runs were conducted at room temperature.
All SAXS data were acquired at the Australian Synchrotron SAXS/WAXS beamline and the data collection and scattering-derived parameters are described in Table 2 according to the recommendations of the International Union of Crystallography Commission on Small-Angle Scattering (48). The SAXS experiments were set up in line with gel filtration chromatography as described by Gunn et al. (49). Samples (50 μl) containing unlabeled α1I (372 μM) with and without GLOGEN (744 μM) were injected onto a 2.4-ml gel filtration column (Superdex 75 PC 3.2/30, GE Healthcare) equilibrated with 50 mM HEPES buffer, pH 7.4, 50 mM NaCl, and 5 mM MgCl2. The runs were carried out at room temperature, and the flow rate was 0.2 ml/min. SAXS15ID software was used to analyze the detector images as averages of 10 sequential 2-s exposures, and the data were converted to individual I(q) SAXS profiles. The scattering intensity (I) was collected over the momentum transfer vector (q) range of 0.011–0.620 Å−1 (q = (4πsinθ)/λ where (2θ) is the scattering angle and λ is the x-ray wavelength, which was 1.0332 Å). For the final data sets, 367 and 357 data points were extracted within the range of 0.01–0.5 and 0.02–0.5 Å−1 from the original unliganded and GLOGEN-bound α1I data sets, respectively. The SAXS profiles were analyzed using the ATSAS program suite (50). The radius of gyration (Rg) was estimated using Guinier analysis using AUTORG. The maximum dimension of the scattering particles was estimated according to the pair distance vector distribution functions, P(r), using the program AUTOGNOM. The volume and mass of the scattering particles were estimated using AUTOPOROD, and the ab initio shapes of the scattering molecules were estimated using DAMMIF. The fitted models of the dimeric complex were constructed using PyMOL. The template monomeric complexes of the α1I-GLOGEN HADDOCK structure and the α2I-GFOGER crystal structure (22) were first duplicated. 
The two structures were aligned based on the six-residue recognition motif GLOGEN or GFOGER on different strands of the triple-helical peptides. Theoretical SAXS profiles of the models were generated and fitted to the experimental SAXS curve using CRYSOL (51). The statistical analysis of goodness of fit was performed as described by Mills et al. (52). The program OLIGOMER (53) was used to compute the proportions of monomer and dimer in both α1I and the α1I-GLOGEN complex.
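To make the Guinier step concrete, the following minimal sketch computes q from the scattering angle using the definition given above and estimates Rg from the low-q region with the Guinier approximation, ln I(q) = ln I0 − (Rg²/3)q². It is not the AUTORG algorithm, which also selects the fitting range automatically. The function names and the synthetic profile (a hypothetical particle with Rg = 17 Å and I0 = 100) are assumptions for illustration only.

```python
import math

WAVELENGTH_A = 1.0332  # x-ray wavelength in Å, from the text


def momentum_transfer(two_theta_rad, wavelength=WAVELENGTH_A):
    """q = 4*pi*sin(theta)/lambda, where two_theta_rad is the scattering angle."""
    return 4.0 * math.pi * math.sin(two_theta_rad / 2.0) / wavelength


def guinier_fit(q_values, intensities):
    """Fit ln I(q) = ln I0 - (Rg^2 / 3) q^2 by least squares; return (Rg, I0)."""
    x = [q * q for q in q_values]
    y = [math.log(i) for i in intensities]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.sqrt(-3.0 * slope), math.exp(intercept)


# Synthetic profile for a hypothetical particle (not data from the study).
TRUE_RG, TRUE_I0 = 17.0, 100.0
q_low = [0.011 + 0.003 * k for k in range(20)]  # q*Rg stays below about 1.2
profile = [TRUE_I0 * math.exp(-(q * TRUE_RG) ** 2 / 3.0) for q in q_low]
rg, i0 = guinier_fit(q_low, profile)
```

The Guinier approximation holds only at small q·Rg, which is why the sketch restricts the fit to the lowest-q points.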
The sequence of the peptide used in the study contained the integrin recognition sequence GLOGEN flanked by five GPO repeats. Its ability to self-assemble into the collagen-like triple-helical conformation was confirmed using circular dichroism (data not shown). The GLOGEN motif is the most potent ligand for α1β1 among the fibrillar collagen sequences, and it inhibits the binding of α1β1 to collagen type IV with an IC50 of ~3 μM (21). Comparison of two-dimensional 1H,15N TROSY spectra of the α1I domain in the presence and absence of the GLOGEN peptide revealed extensive CSPs upon addition of excess peptide (Fig. 1). Gradual titration of α1I with the peptide revealed that at a low GLOGEN:α1I ratio (<1:1) the two-dimensional 1H,15N SOFAST HMQC spectra of α1I exhibited severe line broadening for most peaks in the spectrum (Fig. 1C). The intensity of the peaks increased as the peptide concentration exceeded that of α1I, and at a GLOGEN:α1I ratio of ~3:1, most of the signals were recovered (Fig. 1D). Similar broadening has previously been reported upon titration of the α2I domain with a peptide containing the GFOGER recognition motif (54). Such broadening of NMR resonances is often the result of intermediate time scale chemical exchange. However, as the homotrimeric collagen-mimetic peptides potentially have three αI domain binding sites, the formation of a larger oligomeric complex at low peptide:α1I ratios is a possible explanation for the loss of signal intensity.
The oligomeric state of the α1I-GLOGEN complex was assessed using NMR by measuring the heteronuclear 1H,15N T1/T2 relaxation ratios for α1I domain in the absence and presence of a 2-fold excess of GLOGEN. The experimental data were compared with theoretical data generated using HYDRONMR (38) for the isolated α1I domain (Protein Data Bank code 1QCY (55)), isolated α2I domain (Protein Data Bank code 1AOX (56)), the monomeric GFOGER-α2I complex (Protein Data Bank code 1DZI (22)), and a model that consisted of two αI domains bound to one triple-helical peptide (Fig. 2). The model of the 2:1 complex was generated manually by docking a second α2I molecule onto the structure of the GFOGER-α2I complex. The theoretical T1/T2 ratios for the 2:1 complex were significantly higher than the rest of the models due to its size and shape. The fluctuation in the T1/T2 values is related to the orientation of the amide N–H bond vectors with respect to the principal axis frame of the rotational diffusion tensor in the calculation. This kind of variation is typical for elongated ellipsoidal proteins such as that of the 2:1 complex. The global T1/T2 ratio measured for the α1I-GLOGEN complex is in good agreement with the theoretical data derived for the monomeric complex and is consistent with the formation of a 1:1 complex between α1I domain and the triple-helical GLOGEN at high GLOGEN:α1I domain ratios.
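The link between the T1/T2 ratio and particle size can be illustrated with a widely used approximation for the isotropic rotational correlation time, τc ≈ sqrt(6·T1/T2 − 7)/(4π·νN), where νN is the 15N resonance frequency. This is a rough sketch under the assumptions of slow, isotropic tumbling and negligible internal motion. It is not the HYDRONMR calculation used in the study, which accounts for molecular shape. The function name and the example ratios are hypothetical.

```python
import math

GAMMA_N_OVER_GAMMA_H = 0.101329  # |gamma(15N)| / gamma(1H)


def tau_c_from_ratio(t1_over_t2, proton_mhz=600.0):
    """Estimate the rotational correlation time (s) from a 15N T1/T2 ratio.

    Uses tau_c ~ sqrt(6*T1/T2 - 7) / (4*pi*nu_N), valid only in the
    slow-tumbling limit with negligible internal motion.
    """
    nu_n_hz = proton_mhz * 1.0e6 * GAMMA_N_OVER_GAMMA_H
    return math.sqrt(6.0 * t1_over_t2 - 7.0) / (4.0 * math.pi * nu_n_hz)


# Hypothetical ratios standing in for a smaller and a larger species.
tau_small = tau_c_from_ratio(15.0)
tau_large = tau_c_from_ratio(40.0)
```

A larger ratio maps to a longer correlation time, which is the qualitative reason a 2:1 complex is expected to show higher T1/T2 values than a 1:1 complex.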
The structure of α1I domain bound to GLOGEN was determined using a sample containing 15N,13C-labeled α1I (1.2 mM) in the presence of Mg2+ (5 mM) and unlabeled GLOGEN (2.4 mM). The pattern of CSPs observed upon addition of Mg2+ to α1I was consistent with the position of the metal ion in the crystal structure of GFOGER-α2I complex (31). The coordination of Mg2+ in our structure was therefore assumed to be similar to the α2I complex structure. An ensemble of 20 conformers representing the structure of α1I bound to GLOGEN (from a total of 40 structures calculated) is shown in Fig. 3. Structural statistics for the ensemble are shown in Table 1. The structure assumes a typical Rossmann fold consisting of five parallel β-strands and one antiparallel β-strand surrounded by six α-helices. Comparison of the lowest energy model with the unliganded structure of α1I (Fig. 4) shows that the central β-sheet core remains similar with a root mean square deviation on the Cα of 1.1 Å. Major conformational differences were identified in two regions. The first region is the C helix of the unliganded state that becomes completely unfolded upon peptide binding. Unwinding of the short C helix results in extending the βE-α6 loop (residues 281–292). Very few NOEs could be assigned in the βE-α6 loop in NOESY spectra of the complex. The TALOS+-predicted angular order (S2) values (43) showed a clear decrease in the predicted order in the βE-α6 loop of α1I upon binding to GLOGEN (Fig. 4A). This is supported by comparison of the 1H,15N heteronuclear NOE for α1I unliganded and in complex with GLOGEN that revealed a decrease upon peptide binding, consistent with increased dynamics in this region in the bound state (Fig. 4B). These findings are consistent with the lower precision observed in this region of the structure of α1I (Fig. 3A).
The second region undergoing major conformational change is the C-terminal helix (helix 7), which is displaced downward by 12 Å upon GLOGEN binding relative to its position in the unliganded structure. In the unliganded state, helix 7 is linked to the C helix by a salt bridge between Arg287 and Glu317. This salt bridge is broken in the complex with GLOGEN. In its new position in the complex, Glu317 is positioned such that it may form a salt bridge with Arg171 from helix 1 (Fig. 4F). To investigate the importance of the salt bridges in regulating the conformational changes, single point mutants with R171A, R287A, and E317A substitutions were studied (Fig. 5).
The 1H,15N TROSY spectrum of the R171A mutant (275 μM) in the absence and presence of GLOGEN (500 μM) showed a high degree of similarity to the spectrum of the unliganded and peptide-bound states of WT α1I, respectively. Chemical shift changes were only observed for residues located in close proximity to the mutation site in each case (Fig. 5, A and B), suggesting that the mutation had little impact on the structure of either the unliganded or peptide-bound state of α1I.
In contrast, the 1H,15N TROSY spectrum of R287A (200 μM) in the absence of GLOGEN showed substantial differences from the spectrum of the unliganded WT α1I. Comparison of the spectrum with the GLOGEN-bound WT α1I, however, showed a much higher degree of similarity (Fig. 5, C and D). In particular, a strong peak from the spectrum of the unliganded R287A coincides with the peak corresponding to the C-terminal residue, Ile331, in the GLOGEN-bound WT α1I (Fig. 5G). The chemical shift and intensity of this peak suggest that this cross-peak in the spectrum of the R287A mutant also represents the C-terminal residue, implying that helix 7 is displaced in the R287A mutant in the absence of peptide. Addition of GLOGEN (400 μM) to R287A induced relatively minor chemical shift changes, and the spectrum also resembles the spectrum of the GLOGEN-bound WT α1I. With the aid of the 15N,1H assignments of the GLOGEN-bound WT α1I, residues that were more severely perturbed by the R287A mutation in the bound state were identified. The majority of perturbed residues are located in close proximity either to the site of mutation or to residues within or adjacent to the MIDAS (Fig. 6).
As was the case with R287A, the 1H,15N TROSY spectrum of E317A in the absence of GLOGEN displayed significant differences from the spectrum of unliganded WT α1I. The spectrum appeared to be more similar to the GLOGEN-bound state of WT α1I (Fig. 5, E and F). The spectrum of E317A also contains a strong peak at a chemical shift that is coincident with the C-terminal residue (Ile331) in the GLOGEN-bound spectrum of WT α1I, suggesting that helix 7 in E317A is displaced (Fig. 5H). Addition of GLOGEN (400 μM) to E317A resulted in relatively minor changes to the spectrum. However, the pattern of perturbations relative to WT α1I was different from that observed with R287A. Fewer perturbations were observed for residues adjacent to the MIDAS in the spectrum of E317A. In contrast, the majority of the perturbed residues were located in helix 1, βE-α6 (the uncoiled C helix), and βF sheet (Fig. 6).
The CSPs and chemical exchange broadening that were observed in the two-dimensional 1H,15N TROSY spectrum of WT α1I upon GLOGEN binding were assessed to map the binding interface of the complex. CSPs and chemical exchange broadening are observed for residues that undergo a change in their chemical environment upon binding. This can be caused by both direct interaction with the ligand and indirect effects such as those due to conformational change. The analysis revealed that 22 residues lost more than 70% of their signal intensity on peptide binding. These residues were mostly located in the three MIDAS loops, loop βE-α6, and helix 6. CSPs, on the other hand, mapped both to the same regions and to residues in helix 7 (Fig. 4, C and D).
A structure of the α1I-GLOGEN complex was generated using the data-driven docking program HADDOCK (45, 46) (Protein Data Bank code 2M32). Based on the selection criteria described above, residues in helix 7 were excluded from the definition of the binding interface as they showed no broadening, and it was inferred that the CSPs were due to the observed conformational change in this region rather than a direct interaction with GLOGEN. In addition, residues 281–292 from the βE-α6 loop were excluded from the binding interface as they showed fast time scale dynamics in the heteronuclear NOE experiment and were therefore deemed to be disordered in the bound state. The residues of α1I defined as active for the HADDOCK calculation comprised the surface-exposed residues with CSP >0.5 ppm as well as residues with a loss of signal intensity >70%, namely Ser154, Ile155, Tyr156, Arg218, Gln219, Glu255, and His257. Only Asn153 satisfied the criteria for being a passive residue. These AIRs were used as input for the HADDOCK calculation. No intermolecular NOEs were included in the HADDOCK calculations. Analysis of the 200 final structures using their HADDOCK score (45, 46) produced a cluster of 10 structures, all of which fell within the 20 best scoring structures. This cluster was selected to represent the structure of triple-helical GLOGEN peptide bound to α1I (Fig. 7). The lowest energy structure of the 10 was selected to best represent the model of the complex. It shows the peptide binding along a “trench” above the MIDAS (Fig. 7C). Consistent with the α2I-GFOGER complex crystal structure (22), the α1I molecule binds exclusively to the leading (cyan) and middle (orange) strands as shown in Fig. 7F. The trailing strand (green) does not make any interaction with α1I. The binding orientation of the peptide, however, is different from the α2I crystal structure in which GFOGER sits along the edge of the trench (Fig. 7, compare C and D) (22). 
A more detailed discussion and comparison of the two structures is presented below.
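The residue selection described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the script used for the HADDOCK run. The `Residue` record, its field names, and the example values are hypothetical. Only the two cutoffs (CSP > 0.5 ppm, signal loss > 70%), the excluded βE-α6 range (281–292), and the helix 7 range (317–329, as given in the Discussion) come from the text. The sketch also assumes that surface exposure is required for both criteria, which is one possible reading of the selection rule.

```python
from dataclasses import dataclass

CSP_CUTOFF_PPM = 0.5   # from the text
LOSS_CUTOFF = 0.70     # from the text (>70% loss of signal intensity)
# Excluded regions: betaE-alpha6 loop (281-292) and helix 7 (317-329)
EXCLUDED_RANGES = [(281, 292), (317, 329)]


@dataclass
class Residue:
    """Hypothetical per-residue record; field names are illustrative."""
    number: int
    csp_ppm: float
    intensity_free: float
    intensity_bound: float
    surface_exposed: bool


def signal_loss(res):
    """Fractional loss of peak intensity on binding (0 = none, 1 = complete)."""
    if res.intensity_free <= 0:
        raise ValueError("free-state intensity must be positive")
    return 1.0 - res.intensity_bound / res.intensity_free


def is_excluded(number):
    """True if the residue lies in a region removed from the interface."""
    return any(lo <= number <= hi for lo, hi in EXCLUDED_RANGES)


def select_active(residues):
    """Return sorted residue numbers that meet either perturbation cutoff."""
    active = []
    for res in residues:
        if is_excluded(res.number) or not res.surface_exposed:
            continue
        if res.csp_ppm > CSP_CUTOFF_PPM or signal_loss(res) > LOSS_CUTOFF:
            active.append(res.number)
    return sorted(active)


# Hypothetical input values, for illustration only (not measured data)
example = [
    Residue(154, 0.62, 100.0, 20.0, True),   # large CSP and broadened
    Residue(219, 0.10, 100.0, 25.0, True),   # broadened only
    Residue(285, 0.80, 100.0, 10.0, True),   # inside the excluded loop
    Residue(320, 0.90, 100.0, 95.0, True),   # helix 7: CSP without broadening
    Residue(200, 0.05, 100.0, 90.0, True),   # unperturbed
    Residue(160, 0.70, 100.0, 15.0, False),  # buried
]
ACTIVE = select_active(example)
```

With these made-up inputs only residues 154 and 219 pass, which mirrors how helix 7 and the disordered loop are kept out of the restraint list even when their peaks are strongly perturbed.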
Potential causes of the broadening observed in the 1H,15N TROSY spectrum of α1I at lower peptide concentrations were investigated using analytical SEC and SAXS. Analysis of the SEC elution profile of a sample containing α1I and GLOGEN peptide revealed the presence of two distinct peaks. The first of these eluted at a volume similar to unliganded α1I (11.62 ml), whereas the second, which was inferred to correspond to the α1I-peptide complex, eluted considerably earlier (10.02 ml) (Fig. 8). The >1.5-ml shorter elution volume suggested that the molecular weight of the complex species was much higher than that of the unliganded state. Comparison of the elution volume with those of proteins of different molecular weights on the same column3 suggested that this peak corresponds to a species that is much bigger than the expected 1:1 GLOGEN-bound α1I species (molecular mass, 27.4 kDa).
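The comparison with calibration proteins can be illustrated with a generic log-linear SEC calibration, in which log10(mass) is assumed to vary linearly with elution volume. The sketch below is an assumption-laden illustration: the function names are hypothetical, and the two calibration points are placeholders chosen for arithmetic convenience, not the elution volumes of the standards referred to in the footnote.

```python
import math


def fit_calibration(standards):
    """Least-squares fit of log10(mass_kDa) = intercept + slope * volume_ml.

    standards: list of (elution_volume_ml, mass_kda) pairs.
    """
    if len(standards) < 2:
        raise ValueError("at least two standards are required")
    xs = [volume for volume, _ in standards]
    ys = [math.log10(mass) for _, mass in standards]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("standards must have distinct elution volumes")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def apparent_mass(volume_ml, calibration):
    """Apparent mass (kDa) for an elution volume under the fitted line."""
    intercept, slope = calibration
    return 10 ** (intercept + slope * volume_ml)


# PLACEHOLDER standards (hypothetical values, not from the paper or the column)
placeholder_standards = [(9.0, 100.0), (13.0, 10.0)]
calibration = fit_calibration(placeholder_standards)

# Elution volumes reported in the text
complex_peak = apparent_mass(10.02, calibration)
free_peak = apparent_mass(11.62, calibration)
```

Whatever the true calibration, a negative slope means that the peak at 10.02 ml maps to a larger apparent mass than the peak at 11.62 ml. Elongated species such as a triple-helical peptide complex may also elute earlier than globular standards of the same mass, so the estimate is only indicative.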
Synchrotron SAXS data were acquired for α1I in the absence and presence of GLOGEN (Fig. 9 and Table 2). The SAXS experiments were set up in line with SEC. The sample containing the mixture of α1I and GLOGEN eluted earlier than the unliganded α1I sample, indicating a larger species in the mixture. Porod analysis of the scattering data indicated that the estimated molecular mass of α1I-GLOGEN complex was 40.7 ± 0.5 kDa, whereas that of the unliganded α1I was 19.2 ± 0.2 kDa. Although the SAXS-derived parameters such as Rg and volume were stable across much of the peak (varying by less than 2%) and consistent with a complex containing two α1I molecules, the resolution of the small bed volume column used was insufficient to cleanly separate the larger species from later eluting species such as 1:1 α1I-GLOGEN complex or free α1I. Indeed, singular value decomposition analysis using SvdPlot in PRIMUS (53) of the entire elution consisting of 11 SAXS data sets from across the peak suggested a mixture of at least three species.
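The stoichiometry argument can be checked with simple arithmetic. The sketch below uses only the three masses quoted in the text. As an assumption made purely for illustration, the mass of the peptide trimer is taken as the expected 1:1 complex mass minus the SAXS estimate for unliganded α1I, which mixes an expected value with an experimental one.

```python
ALPHA1I_KDA = 19.2        # Porod estimate for unliganded alpha1I (from the text)
COMPLEX_1TO1_KDA = 27.4   # expected mass of the 1:1 complex (from the text)
OBSERVED_KDA = 40.7       # Porod estimate for the complex peak (from the text)

# Assumption: peptide trimer mass = 1:1 complex mass minus the alpha1I estimate
PEPTIDE_KDA = COMPLEX_1TO1_KDA - ALPHA1I_KDA


def expected_mass(n_domains):
    """Nominal mass (kDa) of n alpha1I domains bound to one peptide trimer."""
    return n_domains * ALPHA1I_KDA + PEPTIDE_KDA


def closest_stoichiometry(observed_kda, max_domains=3):
    """Number of bound domains whose nominal mass is nearest the observation."""
    return min(
        range(1, max_domains + 1),
        key=lambda n: abs(expected_mass(n) - observed_kda),
    )


BEST = closest_stoichiometry(OBSERVED_KDA)
```

Under these assumptions the nominal masses are 27.4 kDa (1:1) and 46.6 kDa (2:1), so the observed 40.7 kDa lies closest to the 2:1 species. An estimate somewhat below the nominal 2:1 mass would be in keeping with a minor contribution from smaller species.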
The Rg values, distance distribution functions P(r), and calculated maximum dimension (Dmax) (Fig. 9) also suggested the presence of a GLOGEN-bound α1I species that was considerably bigger and more elongated than the unliganded α1I or a putative 1:1 α1I-GLOGEN complex. Therefore, we postulated that a complex composed of two α1I molecules bound to a single homotrimeric peptide may have formed. The α1I domain can in theory bind to any two of the three strands of the peptide to form three different combinations of two α1I domains bound to one GLOGEN triple helix (2:1 complex). These three models were built using the lowest energy HADDOCK model as a template. Two of the three combinations showed significant steric clashes and were excluded from further investigation (data not shown). Although the SAXS data were likely to be from a mixture, the Porod mass estimate suggested that the volume fraction of any smaller species was reasonably small (~10%). Therefore, ab initio reconstruction from the SAXS data was performed, and an average-filtered shape envelope was generated. Although the normalized spatial discrepancy of the shape envelopes was low (0.711 ± 0.067), a relatively high reduced χ2 statistic (χv2) of 2.13 ± 0.02 was noted from the DAMMIF reconstructions. This appears to be largely due to the detector count statistics of the synchrotron SAXS data underestimating the true uncertainty but may also reflect the inability of a single shape to adequately describe the scattering from the mixture. The HADDOCK-derived 2:1 model shows good shape correspondence with the SAXS envelope (Fig. 9). Despite the overall shape similarity between the HADDOCK-derived model and average-filtered shape (normalized spatial discrepancy of 0.94), the fit of the theoretical scattering profile from this model to the SAXS data generated using CRYSOL (51) was relatively poor with a χv2 of 5.3. There are several plausible reasons why this is the case.
Again, the issues of detector count statistics and the presence of other species (e.g. free peptide, unbound α1I, and monomeric complex) would have contributed to the high value. In addition, the GPO repeats at the tails of the GLOGEN peptides from the HADDOCK model fit particularly poorly in the envelope, suggesting that they may be flexible. To address the issue of mixed species, OLIGOMER (53) was used to estimate the scattering contribution of the different species present. The best fit obtained was for 88% 2:1 complex, 7% 1:1 complex, and 5% free peptide. The χv2 for this volume fraction composition was 3.1, which represents a statistically significant improvement in the goodness of fit (F = 1.71; PF < 3 × 10−7) over the 2:1 complex alone or any other combination of species tested. It seems likely that further improvements in goodness of fit could be obtained by considering flexibility either in the GLOGEN peptide tails or in the βE-α6 loop and helix 7 and/or some variability in the relative α1I domain orientation in the 2:1 complex. Nonetheless, in relative terms, the HADDOCK-derived 2:1 model provides a statistically significant improvement in fit compared with any other 2:1 model tested including the corresponding 2:1 complex based on the GFOGER-α2I crystal structure, suggesting that the difference in peptide binding orientation observed between the α1I and α2I complex is genuine.
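The mixture analysis rests on two simple quantities: the scattering of a mixture modeled as a volume-fraction-weighted sum of component profiles, and the reduced χ2 of that model against the data. The sketch below illustrates both, together with the ratio of the two reduced χ2 values quoted in the text. It is not the OLIGOMER implementation, which additionally optimizes the fractions. All function names and the toy profiles are hypothetical.

```python
def mixture_intensity(fractions, profiles):
    """I(q) = sum_k v_k * I_k(q) for component profiles on a common q grid."""
    if len(fractions) != len(profiles):
        raise ValueError("one volume fraction is needed per profile")
    n_points = len(profiles[0])
    if any(len(profile) != n_points for profile in profiles):
        raise ValueError("all profiles must share the same q grid")
    return [
        sum(v * profile[i] for v, profile in zip(fractions, profiles))
        for i in range(n_points)
    ]


def reduced_chi2(observed, sigma, model, n_params):
    """Sum of squared, error-weighted residuals divided by degrees of freedom."""
    dof = len(observed) - n_params
    if dof <= 0:
        raise ValueError("not enough data points for the number of parameters")
    total = sum(((o - m) / s) ** 2 for o, m, s in zip(observed, model, sigma))
    return total / dof


def chi2_ratio(chi2_single, chi2_mixture):
    """Ratio of reduced chi-square values, as used for an F-type comparison."""
    return chi2_single / chi2_mixture


# Volume fractions from the text; the two-point profiles are toy values only
toy_profiles = [[1.0, 1.0], [2.0, 0.0], [0.0, 4.0]]
toy_mixture = mixture_intensity([0.88, 0.07, 0.05], toy_profiles)

# Reduced chi-square values from the text: 2:1 model alone vs best mixture
F_VALUE = chi2_ratio(5.3, 3.1)
```

The ratio 5.3/3.1 rounds to 1.71, which matches the F statistic reported above. The associated probability depends on the degrees of freedom, which are not restated here.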
We have determined the structure of the α1I domain bound to a triple-helical peptide that is a mimic of its natural ligand collagen. The structure reveals that binding is accompanied by movement of the MIDAS loops as well as major conformational changes in the C helix (residues Leu282–Gly288), which becomes uncoiled in the bound state, and a displacement of the C-terminal helix (helix 7), which moves by ~12 Å relative to its position in the unliganded state. These structural rearrangements are consistent with those observed in the structure of the α2I upon binding to GFOGER. Further validation of the current structure is provided by a recent analysis of the dynamics of α1I in complex with collagen type IV using hydrogen-deuterium exchange mass spectrometry (57). The study provided evidence of increased conformational mobility in the region between residues 283 and 290, corresponding to loop βE-α6 (including the uncoiled C helix), as well as the region between residues 318 and 332, which corresponds to helix 7. The intervening elements between the two regions also showed moderate conformational changes as compared with the unliganded state of α1I. This is consistent with our NMR results, in which T1/T2 relaxation (Fig. 2), heteronuclear NOE experiments (Fig. 4B), the S2 values predicted from chemical shifts (Fig. 4A), and the NMR structures themselves (Fig. 3A) all indicate conformational changes and increased backbone dynamics in these regions of α1I upon GLOGEN binding.
The current structure provides a basis to investigate the mechanisms by which collagen binding, which is accompanied by subtle changes close to the MIDAS, leads to the large conformational changes that are thought to propagate signaling. In the peptide-bound state, the MIDAS rearrangement is associated with a change in the conformation of the side chain of Tyr285, which rotates away from its original position and exposes an accessible binding site for collagen or collagen peptides. The C helix unfolds, and the resulting βE-α6 loop is highly flexible as indicated by the NMR relaxation data and bends away from the collagen peptide so that it is not involved directly in the binding interaction. Peptide binding also causes a kink in helix 6 (residues Glu293–Ala303), which results in a small displacement (~4.5 Å) in the adjacent short β-strand (βF) and the large downward movement of helix 7 (residues Glu317–Glu329). This downward movement of helix 7 has been postulated to be key for integrin signaling and mediates communication of the α1I domain with the βI domain of the integrin β subunit (58). A new salt bridge between Glu317 and Arg171 was observed in the bound state, linking the displaced helix 7 to helix 1. To test whether this salt bridge is essential for stabilizing the active state of α1I, we made the R171A mutation and examined its effect upon the conformation of α1I in its unliganded and liganded states using 1H,15N TROSY NMR. However, the similarity of the WT and R171A spectra in both their unliganded and GLOGEN-bound states suggests that the salt bridge is not crucial for collagen binding or for α1I activation.
In contrast, either the R287A or E317A mutation resulted in large perturbations being observed in the 1H,15N TROSY spectra of the unliganded proteins relative to the spectrum of WT α1I. In the case of E317A, the perturbations we observed are consistent with a previous study (30). In that case, the authors proposed that E317A represents an activated conformation of α1I as the mutation of Glu317 enhanced signaling as shown by increased ERK activation and down-regulation of collagen synthesis. It was suggested that the E317A mutation led to a downward displacement of helix 7 not due to disruption of the salt bridge with Arg287 but as a result of the loss of a favorable monopole-helical dipole interaction between the negative charge of Glu317 and the partial positive charge of the helix 7 dipole (30). In support of this interpretation, the R287A mutation in the same study did not result in similar activation of the intact receptor.
Our results do not support this interpretation. We have acquired 1H,15N TROSY spectra for both mutants, and with the benefit of full assignments for WT α1I, the spectra provide strong evidence that helix 7 is displaced in both cases. However, it is also apparent from analysis of the spectra that the two mutations have different effects on residues at the collagen-binding site of α1I (Fig. 6). Thus, we believe that the differences observed in the previous study may have resulted from the mutations altering the collagen binding affinity of the two proteins.
The HADDOCK model of α1I in complex with GLOGEN shows that the peptide binds along a surface trench above the MIDAS on α1I (Fig. 7C). In contrast, in the structure of α2I in complex with GFOGER, the peptide binds at the edge of a similar trench (Fig. 7D) (22). Our SAXS data support the HADDOCK model of α1I-GLOGEN, whereas a model based on the α2I-GFOGER crystal structure fit the SAXS data significantly less well. Analysis of the binding interfaces provides some indications as to the structural basis of these differences. First, the asparagine residue of the leading strand GLOGEN motif sits in an acidic pocket formed by Asp253 and Gly254 of α1I (Fig. 7E). Replacing the asparagine with a bulkier arginine could potentially create a steric clash in that pocket, which may account for the observation that GLOGEN binds more potently to the α1I domain than GFOGER. Second, comparison of the sequences indicates a charge reversal in which residue Arg218 in α1I corresponds to Asp219 in α2I. In the structure of the α2I-GFOGER complex, Asp219 is located at the edge of the trench and forms a salt bridge with one of the arginine residues in the GFOGER motif. The charge reversal to Arg218 in α1I precludes the formation of a similar interaction in the α1I complex, which may contribute to both the peptide binding preferences observed for the two αI domains (15, 21) and the subtle change in binding orientation of the peptides between the two complexes.
Homotrimeric helical peptides have been used widely as collagen mimetics and represent excellent tools to study the binding of collagen to collagen receptors. They have been especially helpful in the process of identifying specific recognition motifs for collagen-binding integrins (17). For such homotrimeric peptides, there are potentially three equivalent binding sites that may support binding to integrin αI domains. Therefore, in a case where the individual strands in the peptide are in the appropriate orientation for αI binding and no steric hindrance is present, more than one αI molecule should in theory be able to bind to the same peptide simultaneously. Herein, our size exclusion and SAXS results provide evidence for the existence of a dimeric complex for α1I bound to GLOGEN. To our knowledge, this is the first report describing a dimeric complex between an αI domain and a triple-helical collagen peptide, but we believe that the SEC result presented by Lambert et al. (54) may also demonstrate the same phenomenon in α2I upon peptide binding, although it was explained as an artifact resulting from the elution of the rigid, elongated helical peptide. In the case of α1I bound to GLOGEN, T1/T2 NMR relaxation data confirmed the formation of a 1:1 monomeric complex at higher peptide concentrations. Based on these observations, we propose that in the presence of GLOGEN α1I adopts a dynamic equilibrium between an unliganded state, a dimeric complex state ((α1I)2-GLOGEN), and a monomeric complex state (α1I-GLOGEN). Under conditions where the GLOGEN concentration is low with respect to the α1I, the dimeric complex state dominates, but as the peptide:α1I ratio increases, the equilibrium position shifts to favor the formation of monomer. However, as collagen is abundant in the ECM and αI domains recognize heterotrimeric collagen sequences, we believe that this dimeric complex is unlikely to dominate in vivo and may not be biologically significant.
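The proposed shift from the dimeric to the monomeric complex with increasing peptide:α1I ratio can be illustrated with a simple equilibrium sketch. The model below is an assumption made for illustration only: it treats each peptide trimer as carrying two equivalent, independent α1I-binding sites with a single microscopic dissociation constant. The value of `kd` and the concentrations are hypothetical, since no affinity is given in this section.

```python
def species(alpha_total, peptide_total, kd):
    """Concentrations of (free alpha1I, 1:1 complex, 2:1 complex).

    Assumes two equivalent, independent sites per peptide trimer with
    microscopic dissociation constant kd (same units as the totals).
    """
    if alpha_total < 0 or peptide_total < 0 or kd <= 0:
        raise ValueError("totals must be non-negative and kd positive")

    def excess(free):
        # Mass balance for alpha1I: free + bound - total
        bound = 2.0 * peptide_total * free / (kd + free)
        return free + bound - alpha_total

    # excess() increases monotonically with free, so bisection is safe
    lo, hi = 0.0, alpha_total
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0:
            hi = mid
        else:
            lo = mid
    free = 0.5 * (lo + hi)

    theta = free / (kd + free)                     # occupancy of one site
    one_to_one = peptide_total * 2.0 * theta * (1.0 - theta)
    two_to_one = peptide_total * theta ** 2
    return free, one_to_one, two_to_one


# Hypothetical conditions (arbitrary concentration units, hypothetical kd)
LOW_PEPTIDE = species(alpha_total=200.0, peptide_total=50.0, kd=10.0)
HIGH_PEPTIDE = species(alpha_total=200.0, peptide_total=800.0, kd=10.0)
```

In this toy model the 2:1 species exceeds the 1:1 species when peptide is limiting, and the order reverses when peptide is in excess, which is the qualitative behavior proposed above. The model makes no claim about the actual affinities or about cooperativity between the two sites.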
The structure of α1I bound to GLOGEN suggests that the mechanism of signaling in this receptor is similar to that reported previously for α2I. However, the structure suggests that the mode of peptide binding to α1I is subtly different from that observed with α2I. The current structure provides a rationale for the observed differences in binding specificity between α1I and α2I.
Part of this research was undertaken on the SAXS/WAXS beamline at the Australian Synchrotron, Victoria, Australia; we thank Dr. Nigel Kirby and the other beamline staff for their assistance.
*This work was supported in part by a grant from the Monash-Nottingham Research Fund.
Chemical shifts of all assigned resonances of the GLOGEN-bound α1I were deposited in BioMagResBank under accession number 18942.
3The typical elution time of albumin and ovalbumin from a gel filtration column (Superdex 75 HR 10/30 column, GE Healthcare) was published by European Molecular Biology Laboratory (EMBL) on the website www.embl.de/pepcore/pepcore_services/protein_purification/chromatography/superdex75HR10-30.
2The abbreviations used are: