A high-resolution structure of the human MHC-I molecule HLA-A*1101 is presented in which it forms a complex with a sequence homologue of a peptide that occurs naturally in hepatitis B virus DNA polymerase. The sequence of the bound peptide is AIMPARFYPK, while that of the corresponding natural peptide is LIMPARFYPK. The peptide does not make efficient use of the middle E pocket for binding, which leads to a rather superficial and exposed binding mode for the central peptide residues. Despite this, the peptide binds with high affinity (IC50 of 31 nM).
Major histocompatibility class I (MHC-I) molecules are essential for the control of infections by intracellular pathogens (Klein et al., 1993 ). During chaperone-assisted folding in the endoplasmic reticulum (Elliott & Williams, 2005 ), each MHC-I molecule binds a peptide from the cytoplasm and transports it to the cell surface, where the peptide–MHC complexes are scrutinized by cytotoxic T lymphocytes (CTL). This sampling of the intracellular protein environment allows CTLs to identify and eliminate infected cells (Townsend & Bodmer, 1989 ). In humans, MHC molecules are referred to as human leukocyte antigens (HLAs).
Peptides bind to MHC-I molecules in a special peptide-binding groove, where six binding pockets (A–F) determine the specificity of the MHC-I molecule (Garrett et al., 1989 ). Most peptides binding to MHC-I molecules consist of 8–11 residues, but binding of peptides up to 14 residues in length has been reported (Probst-Kepper et al., 2004 ). MHC-I molecules are highly polymorphic and more than 1000 different variants, each with its own binding specificity, exist in the human population (http://www.anthonynolan.org.uk/HIG/index.html). Studies of the many MHC-I alleles have allowed a grouping based on peptide-binding preferences and so far 12 different supertypes have been identified (Lund et al., 2004 ; Sette & Sidney, 1999 ). More than 99% of all humans carry MHC-I molecules belonging to at least one of these supertypes.
HLA-A*1101 is known to be important for the control of infections by many different pathogens including HIV (Culmann et al., 1991 ), Epstein–Barr virus (Gavioli et al., 1993 ) and hepatitis B virus (HBV; Achour et al., 1986 ). It is one of the most common MHC-I alleles and is present in up to 27% of some Asian populations (Bodmer et al., 1999 ). Furthermore, it belongs to the A3 supertype, the second most common supertype, found in 44% of the human population (Sette & Sidney, 1999 ). Thus, peptides targeting HLA-A*1101 are attractive for inclusion in peptide-based vaccines as they afford broad population coverage and understanding HLA-A*1101 will aid in identifying such peptides. Here, we present the 1.6 Å X-ray structure of a complex of HLA-A*1101 with an HBV peptide homologue.
The complex of HLA-A*1101 was prepared as described previously (Blicher et al., 2005 ; Pedersen et al., 1995 ; Sidney et al., 1996 ). Briefly, the α-chain of HLA-A*1101 and β2-microglobulin (β2m) were produced recombinantly in Escherichia coli and purified under denaturing conditions. β2m was refolded on its own, while the complex was refolded by dilution of denatured α-chain into a folding buffer containing excess (folded) β2m and peptide (200 µM α-chain, 1.0 mM β2m, 2.0 mM peptide). Following stepwise concentration and centrifugation to remove misfolded protein, excess peptide and β2m were removed using size-exclusion chromatography (Superdex 75 column) and the complexes were concentrated to a final concentration of 6.0 mg ml−1 in 0.1 M MES pH 6.5.
The peptide used for folding of the HLA-A*1101 complex was purchased from Schafer-N (Copenhagen) as a crude preparation and was found to be approximately 87% pure (see §3). The peptide was synthesized using FMOC chemistry.
The final complex consists of residues 1–275 of the α-chain, corresponding to the extracellular part, residues 1–99 of β2m, corresponding to the native protein, and the peptide.
Crystals of HLA-A*1101 were prepared by vapour-diffusion experiments using the hanging-drop technique. Crystallization experiments were set up at room temperature (~298 K), with individual drops consisting of 2 µl protein solution (6.0 mg ml−1 in 0.1 M MES pH 6.5) and 2 µl crystallization buffer. Each drop was equilibrated over a reservoir containing 500 µl crystallization buffer. The crystals grew from a crystallization buffer containing 30% PEG 5000 MME and 0.2 M ammonium sulfate (Crystal Screen 2, Hampton Research), appearing as clusters of monoclinic plates within 4 d and reaching their final size of 0.3 × 0.1 × 0.01 mm in three weeks. Crystals were cryocooled in liquid nitrogen directly from the crystallization buffer. A data set to 1.60 Å resolution was collected at the ID14-1 beamline at ESRF (Grenoble, France) using a Quantum 4 CCD detector. The space group was P21, with unit-cell parameters a = 58.0, b = 79.7, c = 56.3 Å, β = 116.4° and one complex per asymmetric unit. The data set was processed with the HKL suite (Otwinowski & Minor, 1997 ), giving the data statistics outlined in Table 1 .
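As a plausibility check on the statement of one complex per asymmetric unit, the unit-cell volume and the Matthews coefficient can be estimated from the parameters above. The following is a minimal sketch only. The complex mass of about 45 kDa is an assumed round figure that is not stated in the text, Z = 2 follows from space group P21 with one complex per asymmetric unit, and the solvent-fraction formula is the usual Matthews approximation.

```python
import math

# Unit-cell parameters quoted in the text (lengths in angstroms, angle in degrees).
A, B, C, BETA_DEG = 58.0, 79.7, 56.3, 116.4

# ASSUMPTION: approximate mass of alpha-chain + beta2m + peptide; not given in the text.
ASSUMED_COMPLEX_MASS_DA = 45000.0

# P21 has two asymmetric units per cell; one complex per asymmetric unit gives Z = 2.
Z = 2


def monoclinic_cell_volume(a, b, c, beta_deg):
    """Volume of a monoclinic cell: V = a * b * c * sin(beta)."""
    return a * b * c * math.sin(math.radians(beta_deg))


def matthews_coefficient(volume, z, mass_da):
    """Crystal volume per unit of protein mass (cubic angstroms per dalton)."""
    return volume / (z * mass_da)


def solvent_fraction(vm):
    """Approximate solvent fraction from the Matthews coefficient."""
    return 1.0 - 1.23 / vm


volume = monoclinic_cell_volume(A, B, C, BETA_DEG)
vm = matthews_coefficient(volume, Z, ASSUMED_COMPLEX_MASS_DA)
solvent = solvent_fraction(vm)

print(f"V = {volume:.0f} A^3, Vm = {vm:.2f} A^3/Da, solvent = {solvent:.0%}")
```

With the assumed mass, the Matthews coefficient comes out near 2.6 Å3 Da−1 (roughly half solvent), which lies within the range commonly observed for protein crystals.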
The structure was solved by molecular replacement (MR) with AMoRe (Navaza, 1994) using a structure of HLA-A*6801 without peptide or water molecules (PDB code 1hsb; Guo et al., 1992) as the search model. The MR solution was subsequently fed into the program ARP/wARP (Perrakis et al., 2001) for removal of model bias by iterative rebuilding of the entire molecule, including the peptide. All subsequent refinement cycles included refinement of atomic positions and B factors and were carried out with SHELXL (Sheldrick & Schneider, 1997), with 5% of the data reserved for calculation of Rfree. Manual rebuilding and adjustments of the model during refinement were carried out using O (Jones et al., 1991). Details of the refinement process are listed in Table 2. The model was refined with riding H atoms and restrained anisotropic B factors for all non-H atoms.
No B-factor cutoff was used during the last rounds of water refinement (a cutoff of 60 Å² was applied in previous rounds). Rather, all water molecules with clear 2Fo − Fc density at 1σ and hydrogen-bonding partners within 3.5 Å were retained. The final model includes 274 of the 275 residues of the extracellular domain of the α-chain, all 99 residues of β2m, the HBV peptide homologue (AIMPARFYPK), three sulfate ions and 543 water molecules. The following 12 residues were modelled in double conformations: Ser2α, Ser11α, Arg35α, Asn86α, Met98α, Met189α, Thr216α, Gln226α, Lys243α, Thr4β, Val27β and Ser55β.
Peptide-binding experiments were conducted according to standard protocols (Buus et al., 1995 ). Briefly, binding of peptide to the α-chain was conducted as a folding by dilution competition assay in the presence of peptide and β2m as previously described (Pedersen et al., 2001 ). A titration series of peptide was mixed with a fixed amount of α-chain, β2m and radioactive tracer peptide (125I-labelled). The reaction was incubated for at least 4 h at 291 K in order to obtain a steady state. The peptide binding was examined by Sephadex G25 spun column chromatography (Buus et al., 1995 ) and the radioactivity in the MHC complex was counted and compared with the total radioactivity. The two peptides (AIMPARFYPK and IMPARFYPK) used in peptide-binding experiments were purchased from Schafer-N (Copenhagen) as HPLC-purified preparations more than 95% pure.
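To illustrate how an IC50 value can be read from such a titration series, the sketch below interpolates the competitor concentration at which half of the tracer signal is lost. It is not the authors' analysis procedure. The data points are synthetic, generated from an idealized one-site competition model using the 31 nM value quoted in the abstract, and the function names are hypothetical.

```python
import math


def fraction_tracer_bound(competitor_nm, ic50_nm):
    """Idealized one-site competition: tracer signal relative to no competitor."""
    return 1.0 / (1.0 + competitor_nm / ic50_nm)


def estimate_ic50(concs_nm, fractions):
    """Find where the signal crosses 50% by linear interpolation in log10(conc)."""
    points = list(zip(concs_nm, fractions))
    for (c_lo, f_lo), (c_hi, f_hi) in zip(points, points[1:]):
        if f_lo >= 0.5 >= f_hi and f_lo != f_hi:
            t = (f_lo - 0.5) / (f_lo - f_hi)
            log_c = math.log10(c_lo) + t * (math.log10(c_hi) - math.log10(c_lo))
            return 10.0 ** log_c
    raise ValueError("titration series does not bracket 50% inhibition")


# SYNTHETIC titration series (nM); not measured data.
concs = [1.0, 10.0, 100.0, 1000.0, 10000.0]
fractions = [fraction_tracer_bound(c, ic50_nm=31.0) for c in concs]

ic50_estimate = estimate_ic50(concs, fractions)
print(f"Interpolated IC50 is about {ic50_estimate:.0f} nM")
```

In practice a full curve fit over all titration points would usually be preferred to two-point interpolation; the sketch only shows the principle.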
Liquid-chromatography mass spectrometry (LC–MS) was used to investigate the composition of the HBV peptide batch. The sample (1 mg ml−1 in water) was applied onto a C18 column controlled by a Hewlett Packard Series 1100 HPLC and eluted with a gradient of acetonitrile in water (0–50%) containing 0.02% trifluoroacetic acid (TFA). The eluting fractions were continuously monitored using electrospray ionization (ESI) mass spectrometry performed on a Bruker Esquire LC mass spectrometer.
The HLA-A*1101 complex was formed in the presence of ~87% pure IMPARFYPK peptide (see below) and crystallized. This peptide occurs naturally in hepatitis B virus DNA polymerase (residues 110–118) and binds to HLA-A*1101 (Sidney et al., 1996 ). Unexpectedly, it turned out that the crystals contained a peptide with an additional N-terminal alanine. The natural amino-acid residue in this position is a leucine (NCBI accession P03156), which makes the extended peptide a homologue of the natural sequence. The electron density of the peptide was very well defined and unambiguously showed the presence of the additional N-terminal residue (Fig. 1 ). Furthermore, the automatic and unbiased model-building program ARP/wARP implemented in CCP4i (Collaborative Computational Project, 1994 ; Potterton et al., 2003 ) assigned an extra residue to the N-terminus of the peptide, giving rise to the decamer peptide with sequence AIMPARFYPK. This finding prompted further investigation into the source of this peptide.
The most likely source of contaminating peptide species is the peptide stock used to generate the HLA-A*1101 complex. To analyze for the presence of peptide impurities, the peptide stock was subjected to mass spectrometry. This analysis showed an 87% content of IMPARFYPK and identified several minor contaminants (data not shown), of which one matched the AIMPARFYPK peptide and comprised approximately 1.4% of the stock. This N-terminally extended peptide was synthesized and tested for its binding to HLA-A*1101 in a peptide-binding assay and compared with the binding of the original HBV peptide (re-synthesized). The results are summarized in Table 3 and demonstrate similar IC50 values for the two peptides.
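The mass-based assignment of the contaminant can be illustrated by computing the expected monoisotopic masses of the two peptides. This is a sketch using standard monoisotopic residue masses; the observed masses are not reported in the text, and the function names are hypothetical.

```python
# Standard monoisotopic residue masses (Da) for the residues that occur in these peptides.
MONOISOTOPIC_RESIDUE_MASS = {
    "A": 71.03711,
    "F": 147.06841,
    "I": 113.08406,
    "K": 128.09496,
    "L": 113.08406,
    "M": 131.04049,
    "P": 97.05276,
    "R": 156.10111,
    "Y": 163.06333,
}
WATER = 18.01056   # added once per peptide for the free N- and C-termini
PROTON = 1.00728   # mass of the charge carrier in positive-mode ESI


def peptide_mass(sequence):
    """Monoisotopic mass of an unmodified linear peptide."""
    return sum(MONOISOTOPIC_RESIDUE_MASS[aa] for aa in sequence) + WATER


def mz(mass, charge):
    """m/z of the [M + nH]n+ ion."""
    return (mass + charge * PROTON) / charge


nonamer = peptide_mass("IMPARFYPK")
decamer = peptide_mass("AIMPARFYPK")
mass_difference = decamer - nonamer

for name, mass in (("IMPARFYPK", nonamer), ("AIMPARFYPK", decamer)):
    print(f"{name}: M = {mass:.3f} Da, [M+2H]2+ = {mz(mass, 2):.3f}")
print(f"Difference = {mass_difference:.3f} Da (one alanine residue)")
```

A contaminant heavier than the main species by the mass of one alanine residue (about 71.04 Da) is consistent with an N-terminally extended peptide, although mass alone does not establish the position of the extra residue.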
As the IC50 values of the IMPARFYPK and AIMPARFYPK peptides are very similar, the equilibrium conditions during folding of the HLA-A*1101 complex imply a relative distribution of the individual peptide complexes with HLA-A*1101 corresponding to that of the peptide stock solution. The fact that HLA-A*1101 crystallized with the extended peptide impurity could be a consequence of preferential crystallization of HLA-A*1101 in complex with the extended peptide. As the overall structure of MHC-I molecules is generally largely unaffected by changes in peptide identity, any special peptide preferences not dictated by the peptide–MHC interaction itself would have to arise from direct interactions between neighbouring complexes in the crystal. A few crystal-packing interactions between molecules in the present structure involve the outer edges of the peptide-binding groove, but only in the part holding the peptide C-terminus. Furthermore, only one weak interaction between the peptide (Phe7p) and a neighbouring β2m molecule (Lys19β) is found. A more likely explanation is therefore the occurrence of differential non-equilibrium stability under the crystallization and/or purification conditions, since the relative distribution of peptides in HLA-A*1101 would be affected if the complexes have different half-lives. In biological settings, similar non-equilibrium conditions occur at the cell surface and in general only long-lived complexes will be available in amounts sufficient for T-cell recognition.
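The equilibrium argument can be made concrete with a small calculation. The sketch below assumes that both peptides are in large excess over the α-chain, that binding has reached equilibrium, and that occupancy is then proportional to the stock fraction divided by the dissociation constant. The equal Kd values are hypothetical placeholders reflecting the similar IC50 values; the other minor contaminants in the stock are ignored.

```python
def relative_occupancy(stock_fraction, kd_nm):
    """Share of complexes holding each peptide, with weight = stock fraction / Kd."""
    weights = {name: stock_fraction[name] / kd_nm[name] for name in stock_fraction}
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


# Stock composition reported from the mass-spectrometric analysis.
stock = {"IMPARFYPK": 0.87, "AIMPARFYPK": 0.014}

# HYPOTHETICAL: equal affinities, standing in for the similar IC50 values.
kd = {"IMPARFYPK": 30.0, "AIMPARFYPK": 30.0}

occupancy = relative_occupancy(stock, kd)
for name, share in occupancy.items():
    print(f"{name}: {share:.1%} of complexes")
```

Under these assumptions the extended peptide would account for less than 2% of the complexes at equilibrium, which is why an explanation beyond simple equilibrium binding is needed for its presence in the crystal.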
The structure of HLA-A*1101 was solved to a resolution of 1.60 Å with Rcryst and Rfree values of 14.8 and 21.2%, respectively (see Tables 1 and 2 and Fig. 2). The protein complex is complete except for the C-terminal residue (Glu275α) of the α-chain, which could not be modelled owing to insufficient electron density. The largest positive peak in the Fo − Fc map is found in this region next to Trp274α and has a peak density of 0.44 e Å−3 (7.9σ, σ = 0.055 e Å−3). All other positive peaks are below 0.25 e Å−3. The largest negative peak (−0.42 e Å−3, −7.6σ) is centred on the S atom of sulfate ion 4002. This is most likely to be the consequence of an underestimation (hard restraint) of the anisotropic motion of this ion and/or incomplete occupancy. The occupancy was not refined for any of the sulfate ions as it couples very strongly with the B factor. In addition to the protein complex and three sulfate ions, the structure includes 543 water molecules.
The AIMPARFYPK peptide is bound to the peptide-binding groove of the α-chain through 25 hydrogen bonds, 12 of which are mediated by water molecules (Table 4 ). All hydrogen bonds are to the peptide backbone, except for the binding of the amino group of the C-terminal lysine side chain to the backbone carbonyl O atom of Asp116α. Not all potential hydrogen-bonding acceptors of the peptide residues are used for binding (see Table 4 ). In particular, the central residues Met3p, Pro4p, Ala5p and Arg6p exhibit large mobility as judged from the B values (Fig. 1 ).
As would be expected (see Fig. 3), Ala1p binds in the A pocket, Ile2p in the B pocket, Met3p in the D pocket and Lys10p in the F pocket, with the common hydrogen-bonding patterns at the N- and C-termini. Although Ala5p points towards the edge of the C pocket, it only makes contact with Asn66α. Arg6p rests on top of Gln155α between the D and E pockets and only makes a superficial contact with the binding groove. Phe7p rests against a hydrophobic patch on the upper part of the region between the C and F pockets. Peptides binding to HLA-A*1101 and HLA-A*6801 have previously been reported to make effective use of the E binding pocket using either the sixth (nonamer peptides) or seventh (decamer peptides) residue of the peptide (Li & Bouvier, 2004), which allows a relatively deep position of the peptide in the groove. The AIMPARFYPK peptide is positioned rather high in the groove even when compared with other A3 supertype molecules with decameric peptides bound (PDB codes 1qvo and 1tmc; see Figs. 2 and 4). This is a consequence of the bulky nature of the side chains of this particular peptide (i.e. residues Arg6p, Phe7p and Tyr8p), especially Phe7p, which apparently does not fit into the E pocket. Instead, Tyr8p occupies the upper part of the E pocket, leaving the lower part filled with water molecules. Pro9p is positioned in the region between the C, E and F pockets and its side chain primarily interacts with the side chain of Phe7p. In the absence of strong interactions between the central peptide residues and the peptide-binding groove, Arg6p interacts with Tyr8p and Phe7p with Pro9p, thereby stabilizing the peptide conformation.
Although the exposed bulging conformation of the AIMPARFYPK peptide in the binding groove approaches that of peptides of 11 residues (see Fig. 4b), it is neither unusual for a decameric peptide nor does it constitute a general problem for immunogenicity (Burrows et al., 2006), since binding as well as TCR recognition of peptides up to 13 residues in length has been reported (Speir et al., 2001; Tynan, Borg et al., 2005; Tynan, Burrows et al., 2005). All residues from Pro4p to Pro9p are directly accessible for interaction with a T-cell receptor (see Fig. 3). Of note, the residue in position three (Met3p) is not accessible, a property also observed for the other A11 (e.g. PDB code 1qvo) and A68 (e.g. PDB code 1tmc) decamer complexes (Collins et al., 1995; Li & Bouvier, 2004). The N-terminal alanine residue is accessible for interaction, although very deeply buried in the groove. Molecular modelling indicates that there is plenty of room for this alanine to be replaced by a leucine without any significant rearrangements of either peptide or α-chain residues (not shown). Furthermore, T-cell reactivity data for HLA-A*0201 suggest that many T cells interact weakly or not at all with the first residue of the peptide (Hausmann et al., 1999; Lee et al., 2004). Taken together, this is likely to make the AIMPARFYPK peptide a good representative of the naturally occurring peptide LIMPARFYPK. In the context of vaccine design, the Met3p residue could be substituted with a polar residue such as glutamine or even a non-natural amino-acid residue to make more efficient use of the D pocket. This would enhance the overall binding properties of the peptide without affecting T-cell recognition and thus increase its potential as an HBV vaccine candidate.
PDB reference: HLA-A*1101 in complex with HBV peptide homologue, 2hn7, r2hn7sf
The authors would like to thank Anette Henriksen for assisting in data collection, Thomas Kofoed for assisting with the mass-spectrometric analysis, Kasper Lamberth for the peptide-binding measurements and Ole Kristensen and Osman Mirza for useful discussions and comments. This work was supported by Danish Medical Research Council grant 9601615, the Fifth Framework program of the European Commission (grant QLRT-1999-00173), the NIH (HHSN266200400025C) and the Danish Natural Science Research Foundation DANSYNC project.