The 3D structure of the major surface viral antigen, the hemagglutinin (HA), from the recent H1N1 pandemic influenza virus (A/Darwin/2001/2009) was determined to 2.8 Å resolution. The structure was used to analyze changes in the HA that have emerged during the first 11 months of the pandemic and have raised public health concerns. The receptor binding properties of this protein reveal a strict preference for human-type receptors.
The first influenza pandemic of the new century emerged in April 2009, when a new H1N1 influenza virus (H1N1pdm), found in patients in Mexico and the United States, spread rapidly across the world by human-to-human transmission, leading the World Health Organization to declare a global pandemic on June 11, 2009. The pandemic H1N1 virus (2009 H1N1) was unique in that it had a gene constellation from both North American and Eurasian swine lineages that had not been isolated previously in either swine or human populations. Phylogenetic and antigenic analysis of the hemagglutinin (HA) gene revealed it to be distinct from seasonal human H1N1 viruses but more similar to the classical North American swine lineage.
Ten months after the first viruses were isolated, the virus is still antigenically homogeneous. However, as the virus continues to circulate in the human population, its HA antigenic sites will continue to be targeted by antibody-mediated selection pressure. It is therefore important from a public health perspective to structurally characterize the hemagglutinin so that the research community has a template with which to visualize any changes affecting antigenicity or virulence that may emerge as this virus evolves. To this end, we have cloned, expressed and solved the structure of a pandemic H1 hemagglutinin by X-ray crystallography. The structure was used to analyze amino acid substitutions in the HA that have raised some concern during the last 11 months of global surveillance activities. The same protein was analyzed by glycan microarray and compared to seasonal and other pandemic variants. The results reveal a strict human-like receptor specificity.
Recombinant HA cloning and expression: Using a cloning strategy similar to that of previous studies, the HA ectodomain of the 2009 H1N1 pandemic influenza virus A/Texas/05/2009 (Accession: FJ966959) was codon optimized, synthesized and cloned into the baculovirus transfer vector pAcGP67-A (BD Biosciences, San Jose, CA) by Geneart AG, Germany. Constructs containing A/Ohio/7/2009 (Accession: FJ969535), A/Utah/20/2009 (GISAID Accession: EPI217204) and A/Darwin/2001/2009 (Accession: GQ243757) were generated by mutagenesis of the A/Texas/05/2009 clone (A/Ohio/7/2009: Ser203Thr/Val411Ile; A/Darwin/2001/2009: Ser203Thr/Arg205Lys/Val411Ile; A/Utah/20/2009: Asn156Asp/Gln293His) using the QuikChange Multi Site-Directed Mutagenesis Kit (Stratagene, CA). Seasonal H1N1 HA constructs were cloned into the same baculovirus transfer vector, pAcGP67-A. Transfection and virus amplification were carried out as described previously. Protein expressed from Trichoplusia ni (Hi5) cells (Invitrogen, Carlsbad, CA) in 10-stack CellSTACK™ culture chambers (Corning Inc., Corning, NY) was recovered from the culture supernatant, purified by metal affinity chromatography, and subjected to thrombin cleavage and gel filtration chromatography. Purified monomeric protein was buffer exchanged into 10 mM Tris-HCl, 50 mM NaCl, pH 8.0 and concentrated to 7.8 mg/ml for crystallization trials. At this stage, the protein sample still contained the additional plasmid-encoded residues at both the N terminus (ADPG) and the C terminus (SGRLVPR).
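The substitution shorthand used above (for example Ser203Thr/Val411Ile) can be handled programmatically when deriving variant sequences from a parent clone. The following is a minimal sketch, not the workflow used in this study: the function names `parse_mutations` and `apply_mutations` and the `first_residue_number` parameter are hypothetical, and the residue numbering of any real sequence would need to be matched to the numbering used in the text.

```python
import re

# Three-letter to one-letter codes for the 20 standard amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# One substitution: wild-type residue, position, replacement residue.
_MUTATION = re.compile(r"^([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2})$")


def parse_mutations(spec):
    """Parse 'Ser203Thr/Val411Ile' into a list of (wild, position, mutant) tuples."""
    parsed = []
    for token in spec.split("/"):
        match = _MUTATION.match(token.strip())
        if match is None:
            raise ValueError(f"unrecognised mutation: {token!r}")
        wild, position, mutant = match.groups()
        if wild not in THREE_TO_ONE or mutant not in THREE_TO_ONE:
            raise ValueError(f"unknown residue name in {token!r}")
        parsed.append((THREE_TO_ONE[wild], int(position), THREE_TO_ONE[mutant]))
    return parsed


def apply_mutations(sequence, mutations, first_residue_number=1):
    """Return a mutated copy of a one-letter sequence, checking each wild-type residue."""
    residues = list(sequence)
    for wild, position, mutant in mutations:
        index = position - first_residue_number
        if not 0 <= index < len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        if residues[index] != wild:
            raise ValueError(f"expected {wild} at {position}, found {residues[index]}")
        residues[index] = mutant
    return "".join(residues)


# Substitution sets quoted in the text, relative to the A/Texas/05/2009 clone.
VARIANTS = {
    "A/Ohio/7/2009": "Ser203Thr/Val411Ile",
    "A/Darwin/2001/2009": "Ser203Thr/Arg205Lys/Val411Ile",
    "A/Utah/20/2009": "Asn156Asp/Gln293His",
}

if __name__ == "__main__":
    for strain, spec in VARIANTS.items():
        print(strain, parse_mutations(spec))
```

Checking the wild-type residue before substituting is a cheap guard against numbering offsets between the construct and the published numbering scheme.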
Crystallization and data collection: Initial crystallization trials were set up using a Topaz™ Free Interface Diffusion (FID) Crystallizer system (Fluidigm Corporation, San Francisco, CA). Crystals were observed in conditions containing various molecular weights of PEG polymer. Following optimization, diffraction quality crystals for Darwin09 were obtained at 20 °C using a modified method for ‘microbatch under oil’, by mixing the protein with reservoir solution containing 22% PEG2000MME, 0.1 M HEPES at pH 7.5. Crystals were flash-cooled to 100 K, and data were collected at 100 K at the Advanced Photon Source (APS) beamline 22-BM and processed with the DENZO-SCALEPACK suite. The data were indexed in space group P1 with unit cell dimensions a=73.98 Å, b=109.71 Å, c=129.90 Å; α=86.25°, β=74.68°, γ=75.10°. Statistics for data collection are presented in Table 1.
Structure determination and refinement: The structure of Darwin09 was determined by molecular replacement with Phaser using the HA structure from A/Japan/305/1957, pdb:3KU3 (HA1, 55% identity; HA2, 82% identity) as the search model. Six hemagglutinin monomers occupy the asymmetric unit, making one non-crystallographic trimer, related by a non-crystallographic 3-fold axis, and three monomers that form one-third and two-thirds of two crystallographic trimers. The estimated solvent content is 55%, based on a Matthews’ coefficient (Vm) of 2.75 Å³/Da. Rigid body refinement of the trimer led to an overall R/Rfree of 48.1%/48.6%. The model was then “mutated” to the correct sequence and rebuilt in Coot, and the protein structures were refined with REFMAC using TLS refinement. The final models were assessed using MolProbity. Statistics for data processing and refinement are presented in Table 1.
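The relationship between the unit cell, the Matthews’ coefficient and the solvent content quoted above can be checked with a few lines of arithmetic. The sketch below assumes the standard general (triclinic) cell volume formula and Matthews’ approximation that the solvent fraction is 1 − 1.23/Vm. The monomer mass is not stated in the text, so the script only back-calculates the mass implied by the reported Vm rather than asserting one; all function names are illustrative.

```python
import math


def triclinic_cell_volume(a, b, c, alpha_deg, beta_deg, gamma_deg):
    """Volume (Å^3) of a general unit cell from edge lengths (Å) and angles (degrees)."""
    ca = math.cos(math.radians(alpha_deg))
    cb = math.cos(math.radians(beta_deg))
    cg = math.cos(math.radians(gamma_deg))
    return a * b * c * math.sqrt(1 - ca**2 - cb**2 - cg**2 + 2 * ca * cb * cg)


def matthews_coefficient(cell_volume, n_molecules, molecular_mass_da):
    """Vm (Å^3/Da): cell volume divided by the total protein mass in the cell."""
    return cell_volume / (n_molecules * molecular_mass_da)


def solvent_fraction(vm):
    """Matthews' estimate of the solvent fraction, 1 - 1.23/Vm."""
    return 1.0 - 1.23 / vm


# Cell parameters quoted in the text. In space group P1 the asymmetric unit
# is the whole unit cell, so the six monomers are the full cell contents.
volume = triclinic_cell_volume(73.98, 109.71, 129.90, 86.25, 74.68, 75.10)

# Assumption: the monomer mass is not given in the text, so derive the value
# implied by the reported Vm of 2.75 Å^3/Da and six monomers per cell.
implied_mass = volume / (6 * 2.75)

print(f"cell volume          = {volume:,.0f} Å^3")
print(f"implied monomer mass = {implied_mass:,.0f} Da")
print(f"solvent fraction     = {solvent_fraction(2.75):.2f}")
```

With Vm = 2.75 Å³/Da the estimate 1 − 1.23/2.75 ≈ 0.55 agrees with the 55% solvent content reported above.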
Table 1 Data collection and refinement statistics.
* Numbers in parentheses refer to the highest resolution shell. # Reference 
Glycan microarray analysis: Microarray printing and recombinant HA analyses have been described previously. Imprinted slides produced specifically for influenza research for the CDC using the CFG glycan library (CDC version 1 slides; see Table 2 for glycans used in these experiments) were used.
Table 2 Glycans covalently attached on the glycan microarray. Different categories of glycans on the array are color-coded in column 1 as follows: no color, sialic acid; blue, α2-3 sialosides; red, α2-6 sialosides; violet, mixed α2-3/α2-6 biantennaries; green, N-glycolylneuraminic acid-containing glycans; brown, α2-8 linked sialosides; pink, β2-6 linked as well as 9-O-acetylated sialic acids; grey, asialo glycans.
Neu5Ac = Sialic acid
Neu5Gc = N-glycolylneuraminic acid
OSO3 = sulfate; Gal = galactose
Fuc = fucose
GlcNAc = N-Acetyl-D-glucosamine
GalNAc = N-acetyl-D-galactosamine
Glc = D-glucose
Man = D-mannose
9NAc = 9-O-acetyl
Expression and purification: Recombinant HA protein from A/Darwin/2001/2009 (H1N1) (Darwin09) virus was expressed in a baculovirus expression system utilizing a thrombin site at the C-terminus of Darwin09 followed by a trimerizing sequence (foldon) from the bacteriophage T4 fibritin for generating functional trimers, and a His-Tag to aid purification. Although the protein was expressed as a trimer, only monomers were purified by gel filtration after foldon removal by the thrombin cleavage step. These monomers were stable: the protein stock maintained its monomeric state even after 8 weeks of storage at 4 °C (confirmed by dynamic light scattering analysis). Nevertheless, the monomers were still able to reform trimers in the crystal, as evidenced by the structure reported here.
Overall structure: Using X-ray crystallography, the structure of pandemic H1N1 HA from the Darwin09 virus was determined to 2.8 Å resolution (Table 1). The overall structure of Darwin09 is similar to other reported HA structures, with a globular head containing the RBS and vestigial esterase domain, and a membrane proximal domain with its distinctive central helical stalk and HA1/HA2 cleavage site (Figure 1A). We selected representative HAs from human pandemic subtypes for structural analysis. In these comparisons, Darwin09 HA was found to be structurally very similar to the 1918 pandemic HA and to the HA of H5N1, a virus with pandemic potential. Although the Darwin09 HA2 domain was closely related to the HA2 domains of the other HAs in the analysis (swine H1, H2 and H3 subtypes), the HA1 domains were more divergent (Table 3).
Table 3. Comparison of r.m.s.d. (Å) for HA1 and HA2 domains. To analyze differences in the overall structure, r.m.s.d. values were calculated between monomers or domains of different pandemic and pandemic potential HAs, after the Cα atoms of the HA2 domains were superposed by sequence and structural alignment onto the equivalent domains of Darwin09.
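For readers unfamiliar with the r.m.s.d. measure, the calculation applied to paired Cα positions is short enough to sketch. This illustration assumes the two coordinate sets have already been superposed and paired by alignment, as described in the caption; it does not perform the superposition itself, and the toy coordinates are hypothetical rather than taken from any deposited structure.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (Å) between two paired lists of (x, y, z) tuples."""
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must pair one-to-one")
    if not coords_a:
        raise ValueError("need at least one atom pair")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared distance between the paired atoms.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical toy coordinates: three atoms on a line, and a copy displaced
# by exactly 1 Å along y, so every pair is 1 Å apart.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
shifted = [(x, y + 1.0, z) for x, y, z in reference]

print(rmsd(reference, shifted))
```

Because the value depends on which atoms are superposed, superposing on HA2 and then reporting HA1 separately, as in Table 3, highlights relative domain displacement as well as local differences.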
Although six asparagine-linked glycosylation sequons are present in the Darwin09 HA monomer, interpretable electron density was observed at only three sites in HA1: Asn23, Asn87 and Asn276. At these sites, only one or two N-acetylglucosamines could be interpreted. Compared to recent seasonal HAs, potential glycosylation sites in the HA1 of the pandemic HA are in comparable positions (Figure 1B and C). Position 87 in the pandemic HA is also a glycosylation site in seasonal HAs and has been a conserved feature since 1918. On recent H1 HAs, a second site, at Asn54, is in very close proximity to Asn87, and it is not known whether both sites are occupied. Similarly, the pandemic HA has two potential glycosylation sites, at positions 276 and 286, that are close together at the bottom of the HA1. However, no conclusions can be drawn from this structure with respect to double occupancy at these positions, since density was observed only at position 276, in two of the six chains in the asymmetric unit.
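Potential N-linked glycosylation sites of the kind discussed above are conventionally located by scanning a sequence for the sequon Asn-X-Ser/Thr, where X is any residue except proline. The sketch below shows one way to do this with a regular expression; the function name `find_sequons`, the `first_residue_number` parameter and the toy peptide are all hypothetical, and the peptide is not an HA sequence.

```python
import re

# N-X-S/T with X != Pro. The lookahead consumes only the Asn, so sequons
# that overlap (e.g. N-N-S-T) are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(sequence, first_residue_number=1):
    """Return residue numbers of Asn residues that begin an N-X-S/T sequon."""
    return [
        match.start() + first_residue_number
        for match in SEQUON.finditer(sequence.upper())
    ]


# Hypothetical toy peptide: N-G-T matches, N-P-S is excluded by the proline,
# and N-N-S-T contains two overlapping sequons.
toy_peptide = "MKNGTAANPSLNNSTQ"
print(find_sequons(toy_peptide))
```

A sequon only marks a potential site. As the structure shows, whether a glycan is attached and ordered enough to be seen in the electron density has to be established experimentally.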
The receptor binding domain: The receptor-binding site (RBS) is at the membrane distal end of each HA monomer, and its specificity for sialic acid, together with the nature of the sialic acid linkage to a vicinal galactose residue, determines host range restriction. As in other HA structures, the Darwin09 RBS is composed of three structural elements, a 190-helix (residues 184-191), a 220-loop (residues 218-225) and a 130-loop (residues 131-135), while other highly conserved residues (Tyr91, Trp150, His180 and Tyr192) form the base of the pocket (Figure 1D).
Interestingly, previously published research highlighted dual receptor specificity for the early pandemic viruses. Using carbohydrate microarray analysis, the authors observed mixed α2-3/α2-6 receptor specificity with two pandemic viruses (California/4/2009 and Hamburg/5/2009), while a seasonal H1N1 virus bound exclusively to α2-6-linked sialosides. These microarray platforms can also be probed with recombinant HA. Pre-complexing the trimers with primary and secondary antibodies increases the valency and so overcomes the low affinity of HA for its glycan ligand. Results using recombinant HA revealed a strict preference for five human-type sialyl-glycans, with no significant binding to avian α2-3 receptor analogs. All pandemic recombinant HAs bound to an α2-6 sialylated tri-N-acetyllactosamine glycan in which the two proximal (reducing end) lactosamines are α1-3 fucosylated (glycan #57 in Table 2) as well as to a structurally related long linear α2-6 sialylated di-N-acetyllactosamine (Figure 2, glycan #56). These glycans were detected in N-glycans of cultured human bronchial epithelial cells. Two other structurally diverse glycans, an α2-6 sialylated-sulfated N-acetyllactosamine structure (glycan #41) and the α2-6 sialylated LacNAc (glycans #53 & 54), were also recognized by these HAs. In addition, the proteins in this study bound weakly to α2-6 sialylated bi-antennary glycans (glycans #46-48), which are typically found on membrane glycoproteins. These results were comparable to those for the two seasonal HAs used in the analysis (A/Solomon Islands/3/2006 and A/Brisbane/59/2007, the H1N1 components of the 2007-2008, 2008-2009 and 2009-2010 trivalent vaccines), although good binding to the α2-6 sialylated bi-antennary glycans (glycans #46-48) was observed for the A/Solomon Islands/3/2006 recombinant HA. Thus, these pandemic viruses bind to human-type receptors, as shown and postulated by previous reports.
This strict specificity is in contrast to the Childs et al. report. However, these differences can be attributed to the different platforms used as well as to the increased valency of the virus, which might enhance binding to weak ligands.
Genetic and antigenic changes: Four antigenic sites have been identified for H1N1 virus HAs (Ca, Cb, Sa and Sb). In Darwin09, all except Ca are exposed for antibody recognition. The Ca site is proximal to the oligosaccharide at HA1 Asn87, which may interfere with antibody recognition of this region. In recent seasonal H1 HAs, the Sa site (and possibly Sb) appears to be affected by the presence of two glycosylation sites at positions 125 and 159 (Figure 1D). The lack of these sites in the pandemic HA exposes the entire top of the HA1 for targeting by the immune system, and this feature may explain why the antibody recall response to the pandemic vaccine in adults was so effective.
Since the pandemic virus first emerged, the majority of viruses have shared a Ser203Thr amino acid change in the HA. This position is near the monomer-monomer interface, and the small change in side chain appears not to have had a dramatic effect on the HA structure. Introduction of the extra methyl group in the side chain may help to stabilize the loop region in its surrounding environment (Figure 1A). Currently, two circulating subsets of viruses have amino acid changes, Asp222Glu or Glu374Lys, in the HA. The Asp222Glu mutation is in the receptor-binding site and may modulate which glycans bind to the receptor (Figure 1A). The latter mutation, at position 374, is in the HA2 (residue 47) and points into the cavity where the fusion peptide resides in the mature, fusion-ready form of the HA molecule. Although this mutation may affect stability in this region (Figure 1A), it is also close to a region identified by two recent HA/neutralizing antibody structures in which the antibodies target the stem region of the HA. Little is known about the immune response to this region or about whether this mutation is able to modulate antibody binding.
Other HA mutations that affect antigenicity have also been observed, but most have been sporadic throughout the year, geographically separated, or the result of egg growth. In particular, changes at positions 153-157 in the HA have been associated with reduced HI titers with ferret antisera to the A/California/7/09 vaccine virus. In most (if not all) cases, these changes have emerged after virus propagation in cell cultures. The structure shows this region to be a prominent loop on the top left of the receptor binding site and a component of the Sa (H1) or Site B (H3) antigenic site (Figure 1A and 1B). In the pandemic H1 HA, this region is exposed to the host immune system and not masked by vicinal glycosylation sites. Although this region is known to affect antigenicity, it does not appear to change receptor binding, as shown by the glycan microarray results for A/Utah/20/2009, which has an Asn156Asp change compared to the other pandemic virus HAs analyzed (Figure 2). Its ability to change easily also highlights this region as a potential ‘hot spot’ for future mutation as the human population gains immunity and the virus experiences increased pressure to evade the immune response.
More recently, attention has focused on a mutation at position 222 and its possible role in severe clinical outcome. The Asp222Gly and Asp222Asn single and mixed variants have been found in pandemic virus isolates as well as by direct sequencing of clinical specimens collected throughout the 2009 pandemic from approximately 20 countries, including Norway, Mexico, Ukraine and the USA. As described above, position 222 resides in the receptor binding site of the HA protein and may influence binding specificity. Indeed, the HA from the previous H1N1 pandemic in 1918 switched from avian to human receptor specificity through mutation at two positions (Glu187Asp and Gly222Asp). (The pandemic virus HA also has an Asp at position 187.) In addition, the A/New York/1/18 strain of the 1918 pandemic possessed a Gly at position 222, and this markedly affected receptor binding, reducing α2-6 preference and increasing weak α2-3 binding.
To address this question for the 2009 pandemic H1N1 virus, we mutated position 222 on the A/Texas/05/2009 HA to produce variants with either an Asp222Gly or an Asp222Asn mutation. Interestingly, glycan microarray analysis of these mutants revealed an α2-6 binding profile (Figure 3A and 3B) similar to that of the wild-type A/Texas/05/2009 recombinant HA (Figure 2). However, these mutants also bound weakly to sulfated α2-3 sialylglycans (glycans #4-8) as well as to α2-3 and α2-3/α2-6 di-sialoside structures (glycans #9 & 10). It is currently unknown whether viruses carrying the same mutations will show the same profile on the glycan microarray, or whether the increased valency of the virus, due to the larger number of HAs on the virus surface, will enhance this weak binding. Thus, on the current pandemic HA framework, the effect of these mutations at position 222 on receptor binding appears less dramatic than on the 1918 framework, since the binding preference for α2-6 sialylglycans is still maintained. Analysis of the RBS of Darwin09 offers a possible reason. The galactose of α2-6-linked receptors can interact via its 3- and 2-hydroxyls through a hydrogen bond network involving residues Lys219, Asp222 and Glu224. A loss of Asp222 through mutation might not compromise this network to the same extent as was seen in the 1918 HA framework when the Asp225Gly mutation was introduced.
Although a number of mutations have been reported in circulating pandemic H1N1 viruses, they have not affected virus antigenicity or pathogenicity. The use of the Darwin09 structure to analyze the interactions of these HAs with virus receptors highlights the importance of having structural information to aid such analysis. The expression system used here also provides an important route for the safe production of these pandemic proteins on a large scale. Availability of recombinant protein enables its use in downstream applications such as glycan microarray analysis, as described here, as a reagent for diagnostic kit development, or as an antigen for antibody production. Without this methodology, HA production from the virus would have been difficult at the start of the pandemic because of stringent biosafety requirements. Rapid determination and dissemination of the pandemic H1N1 hemagglutinin 3D structure and characterization of its receptor specificity should enable the medical and public health research community to develop improved intervention approaches to control and prevent influenza morbidity and mortality as this virus becomes endemic in human populations.
Use of the Advanced Photon Source at Argonne National Laboratory was supported by the U. S. Department of Energy, Office of Science, Office of Basic Energy Sciences, under Contract No. DE-AC02-06CH11357. The authors would like to thank the staff of SER-CAT sector 22 for their help with data collection and Ruben Donis (CDC) for help and advice during the project and preparation of this manuscript. The atomic coordinates and structure factors of the HA for Darwin/2001/2009 are available from the RCSB PDB under accession code 3M6S. Glycan microarray data presented here will be made available on-line through the Consortium for Functional Glycomics web site upon publication (www.functionalglycomics.org). The Glycan Microarray was produced for the Centers for Disease Control and Prevention using a glycan library generously provided by the Consortium for Functional Glycomics funded by National Institute of General Medical Sciences Grant GM62116. The findings and conclusions in this report are those of the authors and do not necessarily represent the views of the Centers for Disease Control and Prevention or the Agency for Toxic Substances and Disease Registry.
Research was funded by the Centers for Disease Control and Prevention.
"The authors have declared that no competing interests exist."