Current knowledge of molecules involved in immunology and allergic disease results from significant contributions of X-ray crystallography, a discipline that just celebrated its 100th anniversary. The histories of allergens and X-ray crystallography are intimately intertwined. The first enzyme structure to be determined was lysozyme, also known as the chicken food allergen Gal d 4. Crystallography determines the exact three-dimensional positions of atoms in molecules. Structures of molecular complexes in the disciplines of immunology and allergy have revealed the atoms involved in molecular interactions and in mechanisms of disease. These complexes include peptides presented by MHC class II molecules, cytokines bound to their receptors, allergen-antibody complexes, and innate immune receptors with their ligands. The information derived from crystallographic studies provides insights into the function of molecules. Allergen function is one of the determinants of environmental exposure, which is essential for IgE sensitization. Proteolytic activity of allergens or their capacity to bind lipopolysaccharides may also contribute to allergenicity. The atomic positions define the molecular surface that is accessible to antibodies. This surface in turn determines antibody specificity and cross-reactivity that are important factors for the selection of allergen panels used for molecular diagnosis and for the interpretation of clinical symptoms. This review celebrates the contributions of X-ray crystallography to clinical immunology and allergy, focusing on new molecular perspectives that influence the diagnosis and treatment of allergic diseases.
In 1912, the German physicist Max von Laue published a demonstration of X-ray diffraction from a crystal of copper sulfate pentahydrate.1 This pioneering event led to a Nobel Prize two years later and marked the beginning of X-ray crystallography. The discipline celebrated its 100th anniversary in 2014. Crystallography started in the hands of physicists with studies of non-biological molecules, and extended to the fields of chemistry, biology, and medicine. Crystal structures define the spatial location of atoms and the folding of macromolecules involved in biological processes. The first X-ray diffraction data from a small protein, pepsin, were collected by Bernal and Crowfoot in 1934 (Fig. E1A).2 Myoglobin and hemoglobin were the first protein structures to be determined, by John Kendrew in 1958 and by Max Perutz in 1960, respectively.3,4 In 1965, David Phillips determined the first enzyme structure, that of hen egg-white lysozyme, which happens to be the chicken food allergen Gal d 4.5 With the advent of more powerful X-ray sources (such as synchrotrons), new detectors, new experimental protocols (such as reduction of radiation damage by cryo-cooling) and the development of modern software, the number of macromolecular structures has increased exponentially (Fig. E2).6,7 The bulk of macromolecular diffraction data collection has moved from in-house facilities to high-flux sources, mainly at synchrotrons. Nowadays, the Cambridge Structural Database contains more than 750,000 structures of organic molecules, mostly determined by X-ray crystallography, and the Protein Data Bank (PDB) contains over 105,000 structures of biomacromolecules, almost 90% of which were determined by X-ray diffraction.8 This review is a tribute to the contributions of X-ray crystallography to clinical immunology and allergy.
Notable achievements include the determination of the three-dimensional structure of allergens, antibodies, receptors, and other molecules involved in immunological processes and allergic disease. The review also highlights findings derived from structural studies, which revealed mechanisms that contribute to allergic sensitization.
The determination of an X-ray crystal structure starts with the purification of a sufficient amount of highly concentrated and homogeneous protein (Fig. 1). Several factors that contribute to molecular variability need to be taken into consideration, including molecular variants, glycosylation, proteolysis, fragmentation, aggregation, precipitation, and/or molecular flexibility (see details in Online Repository). The expression of recombinant allergens, either alone or as fusion proteins, has often proven successful for obtaining homogeneous protein preparations.9–12 Molecules naturally embedded in membranes, such as the histamine receptor, may require strategies involving protein solubilization.13 Procuring soluble, properly folded, and pure protein is the most promising way to obtain diffraction-quality crystals, and is the major bottleneck in the structure determination process. The difficulty in obtaining diffraction-quality crystals promotes the development of complementary technologies such as cryo-electron microscopy.14 Screening for optimal conditions for crystal growth is performed either manually or by robotics.15 Hundreds of conditions are tested by mixing the highly concentrated protein with different precipitants. A commonly used technique is based on “hanging” or “sitting” drops in multi-well plates, where crystals form by slowly concentrating the protein and the precipitant through vapor diffusion. Once crystals are obtained, the X-rays to probe them are produced either by rotating anode generators found in most crystallography labs or at synchrotron stations.
There are over 120 stations dedicated to macromolecular crystallography experiments worldwide.16 The diffraction pattern produced by exposing protein crystals to X-rays is analyzed to obtain an electron density map that is used for determination of a molecular model.17–19 The overall quality of structural models, as measured by various parameters used for structure validation, has significantly improved over time.20 Structure quality depends not only on the resolution of the collected diffraction data, but also on the processing of these data. Resolution is the smallest distance in angstroms (Å) between two atoms that can be shown to be separated (Fig. E3). The myoglobin structure determined in 1958 had a resolution of 6 Å. Nowadays, resolutions as high as 0.48 Å can be obtained.21 Crystallography has evolved from manual determination of molecular models to the use of hardware and software that greatly enhance the efficiency and quality of data collection, data reduction, model building and refinement. These transformative developments allow rapid data processing, model construction and refinement (optimization).22,23 Sophisticated software, such as HKL-3000, has enabled the determination of structures with “speed and finesse”.22 Indeed, a process that used to take years or months is now performed in days, or in hours in optimal cases (Fig. E2, Online Repository). The productivity of synchrotron stations, measured by structures determined, varies significantly and mostly depends on experimental protocols.24 There is no correlation between beamline productivity and any aspect of the physical setup of data collection hardware, beamline flux or number of crystals used for diffraction. Frequent reports show that only 1–5 minutes of data collection time are needed to generate an entire data set sufficient for a good structure determination.25 Even so, the most productive synchrotron stations still need roughly 20 hours of data collection time to produce one structure.
There is still room for improvement of experimental protocols and software in the future.
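The resolution limit described above is tied to the diffraction geometry through Bragg's law, d = λ / (2 sin θ), a standard relation that the text itself does not spell out. The short Python sketch below is illustrative only: the function name `resolution_from_two_theta` and the example wavelength and angle are assumptions chosen for the example, not values taken from this review.

```python
import math


def resolution_from_two_theta(wavelength_angstrom: float, two_theta_deg: float) -> float:
    """Return the Bragg spacing d (in Å) for a reflection at scattering angle 2θ.

    Uses Bragg's law with n = 1: d = λ / (2 sin θ).
    """
    if wavelength_angstrom <= 0:
        raise ValueError("wavelength must be positive")
    if not 0 < two_theta_deg <= 180:
        raise ValueError("2θ must lie in the range (0, 180] degrees")
    theta = math.radians(two_theta_deg / 2.0)
    return wavelength_angstrom / (2.0 * math.sin(theta))


if __name__ == "__main__":
    # Hypothetical example: 1.0 Å X-rays, reflections recorded out to 2θ = 60°.
    print(f"{resolution_from_two_theta(1.0, 60.0):.2f} Å")
```

Reflections measured at wider angles correspond to smaller d, that is, to higher resolution; for a given wavelength the smallest spacing that can be reached is λ/2.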
Immunological processes result from the interaction of specialized molecules, whose structure defines their function. Selected examples involved in adaptive and innate immunity are shown in Fig. 2. The structure of MHC class II molecules that are expressed in vivo on the surface of antigen presenting cells clearly revealed a groove that binds the peptides generated from antigen processing and presents them to T-cell receptors (Fig. 2A). This action, together with co-stimulatory signals, leads to activation of T cells to release cytokines that interact with cytokine receptors on B cells (Fig. 2B). Crystal structures of ovalbumin (Gal d 2) peptides bound to mouse MHC class II, and a peptide from Cry j 1, a major allergen from the Japanese red cedar (Cryptomeria japonica), bound to HLA-DP5 also aid in our understanding of molecular recognition of allergens.26,27 The structural basis of the interaction of cytokines with their receptors has been investigated to better understand what induces the production of antibodies by B cells. Cytokine receptors of the interleukin-4/13 system result from the assembly of different subunits. Interestingly, a single subunit can be shared by different cytokine receptors (e.g. IL-4Rα is shared by IL-4/IL-4Rα/γc and IL-13/IL-4Rα/IL-13Rα1) (Fig. 2B). These molecular interactions determine mechanisms of cytokine action, and influence disease phenotype and response to treatment.28,29
IgE antibodies eventually bind to high affinity IgE receptors (FcεRI) on mast cells and basophils (Fig. 2C). X-ray crystal structures showed the flexibility of IgE-Fc (consisting of the Cε2, Cε3, and Cε4 domains). IgE-Fc adopts different conformations ranging from an acutely bent structure of the IgE-Fc when bound to the extracellular domains of the FcεRI α chain (Fig. 2C) to a diversity of conformations in solution, including a fully extended symmetrical one.30,31 The unique slow dissociation rate of IgE from FcεRI was attributed in part to these conformational changes. This observation provides a strategy for the design of asthma therapeutics using peptides that would disrupt the interaction between IgE and its high affinity IgE receptor.31
The low affinity IgE receptor (CD23, FcεRII) on B cells is a calcium-dependent molecule, with a C-type lectin domain, that contributes to the regulation of IgE levels. Calcium induces structural changes in CD23 that lead to additional interactions with IgE and a 30-fold increase in affinity for IgE (Fig. 2D).32 The crystal structure of IgE bound to CD23 revealed a mechanism of reciprocal allosteric inhibition with the high affinity IgE receptor.33
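To put the 30-fold affinity change in energetic terms, a fold increase in affinity can be converted to a change in binding free energy with the standard relation ΔΔG = −RT ln(fold). The sketch below is a back-of-the-envelope illustration only: the function name and the temperature of 298.15 K are assumptions, and the source reports only the fold change, not an energy.

```python
import math

GAS_CONSTANT_KCAL = 1.987204e-3  # gas constant R in kcal/(mol*K)


def binding_free_energy_change(fold_increase: float, temperature_k: float = 298.15) -> float:
    """Return ΔΔG in kcal/mol for a given fold increase in binding affinity.

    ΔΔG = -R * T * ln(fold); a negative value means tighter binding.
    The default temperature (25 °C) is an assumption for illustration.
    """
    if fold_increase <= 0:
        raise ValueError("fold increase must be positive")
    if temperature_k <= 0:
        raise ValueError("temperature must be positive (kelvin)")
    return -GAS_CONSTANT_KCAL * temperature_k * math.log(fold_increase)


if __name__ == "__main__":
    # 30-fold increase in affinity, as quoted in the text for CD23 with calcium.
    print(f"{binding_free_energy_change(30.0):.2f} kcal/mol")
```

Under these assumptions a 30-fold gain corresponds to roughly 2 kcal/mol of additional binding free energy, which is on the scale of a small number of extra non-covalent contacts.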
Structures of molecules involved in innate immunity have also been determined, including the extracellular domain of Toll-like receptors (TLR). The structures of TLR differ from the “immunoglobulin-like” extracellular domains of B- and T-cell receptors mentioned above. For all these receptors, only the structures of the extracellular domains have been determined. Ten TLR are known in humans and they are formed by an N-terminal recognition domain, a single transmembrane helix, and a C-terminal signaling domain. The N-terminal domain recognizes pathogen-derived compounds or endogenous molecules released by the host in response to infection.34 This domain adopts a typical horseshoe-shape, made of tandem copies of a motif known as the leucine-rich repeat (LRR). Its concave surface has the ligand binding capacity in most LRRs. Two extracellular domains form a dimer upon ligand binding and activate signaling. The structure of the Toll-like receptor 4, for example, has revealed the importance of its interaction with MD-2 and endotoxin for signaling activation (Fig. 2E).
Immunological processes that lead to the development of allergic disease are strongly associated with structural features of the allergen (Fig. 3A–G). Allergen exposure, critical to IgE sensitization, is determined in part by the function of the allergen and the degree of molecular stability required for the protein to become an allergen, both of which are defined by the allergen structure. To access the immune system, allergens must be either released to the environment (e.g. pollens, spores, dander or fecal particles), or made accessible by other paths (ingestion of foods, injection of venoms, or contact through skin or infection). The function of the allergen can facilitate exposure in different ways. Some allergens are released to the environment because of their reproductive function in pollen or spores. Others have a digestive function and are excreted in fecal particles. Additional factors determine exposure, such as the aerodynamic properties of particles that carry inhalant allergens. One of the first functions gleaned from the X-ray crystal structure of allergens was obtained in 1992, for the major urinary proteins from mouse and rat. Both Mus m 1 and Rat n 1, which were not known as allergens at that time, are pheromone transporters, and belong to the lipocalin family of proteins. Lipocalins consist of a β-barrel with a hydrophobic ligand binding cavity (Fig. 3A) and are secreted in tears, urine, sweat, or saliva, which facilitates exposure.35 Lipocalins are common mammalian inhalant allergens, also found in cockroach (Bla g 4) and in cow’s milk (food allergen Bos d 5).
The first three-dimensional structure of a clinically important allergen was that of Bet v 1, a pathogenesis-related protein from birch pollen (Fig. 3B).36 Bet v 1 is the most extensively studied allergen from pollen. The IgE prevalence in temperate climate areas of the northern hemisphere (e.g. Northern Europe) is high (>95%). Bet v 1 shows clinical cross-reactivity with homologous allergens from fruits and vegetables (e.g. apples, celery, carrot).37 A large number of variants (eighteen) have been identified in natural Bet v 1, sharing high amino acid sequence identity (~95%). Their nomenclature has recently been revised by the Allergen Nomenclature Sub-Committee from the World Health Organization and International Union of Immunological Societies (WHO/IUIS) (www.allergen.org).38,39 Bet v 1.0101 is the major component of natural Bet v 1 (>50%) and the main sensitizer, whereas other isoforms (Bet v 1.0401 and Bet v 1.1001) induce only minimal IgE antibody responses.40 Bet v 1.0101 and Bet v 1.0401 share the same fold, but differences in dimerization or aggregation could contribute to a decreased ability of the Bet v 1.0401 variant to elicit an allergic immune response. Interestingly, the fold of Bet v 1.0101 per se was demonstrated to be important for Th2 polarization and the induction of a strong IgE response, by comparison with an engineered folding variant.41
Nowadays, the Protein Data Bank (PDB) (www.rcsb.org) contains the three-dimensional structures of over 100 allergens, representing approximately 50 protein families, from approximately 800 allergens currently present in the WHO/IUIS official database of systematic Allergen Nomenclature (Tables E1 and E2, Online Repository). A database of allergen families (http://www.meduniwien.ac.at/allergens/allfam/) reports that allergens belong to a wide array of 186 protein families.42,43 The most frequent biochemical functions of allergens are hydrolysis of proteins, carbohydrate metabolism, binding of metal ions and lipids, storage and functions associated with the cytoskeleton.42 Measurement of biological activity using a specific functional assay is the best way to confirm function, but this option is not always available. Sequence homology to a protein of known function may also be insufficient.44 Determination of the tertiary structure can delineate allergen function by defining the overall shape of the molecule and/or revealing specific functional residues.45 The major dust mite allergens, Der p 1 and Der f 1, are cysteine proteases. Their catalytic site has been identified in the three-dimensional structures of the native allergens (Fig. 3F).46,47 In contrast, cockroach allergen Bla g 2 folds as a typical pepsin-like aspartic protease, but is devoid of aspartic protease activity due to specific substitutions in the catalytic site that were revealed by the crystal structure (Fig. 3E, E1A).48,49 In other cases, molecular flexibility confers a regulatory function, as seen in calcium-binding allergens commonly found in pollen and in troponin, which regulates muscle contraction (e.g. Bla g 6).50 The first three-dimensional structure of a representative of the 2 EF-hand allergen family was reported for Phl p 7 bound to calcium.51 Differences observed in IgE antibody binding depending on the allergen conformation suggest that conditions and conformation with optimal IgE reactivity should be selected for molecular diagnosis.
The storage peanut proteins Ara h 1 and Ara h 2 are food allergens with very different structural complexity. Ara h 1 is a trimer of 60 kDa bicupin-fold monomers, whereas Ara h 2 is a small α-helical protein, which is monomeric (17 kDa) (Fig. E4).11,52 It has been suggested that the quaternary structure of Ara h 1 may play a role in its allergenicity, by increasing molecular stability and preventing digestion of IgE antibody binding epitopes.53 However, there is no evidence that the differing tertiary structure of both allergens is responsible for differing allergenic potential. In fact, IgE antibody titers to Ara h 2, with a simpler structure, have been reported as the best predictor of peanut allergy.54–56 In general, three-dimensional structural complexity of allergens does not seem to be related to their allergenicity.
Finally, determination of the X-ray crystal structure of allergens has revealed the existence of new structural groups of proteins. Alt a 1, the major allergen from Alternaria alternata, has a unique β-barrel structure that forms a “butterfly-like” dimer and is exclusively found in the Dothideomycetes and Sordariomycetes classes of fungi (Fig. 3C).57 Cockroach allergen Bla g 1 has an α-helical structure so far only found in insects (Fig. 3D).12,58 While the three-dimensional structures of these and other allergens have been determined, their functions are still not well understood.10,57,59 Cat allergen Fel d 1 has a uteroglobin-like fold and consists of a dimer of heterodimers made exclusively of α-helices. Structures of recombinant Fel d 1 made by fusion of the monomers involved in each heterodimer have been determined.60,61 However, the native assembly of this major cat allergen remains unknown.
An interesting example of how an X-ray crystal structure revealed a function associated with allergenicity is illustrated by Der p 2. This dust mite allergen resembles MD-2, the lipopolysaccharide (LPS)-binding component of the Toll-like receptor (TLR) 4 signaling complex, which is involved in activation of the innate immune system. Both proteins have an immunoglobulin-like fold, formed only of β-sheets (Fig. E1B, E4A).62,63 The surprisingly high structural similarity prompted an investigation as to whether Der p 2 would have a similar function. Der p 2 was not only able to mimic the function of MD-2 in mouse models of experimental asthma, but was also able to reconstitute LPS-driven TLR4 signaling in the absence of MD-2, suggesting a possible role of Der p 2 in activation of innate immunity.64 Results obtained with Der p 2 are supported by data demonstrating that low-level stimulation of toll-like receptors drives Th2 immune responses.64–68 This discovery marked an important step in our understanding of possible innate immune pathways that lead to allergic sensitization. Until then, activation of the adaptive immune system was the main recognized path for developing allergy. Proteolytic function had been considered to contribute to allergenicity, by cleaving molecules involved in the immune response.69,70 Der p 1 can contribute to allergenicity via non-canonical pathways. Der p 1 was reported to directly promote IgE synthesis through cleavage of the low affinity IgE receptor (CD23) on B cells, and, indirectly, through cleavage of the α-subunit of the interleukin-2 (IL-2) receptor (CD25) in T cells.71,72 Der p 1 can also contribute to inflammation by inducing cytokine production and disrupting gap junctions in the lung epithelium, which would increase membrane permeability and facilitate transepithelial allergen delivery and processing.69,70,73,74
Structural studies have shown that an increasing number of allergens, belonging to different structural families, bind hydrophobic ligands, and these are potent stimulators of the innate immune system.65 The crystal structure of Bla g 1 provided the first clues that this protein might be a lipid-binding protein (Fig. 3D). Lipids in cockroach frass were identified as fatty acids that are known to activate TLR2 and TLR4.75 Some allergens contain lipids in internal cavities that are formed by either β-sheets (Der p 2, lipocalins) or α-helices (Fel d 1, Bla g 1) (Figures E1B, 3A, D).12,61,62,76 Other allergens, lacking internal cavities, may also bind lipidic ligands in different ways. Der p 5 consists of three-helical bundles that tend to form multimers (Fig. E4B). A large hydrophobic cavity formed by each dimer has the potential to bind lipidic ligands.10 Der p 7 has a similar structure to a LPS-binding protein involved in TLR4 activation, and to the surfactant allergen Equ c 4.9,77 The fold consists of two 4-stranded antiparallel β-sheets that wrap around a long C-terminal helix (Fig. E4C).9 Although unable to bind LPS, Der p 7 bound the lipopeptide polymyxin B.9 The specific contribution of allergen-associated lipids to allergenicity has been recently reviewed.78 The interactions of these lipids with innate immunity are open to further investigation.
The allergen structures determined by X-ray crystallography provide the basis for improved diagnosis and therapy. An undesired side-effect of immunotherapy is the induction of adverse reactions that may occur when administering increasing doses of the allergen during vaccination. To avoid these effects, hypoallergens with reduced IgE reactivity that preserve their capacity to induce T cell responses have been designed as candidates for vaccination. One strategy is the disruption of the overall allergen fold. This has proven efficacious with hypoallergenic chemically modified extracts (allergoids) that are successful for immunotherapy in Europe.79 Numerous variants of recombinant hypoallergens have also been designed, but only Bet v 1 variants have reached clinical trials where they have already shown promising results.80 Recently, a dose-ranging immunotherapy study of a new recombinant hypoallergenic folding variant of Bet v 1 showed efficacy in an environmental exposure chamber.81 Although knowledge of the three-dimensional structure of the allergen is not always necessary for the production of unfolded variants, the disruption of the overall allergen fold can be effectively designed based on the modification of specific structural features. For example, mite group 2 hypoallergens were produced by mutating cysteines involved in the formation of disulfide bonds that preserve the immunoglobulin-like structure.82,83 Another strategy to produce hypoallergens consists of modifying residues involved in IgE antibody binding, without affecting the fold of the allergen. In this case, knowledge of the three-dimensional structure of the allergen is required. The best way to precisely locate the epitope recognized by an antibody is the determination of the structure of the allergen in complex with the antibody. 
For example, the structure of Bet v 1 in complex with the Fab fragment of a monoclonal antibody (mAb) that interfered with IgE antibody binding identified a dominant epitope, also involved in IgE cross-reactivity with homologous allergens.84,85 The X-ray crystal structures of additional complexes revealed determinants of specificity and cross-reactivity.47,86–91 Bla g 2 and Der p 1 have been extensively analyzed through the structures of two and four allergen-antibody complexes, respectively (Fig. 3E, F).47,89–91 The structural basis of cross-reactivity was analyzed for mAb 4C1, which binds Der p 1 and Der f 1 in equivalent epitopes (Fig. 3F).47 It is not possible to obtain the mg amounts of pure native IgE antibodies required for X-ray crystallography, given their polyclonality and their low concentration in sera compared with IgG antibodies (0.05%). Therefore, recombinant IgE mAb, derived from combinatorial libraries from allergic patients, were used in complexes with bovine milk Bos d 5 (β-lactoglobulin) (Fig. 3G) and timothy grass pollen Phl p 2.87,88 This strategy, combined with site-directed mutagenesis, is a powerful tool to identify the main residues involved in IgE antibody recognition and to produce hypoallergens.47,85,92
Molecular allergy diagnosis has been shown to improve patient diagnosis compared to exclusively using clinical history and skin prick tests with allergen extracts.37,93–97 Structural analyses reveal relationships between homologous allergens and contribute to the design of panels of purified allergens for molecular diagnosis. Although a general rule suggests that cross-reactivity is likely among proteins that share a high degree of amino acid identity throughout the entire protein (>70%), and tends to be rare below 50% identity, exceptions do occur.98–102 Predictions of cross-reactivity based on the overall homology among allergens can only be used as guidelines, and lack precision. The three-dimensional structure of allergens allows identification of solvent-exposed residues responsible for antibody recognition.103 Ideally, the selection of representative molecules from highly cross-reactive groups of allergens, and the inclusion of allergens with species-specific epitopes, can simplify molecular diagnosis.104,105
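The identity-based rule of thumb quoted above can be written down as a small Python sketch. It is illustrative only and rests on several assumptions that are not from the source: the sequences are already aligned and of equal length, gapped columns are excluded from the denominator (other conventions exist), the function names are hypothetical, and the label "uncertain" for the 50–70% range is a placeholder, since the text gives no rule there. As the text stresses, such a classification is a guideline and not a prediction.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == gap or res_b == gap:
            continue  # skip gapped columns (an assumed convention)
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


def cross_reactivity_guideline(identity_pct: float) -> str:
    """Map percent identity onto the rule of thumb quoted in the text."""
    if identity_pct > 70.0:
        return "likely"
    if identity_pct < 50.0:
        return "rare"
    return "uncertain"  # placeholder label; the text gives no rule for 50-70%


if __name__ == "__main__":
    # Toy aligned sequences, not real allergen sequences.
    identity = percent_identity("ACDEFGHIKL", "ACDEFGHIKV")
    print(identity, cross_reactivity_guideline(identity))
```

Because antibodies recognize surface patches rather than whole sequences, two allergens with low overall identity can still share a conserved epitope, which is one reason the exceptions mentioned in the text occur.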
In the future, an increase in structure-function studies is expected from combining methodologies from the disciplines of medicine and structural biology. Cryo-electron microscopy will enable the study of larger complexes of antigens and antibodies without the need to crystallize them. The use of free-electron lasers as X-ray sources will allow investigations using nanocrystals and lower the chances of structural changes due to radiation damage.106,107 NMR, currently only rarely used for structural studies of antigen-antibody complexes, is expected to have a higher impact on ligand screening and dynamic studies. As more structures become available, the ability to create informative homology models for hypothesis-driven research will be enhanced. Tighter interactions between functional and structural studies are expected to have a greater impact on the drug discovery process. The creation of a ‘big picture’ and better reproducibility will be made possible by sophisticated database systems that will organize and analyze structural data in biomedical laboratories.108
X-ray crystallography has led to major contributions in clinical immunology and allergy. The three-dimensional structures of molecules involved in immunological processes and of allergens, alone or in complex with antibodies, have provided detailed information at the atomic level that reveals mechanisms of molecular interaction. Recombinant allergens are engineered to either preserve their native fold and amino acid composition, or to have specific residues and/or the overall fold modified for reduced IgE reactivity, which may diminish side-effects due to increasing allergen doses administered during therapy. Recombinant allergens have already shown promising results in immunotherapy clinical trials. The structural information is being used to increase the accuracy of diagnosis and to design new forms of immunotherapy that will complement and improve current approaches to treat allergic disease.
Research reported in this publication was supported by the National Institute of Allergy and Infectious Diseases of the National Institutes of Health under Award Number R01AI077653 (A.P. contact PI); by the Intramural Research Program of the NIH, National Cancer Institute, Center for Cancer Research; and by the Intramural Research Program of the NIH, National Institute of Environmental Health Sciences Research Project Number Z01-ES050147 (G.A.M) and ZIA ES102645 (L.C.P).
The authors thank Jill Glesner for her assistance in preparation of tables, and Dr. Susanna Keller for providing inspiring literature.
The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.