Summary: Following their discovery in the early 1970s, classical human leukocyte antigen (HLA) loci have been the prototypical candidates for genetic susceptibility to infectious disease. Indeed, the original hypothesis for the extreme variability observed at HLA loci (H-2 in mice) was the major selective pressure from infectious diseases. Now that both the human genome and the molecular basis of innate and acquired immunity are understood in greater detail, do the classical HLA loci still stand out as major genes that determine susceptibility to infectious disease? This review looks afresh at the evidence supporting a role for classical HLA loci in susceptibility to infectious disease, examines the limitations of data reported to date, and discusses current advances in methodology and technology that will potentially lead to greater understanding of their role in infectious diseases in the future.
Infection is one of the leading causes of human mortality and morbidity, with much of the burden falling on children (27). Infectious diseases are a major selective pressure (60, 147, 161, 171), and genes involved in the immune response are the most numerous and the most diverse in the human genome (115), reflecting the evolutionary advantages of a diverse immunological response to a wide range of infectious pathogens (14). Following their discovery in the 1970s (159), classical human leukocyte antigen (HLA) loci have stood out as the leading candidates for infectious disease susceptibility. The original hypothesis to account for the extreme variability observed at classical HLA (H-2 in mice) loci, proposed by Zinkernagel and Doherty (52, 181, 182), was that the major selective force was from infectious diseases, particularly viral infections. Susceptibility to infection and many other human diseases (including diabetes and ischemic heart disease) arises through the complex interaction of environmental and host genetic factors. In general, many genetic loci make modest contributions to human disease susceptibility (i.e., these diseases are genetically “complex”), and much of the recent focus in the field has been on the identification of functional variants at these loci and their effects in infection and in other conditions. Now that the complexity of the human genome is understood in greater detail and the molecular basis of innate and acquired immunity is becoming clearer, it is reasonable to ask whether the classical HLA loci still stand out as major genes that determine susceptibility to infectious disease.
Table 1 and Tables S1 to S8 in the supplemental material indicate that there are currently many more papers reporting positive genetic associations between classical HLA loci and major infectious diseases (human immunodeficiency virus [HIV]/AIDS, hepatitis, leprosy, tuberculosis, malaria, leishmaniasis, and schistosomiasis) than reporting negative results. However, there are many caveats to the interpretation of these data, not the least of which are publication bias (positive results are more likely than negative results to be accepted for publication), heterogeneity in clinical phenotypes and populations studied, and small sample sizes. Here we put some of these infectious disease studies into the context of the postgenomic era, concentrating on examples of viral (HIV/AIDS and hepatitis B and C viruses), bacterial (leprosy and tuberculosis), and parasitic (malaria, leishmaniasis, and schistosomiasis) infections, for which large numbers of studies indicate that HLA association may be real (Table 1) and among which interesting functional comparisons can be made. We also consider the influence that classical HLA molecules have on vaccine design and in determining responses to vaccines and some therapies against infection. Finally, we suggest some guidelines for future studies that are necessary to evaluate more completely the role of HLA in infectious disease.
The classical HLA loci are the class I (HLA-A, -B, -C, -E, -F, and -G) and class II (HLA-DR, -DQ, -DM, and -DP) molecules identified for their role in presentation of antigen to CD8+ and CD4+ T cells, respectively. They are encoded by a 4-Mb region of human chromosome 6p21 that is recognized as the most variable region in the human genome (72) and which has been designated the major histocompatibility complex (MHC) because of the important role played by class I and class II molecules in recognition of self versus nonself. Genes within the MHC play key roles in transplant immunity, as well as in susceptibility to a large range of diseases (see, for example, Table 1 in reference 88), including many autoimmune and infectious diseases. While their role in presentation of antigen to T cells initially set these molecules firmly as initiating (effectors) and maintaining (memory) acquired T-cell immunity, it became apparent that class I molecules could also be recognized by killer cell immunoglobulin-like receptor (KIR) molecules on the surface of natural killer (NK) cells (88). This created an additional role for classical HLA molecules in driving the innate immune response. Engagement of KIRs by class I molecules provides positive or negative signals to natural killer cells through, respectively, immunoreceptor tyrosine-based activating motifs (ITAM) or immunoreceptor tyrosine-based inhibitory motifs (ITIM) that reside within KIR molecules. Although HLA and KIRs are located on different chromosomes and are therefore inherited independently, both are highly polymorphic, and there is great potential for certain combinations of class I and KIR alleles to result in beneficial or deleterious interactions. Such gene-gene interaction is known as epistasis (88).
As more details of the MHC emerged, many other genes whose products regulate aspects of innate immunity, notably complement factors C2 and C4B, tumor necrosis factor (TNF) (gene TNF), and lymphotoxin-α (gene LTA), were also found to lie within the MHC. This led to subdivision of the MHC into class I, class III, and class II regions (moving from telomere to centromere on 6p21 [Fig. 1]). The first complete sequence and gene map of the MHC derived from a single homozygous individual was published in 1999 (119a). The latest annotations based on sequence analysis across eight different homozygous MHC haplotypes (72) identified >44,000 variations, both single-nucleotide polymorphisms (SNPs) and insertion/deletions (indels), and confirmed the presence of >300 loci, including >160 protein-coding genes (Fig. 1). Across the region, the average SNP density varies from 1 to >60 SNPs per kb; the highly polymorphic regions are mainly in the class I and class II genes.
The location of genes encoding classical HLA molecules in close proximity to so many other immune-related genes poses huge problems in trying to tease out the critical genetic/functional variants that might play a role in complex diseases such as autoimmune and infectious diseases. Hence, although numerous studies have demonstrated an association between HLA alleles and disease susceptibility, interpretation of the data is confounded by the strong correlation (linkage disequilibrium [LD] [see glossary of terms in the supplemental material; for additional useful information, see references 30, 32, 103, and 104]) between alleles at neighboring HLA and non-HLA genes. Recognizing this problem, the MHC Haplotype Consortium was formed in 2000 in an attempt to determine fully informative polymorphism and haplotype maps and to make these publicly available as a resource for the study of MHC-linked diseases (http://www.sanger.ac.uk/HGP/Chr6/MHC/). Incremental data and tools released (6, 73, 119a, 155, 166) have contributed toward the construction of a high-resolution LD map and a first generation of HLA tag SNPs (see glossary of terms in the supplemental material) (48). One objective was the possibility of replacing traditional HLA typing techniques with tag SNPs, as high levels of LD between HLA alleles and SNPs/indels in non-HLA genes may make the latter informative about the former, an approach used successfully elsewhere in the genome (12, 40, 63). However, noticeable differences occur between the reference populations, i.e., African (YRI), European (CEU), Chinese (CHB), and Japanese (JPT). The inferred haplotype structure across the region shows that LD is systematically higher in CEU, CHB, and JPT samples than in YRI samples. 
However, this generalization is not true of all specific HLA alleles; for example, the HLA-C*0702 allele has many SNPs in moderate to strong LD in YRI and CEU extending over several Mb, whereas in CHB and JPT, strongly associated SNPs are only found within 50 kb of the gene. In contrast, HLA-C*0304 is not strongly associated with any single SNP in any of the four populations. This means that although tagging of certain common HLA alleles may be straightforward in some populations, tag SNPs will differ between populations, and pairwise associations to a single tag SNP are unlikely to be sufficiently informative. Instead, haplotypes of multiple SNPs are needed to tag a specific HLA allele. In practice, genotyping of a large number of SNPs across the MHC, for example, using SNP chip technology, is likely to provide useful information to help localize disease susceptibility loci within the MHC. However, this will not necessarily provide sufficient information to make an association with a particular functional variant in an HLA molecule itself. This means that uncovering the molecular basis of disease for classical class I and class II loci, i.e., the specific (single) amino acid difference(s) that may alter antigen recognition (15, 49, 179), will almost certainly require specific sequence-based typing of specific HLA alleles. This poses the additional problem of sequence-based genotyping in large numbers of individuals, which may be facilitated by the development of software (available at https://sourceforge.net/projects/skdm/) that can test for HLA allele and amino acid differences between two populations (84). The software examines zygosity and tests for strongest association, interaction, and LD among amino acid epitopes. This and other in silico solutions are likely to improve standardization between studies and may provide the basis for future detailed analysis of HLA and disease.
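The strength of LD between an HLA allele and a candidate tag SNP is commonly summarized with the pairwise statistics D, D′, and r². The following is a minimal sketch of that calculation, assuming phased haplotype counts are available; the function name and the counts in the usage note are hypothetical and are not taken from the studies cited above.

```python
def ld_stats(n_both, n_hla_only, n_snp_only, n_neither):
    """Return (D, D_prime, r_squared) for an HLA allele and a SNP allele.

    All four arguments are hypothetical phased haplotype counts:
      n_both     - haplotypes carrying the HLA allele and the SNP allele
      n_hla_only - haplotypes carrying the HLA allele but not the SNP allele
      n_snp_only - haplotypes carrying the SNP allele but not the HLA allele
      n_neither  - haplotypes carrying neither allele
    """
    total = n_both + n_hla_only + n_snp_only + n_neither
    p_hla = (n_both + n_hla_only) / total   # frequency of the HLA allele
    p_snp = (n_both + n_snp_only) / total   # frequency of the SNP allele
    p_both = n_both / total                 # frequency of the joint haplotype

    # D: deviation of the joint frequency from independence.
    d = p_both - p_hla * p_snp

    # D' rescales |D| by its maximum possible value given the allele frequencies.
    if d >= 0:
        d_max = min(p_hla * (1 - p_snp), (1 - p_hla) * p_snp)
    else:
        d_max = min(p_hla * p_snp, (1 - p_hla) * (1 - p_snp))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0

    # r^2: squared correlation between the two alleles.
    denom = p_hla * (1 - p_hla) * p_snp * (1 - p_snp)
    r_squared = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r_squared


if __name__ == "__main__":
    d, d_prime, r2 = ld_stats(40, 10, 10, 40)
    print(f"D={d:.3f}  D'={d_prime:.3f}  r^2={r2:.3f}")
```

An r² close to 1 would mean that a single SNP is a near-perfect proxy for the HLA allele. Lower values, as described for HLA-C*0304, are the situation in which a haplotype of several SNPs is needed instead.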
The extreme diversity of classical HLA loci and their proximity to, and LD with, a large number of other genes that play a role in driving innate immunity make it difficult to definitively pinpoint the etiological genetic variants that control susceptibility to complex disease. To highlight these issues, we examine first the role of the MHC as a major locus controlling the complex disease type 1 diabetes (T1D), for which there is good evidence of significant heritability (162). For T1D there is a long history of genetic studies, culminating in the accumulation of large cohorts with sufficient statistical power to dissect out the role(s) of classical HLA loci (130). As for infectious diseases (Table 1; see Tables S1 to S8 in the supplemental material), analysis of genetic susceptibility to T1D is peppered with a large number of studies of classical HLA molecules and closely associated genes undertaken on small numbers of cases and controls, or case-parent trios, resulting in controversies over which molecules encoded within the region might influence disease (see, e.g., references 149 and 150). As higher-throughput technologies became available, researchers could realistically consider scanning the genome for novel genes controlling susceptibility to complex diseases. Genome-wide studies have the advantage that no supposition is made regarding the genes involved, and potentially novel or unconsidered genes may be identified. T1D was the first disease for which a genome-wide linkage scan (GWLS) was undertaken (47, 119, 162). Microsatellite markers at 10- to 20-cM intervals were genotyped across the genome in 96 affected sibling pairs plus parents, with fine mapping and replication studies carried out in further samples of 102 and 84 affected sibling pair families.
The HLA region stood out as the major gene for T1D susceptibility, with a combined logarithm-of-the-odds (LOD) score (see glossary of terms in the supplemental material; for additional useful information, see references 30, 32, 103, and 104) equal to 19.3, which dwarfed that of any other locus, for which the maximum LOD score was 2.1. The HLA locus is associated with a relative risk to siblings (lambda S) of 2.4 (163), accounts for about 35% of the observed familial clustering (44), and is conclusively the major locus controlling T1D susceptibility. This conclusion was borne out in a recent prototypic genome-wide association study (GWAS) that genotyped 500,000 SNP variants (using the Affymetrix GeneChip 500K Mapping Array Set) in 2,000 T1D patients compared to 3,000 controls (Wellcome Trust Case Control Consortium [WTCCC] [175a; http://www.wtccc.org.uk/info/overview.shtml]). The HLA region again stood out as the major locus, with a P value of 4.0 × 10^−116, compared to P values in the range of 1.3 × 10^−3 to 1.2 × 10^−26 for the next most significantly associated genes reported. Even with this large sample it was not possible to dissect out the role of multiple loci within the region. Following the WTCCC T1D study, a combined total of 1,729 polymorphisms across HLA were genotyped in >7,000 patients and >7,000 controls across cohorts, and statistical methods of recursive partitioning and regression were applied to pinpoint disease susceptibility to the MHC class I genes HLA-B and HLA-A (risk ratios of >1.5; Pcombined = 2.01 × 10^−19 and 2.35 × 10^−13, respectively), in addition to the established associations of the MHC class II HLA-DQB1 (P = 10^−117) and HLA-DRB1 (P = 10^−124) genes (130). Due to the complex multiallelic effects of the highly disease-associated class II genes, simply conditioning on the class II genotypes to analyze class I gene effects was unreliable and gave variable results depending on grouping of the class II alleles (130).
The use of very large sample sizes and a classification tree approach (recursive partitioning and regression) provided reliable results that make this study superior to previous reports of class I associations. This allowed the authors to form a tree of known disease risk-associated class II genes, upon which secondary associations were tested. This innovative analytical approach has been heralded as an important development in statistical methodology required to tease out genetic associations of complex diseases with multiple genes in immune response gene clusters such as HLA and KIR genes (143).
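To put the GWAS P values quoted above into context, the sketch below compares them with a Bonferroni-corrected significance threshold for 500,000 tests. The Bonferroni correction and the nominal alpha of 0.05 are assumptions chosen for illustration; the threshold actually applied by the WTCCC is not stated in this review and may differ. The dictionary labels are hypothetical.

```python
N_SNPS = 500_000   # number of SNPs genotyped, as stated in the text
ALPHA = 0.05       # nominal significance level (an assumption for illustration)


def bonferroni_threshold(alpha, n_tests):
    """Per-test P value threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


def is_genome_wide_significant(p_value, alpha=ALPHA, n_tests=N_SNPS):
    """True if p_value passes the Bonferroni-corrected threshold."""
    return p_value < bonferroni_threshold(alpha, n_tests)


# P values quoted in the text for the T1D GWAS; the labels are hypothetical.
reported = {
    "HLA region": 4.0e-116,
    "non-HLA, upper end of reported range": 1.3e-3,
    "non-HLA, lower end of reported range": 1.2e-26,
}

if __name__ == "__main__":
    print(f"Bonferroni threshold: {bonferroni_threshold(ALPHA, N_SNPS):.1e}")
    for label, p in reported.items():
        print(f"{label}: P={p:.1e}, significant={is_genome_wide_significant(p)}")
```

Under these assumptions the HLA signal clears the threshold by more than 100 orders of magnitude, whereas the weakest of the other reported associations would not survive correction on this sample alone.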
Given that studies of infectious disease lag somewhat behind those of the other major complex diseases in application of GWAS in large cohorts with adequate statistical power, what can we learn from the T1D story that might assist in evaluating the large number of small HLA studies that have been undertaken for infectious diseases (Table 1)? First, given the selection bias in deciding which locus/loci to type, variation in typing methods, small sample size, and genetic heterogeneity in the range of populations that have been studied, it is unlikely that any definitive conclusions can yet be drawn from the large number of studies (see Tables S1 to S8 in the supplemental material; summarized in Table 1), and no attempt is made to do so in this review. These factors also preclude formal meta-analysis for most disease studies presented in these tables. One possible lead that can be taken from T1D is to examine the data available from GWLSs that have been undertaken for some infectious diseases and to determine where HLA sits within the hierarchy of novel loci that have been identified using this method. Very few GWASs have been undertaken for infectious diseases to date, but where available, a similar question can be addressed. Finally, those studies where a direct functional link between a specific HLA allele and disease susceptibility has been made merit further discussion.
Individuals in whom all class II HLA alleles are heterozygous are more likely to clear hepatitis B virus infection (160), and those with heterozygous class I alleles progress from HIV to an AIDS-defining illness more slowly and have lower mortality (34). The converse scenario, increased HLA homozygosity, may contribute to the increased susceptibility to infection in genetically isolated populations (27). These observations in and of themselves point to a major role for classical HLA loci in determining susceptibility to viral infections. This is borne out by association signals obtained using a GWAS approach (56), albeit not with the same level of statistical significance for reduction in viral load in HIV/AIDS (P = 9.36 × 10^−12) as that observed for T1D (P = 4.0 × 10^−116). Nevertheless, the most convincing associations between classical class I and class II HLA alleles and infectious diseases have been reported for chronic viral infections, reflecting their global importance and, presumably, the logistics of recruiting enough subjects to obtain the necessary sample sizes of sufficient power. HIV type 1 (HIV-1) and hepatitis B and C virus infections have been most extensively investigated, with a number of studies having adequate power to allow consistent conclusions to be drawn. As in all association studies, especially those of infectious diseases, the definition of the clinical phenotype is critical. For example, for HIV infection various genetic factors, including those within the MHC, have been associated with overall susceptibility to infection, the risk of vertical transmission from mother to child, the risk of horizontal transmission between HIV-discordant couples, the rate of progression to an AIDS-defining illness once infected, and the risk of severe reactions to specific anti-HIV treatment (see Table S1 in the supplemental material) (96, 137).
In contrast to the case for many bacterial and parasitic infections, an effective host response against viruses that breach early innate defenses relies heavily on HLA-restricted T-cell responses, with effective presentation of viral epitopes by dendritic cells to CD8+ T lymphocytes through class I HLA and to CD4+ T lymphocytes through class II HLA. HLA class I presentation leads to the clonal expansion of HLA-restricted CD8+ cytotoxic T lymphocytes (CTL), and the CTL response is central to antiviral defense during acute infection. Memory CTL are involved in the immune response to latent reinfection and reactivation (170). CD4+ T lymphocytes augment CTL responses and provide T-cell help to the generation of specific antiviral antibodies.
The HIVs (HIV-1 and HIV-2) are relatively recent human pathogens that have evolved from lentiviruses of African primates and continue to evolve rapidly, frustrating attempts to develop protective vaccines (65). The HIV and HLA data are among the most convincing and intriguing in human infection and probably offer the most immediate prospect of harnessing genetic association studies to develop novel interventions. In addition to a number of associations between HLA and clinical phenotypes and response to treatment, these data also demonstrate the profound effects of HLA restriction on HIV variants that elude immunological control. The HIV/HLA studies also provide mechanistic insights into the interaction of HLA molecules and the NK cell receptors.
HIV infects CD4+ T cells using a number of cofactors to gain entry into host cells. A functional polymorphism resulting in a deletion of a key cofactor, the chemokine receptor CCR5, results in protection against HIV infection in homozygotes and slower disease progression in heterozygotes (111, 115). However, once HIV has evaded innate defenses, control of HIV infection relies on HLA-restricted CTL responses, which exert a strong inhibitory effect on viral replication and growth (117). There is therefore strong evolutionary pressure for the development of HIV variants (termed “escape mutants”) that are able to evade the CTL response (33). Variation in the immunologically dominant epitopes of HIV is one of the major barriers to the development of an HIV vaccine, especially vaccines that rely on protective CTL responses (19). Selection for amino acid sequence variants at HIV epitopes may lead to immunological escape by a variety of mechanisms: interfering with epitope-HLA binding, reducing T-cell receptor recognition, or generating antagonistic CTL responses (128). This CTL-driven selection of escape mutants occurs rapidly in simian models of HIV (55). In human populations, amino acid sequence variants of the HIV pol protein have been shown to be partly dependent on the HLA class I alleles of the infected individuals (128). The degree of HLA-associated selection of HIV sequence is predictive of HIV viral load, a marker of disease progression (24, 128).
Immunological escape by HIV is, however, constrained by the functional or structural implications of variation of a particular epitope for viral survival (117). Individuals with HLA alleles that preferentially select regions of HIV proteins that less readily mutate, because of their effects on viral fitness, would therefore be expected to have better outcomes when infected with HIV, as the potential for the development of escape mutants is reduced (65). This is illustrated by HLA-B*27, which recognizes a conserved epitope from the p24 HIV capsid protein and is associated with significantly improved survival in HIV-infected individuals (who are termed “elite suppressors”). These individuals may maintain high CD4 counts and low-level viremia without antiretroviral therapy (17). Interestingly, non-HLA-matched individuals infected with HIV escape variants that appear to have been selected by previous HLA-B*27 restriction also show lower viral load and greater CD4 counts, indicating that HLA restriction of escape mutants that comes at a cost to viral fitness may shape viral evolution and attenuate viral survival (38). Conversely, certain other HLA class I types are associated with rapid disease progression in HIV. HLA-B*35-restricted escape mutants may affect the recognition of CTL by reducing both peptide binding and T-cell receptor recognition, and HLA-B*35 subtypes are associated with rapid HIV disease progression (164). A recently published GWAS of HIV in nonprogressors has highlighted the complex putative roles of multiple genes within the MHC, including HLA (100).
HLA class I alleles also interact with NK cells, which are central components of innate defense against virally infected cells. Regulation of NK cell activity is complex and mediated in part by inhibitory and activating signals through KIRs. Specific KIRs interact with specific HLA class I ligands, and if the KIR contains an activating allotype, this results in increased NK cell effector function, with release of perforin, granzyme, and gamma interferon (167). Studies of HIV infection suggest that the protective effects of the HLA-Bw4 cluster, which contains HLA-B*27, may be partly due to epistatic (gene-gene) interaction between HLA and KIRs. An activating KIR allele (KIR3DS1), in combination with a subset of the HLA-Bw4 cluster (HLA-Bw4Ile80), is associated with delayed progression to AIDS. If KIR3DS1 is absent, the HLA-B allele is not protective, and if the specific HLA-B alleles are absent, KIR3DS1 is associated with more rapid progression to AIDS-defining illnesses (112, 113). The compound HLA-B/KIR genotype, which presumably results in enhanced NK cell activation, appears to have effects both early in HIV infection (by containing viral load) and later (by reducing the risk of opportunistic infections but not of HIV-related malignancy) (139). Despite these intriguing data, the implications of HLA/KIR epistasis in HIV are controversial (18). Overall it seems most plausible that HLA exerts both independent and epistatic effects (with KIRs) on HIV infection (62). Understanding and exploiting these effects may ultimately lead to the development of HIV vaccines and strategies to augment host defense against infection.
Chronic viral hepatitis is a major public health concern. Hepatitis B and C virus infections are estimated to account for 70% of the global burden of liver disease (152). The clinical outcomes following infection with both hepatitis B and C viruses vary considerably, from clearance of infection to chronic viral persistence, cirrhosis, and hepatocellular carcinoma. Numerous studies confirm that genetic factors play a central role in determining clinical phenotype, and the vast majority of these studies have focused on the role of the genes in the HLA/MHC region.
Only one genome-wide linkage study for hepatitis B susceptibility has been reported to date (61). This detected no evidence of linkage at the MHC/HLA region, possibly due to lack of statistical power. However, a number of candidate gene association studies have identified HLA-specific associations for both hepatitis B and C susceptibility and outcome. An HLA class II heterozygote advantage has been demonstrated for clearance of hepatitis B virus infection (160) and for progression to end-stage liver disease in hepatitis C (74). In hepatitis B virus infection, significantly fewer persistently infected Gambian individuals were heterozygous for haplotypes across HLA-DR-DQ (odds ratio [OR] = 0.53; 95% confidence interval [CI] = 0.33 to 0.84; P = 0.004) (146), whereas for hepatitis C virus infection increased heterozygosity at HLA-DRB1 supertypes is observed in uninfected versus infected liver transplant recipients (OR = 1.34; P = 1.05 × 10^−6) (68). Such data suggest that HLA heterozygosity is associated with a more favorable outcome of infection, presumably due to the greater breadth of response to viral peptides that HLA heterozygosity allows (64).
Specific HLA associations with hepatitis B and C virus infection are described for viral persistence/clearance, disease progression, the risk of vertical transmission, vaccine responses (discussed below), and the response to antiviral therapies (see Table S2 in the supplemental material; reviewed in reference 152). A number of important themes emerge from these data, in addition to the issues regarding sample size and statistical analysis that affect many genetic association studies (27). The HLA associations with hepatitis B and C virus infection are largely inconsistent across different ethnic groups, and few HLA associations are shared between the two infections, which may reflect both methodological issues and different disease mechanisms (152). HLA-DRB1*07 is associated with viral persistence in both hepatitis C and B virus infection in various European and Asian populations (39, 89, 174) and also with hepatitis B vaccine failure in multiple populations (13, 114, 140, 152). A formal meta-analysis of HLA class II associations with HCV clearance provides summary estimates for the positive effects of HLA-DQB1*0301 (OR = 2.36; 95% CI = 1.62 to 3.43; P < 0.00001) and HLA-DRB1*1101 (OR = 2.02; 95% CI = 1.56 to 2.62; P < 0.00001) on spontaneous viral clearance (71). No formal meta-analysis has been performed for hepatitis B virus infection. However, HLA-DRB1*1301 (encoding HLA-DR13) appears to be consistently associated with hepatitis B virus clearance across a number of diverse populations (2, 69, 75, 93). Ultimately, it is hoped that definition of HLA-restricted viral epitopes associated with clearance will lead to improved vaccine candidates (in the case of hepatitis B) and the development of an effective vaccine for hepatitis C.
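The summary odds ratios quoted above for HCV clearance can be checked for internal consistency by recovering an approximate standard error, z score, and two-sided P value from each OR and its 95% CI. This is a minimal sketch that assumes the intervals are symmetric on the log scale (a Wald-type interval); the function name is hypothetical, and the cited meta-analysis may have used a different method.

```python
import math


def z_and_p_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Recover (standard error, z score, two-sided P) from an OR and its 95% CI.

    Assumes the CI was built as exp(ln(OR) +/- z_crit * SE), which is an
    assumption about how the published interval was derived.
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    # Two-sided tail probability of the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


if __name__ == "__main__":
    # OR and CI values are those quoted in the text for HCV clearance.
    for allele, (or_, lo, hi) in {
        "HLA-DQB1*0301": (2.36, 1.62, 3.43),
        "HLA-DRB1*1101": (2.02, 1.56, 2.62),
    }.items():
        se, z, p = z_and_p_from_or_ci(or_, lo, hi)
        print(f"{allele}: SE={se:.3f}  z={z:.2f}  P={p:.1e}")
```

Both recovered P values fall below 0.00001, in line with the values reported for the meta-analysis. A back-calculation of this kind gives only an approximate check, because exact tests or random-effects models produce intervals that are not strictly symmetric on the log scale.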
As early as the 18th century, differential susceptibility to infection was suggested to be a characteristic of the host, and diseases such as tuberculosis and leprosy were believed to be inherited defects (5). A number of GWLSs have been reported for tuberculosis (20, 121) and leprosy (121, 123, 151), each of which has been sufficiently powered to detect single major gene or oligogenic (i.e., a small number of major loci) determinants. For leprosy, the major locus identified in a GWLS in Vietnamese subjects (LOD score, 4.31; P = 5 × 10^−6) (123) led to the exciting identification of the Parkinson's disease genes PARK2/PACRG as the etiological genes at the peak of linkage on chromosome 6q25 in Vietnam, with replication in a Brazilian population (122). A second signal in this genome-wide analysis of leprosy in Vietnam was observed at HLA (LOD score, 2.62; P < 2.5 × 10^−4) (123). In Brazil, the major signal on a primary GWLS was at HLA (LOD score, 3.23; P = 5.8 × 10^−5) (121), with combined segregation and linkage analysis in a larger data set achieving a LOD score of 5.78 (P = 2.5 × 10^−7), suggesting a major influence of HLA in this population. In contrast, no signal at the HLA/MHC region was observed following a GWLS undertaken for tuberculosis in the same Brazilian population (121) or for leprosy in India (151). These data raise the question as to whether the same or different genes influence these two mycobacterial infections or whether there are different clinical subtypes within leprosy which may vary in frequency across geographical sites. Certainly, a number of other candidate genes (SLC11A1, VDR, IL-10, IL12RB, TLR2) have been shown to influence both tuberculosis and leprosy (see, for example, the online tables associated with reference 27). A GWLS of tuberculosis undertaken in Africa (20) also failed to detect a significant linkage signal at HLA.
Overall, these genome-wide analyses suggest a more prominent role for HLA in determining susceptibility to leprosy than for tuberculosis. However, from the late 1970s onwards a large number of HLA association studies, albeit largely underpowered, have recorded an equivalently high ratio of positive associations compared to negative reports (publication bias notwithstanding) for both tuberculosis and leprosy (Table 1; see Tables S4 and S5 in the supplemental material). These data raise important issues in relation to the current understanding of the immunology and pathogenesis of the two most important mycobacterial infections, which are examined in more detail below.
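The LOD scores and P values reported for the linkage scans above are related through a standard asymptotic approximation: 2 ln(10) × LOD is treated as a chi-square statistic with one degree of freedom, and a one-sided P value is taken because linkage is tested in one direction only. The sketch below illustrates that conversion. It is an assumption that the cited studies derived their P values in this way, and some of the published values may instead be two-sided or empirical.

```python
import math


def lod_to_p(lod):
    """Approximate one-sided pointwise P value for a LOD score.

    Uses the asymptotic relation chi2 = 2 * ln(10) * LOD with 1 degree of
    freedom, halved for the one-sided linkage test. This is a textbook
    approximation, not necessarily the method used in the cited studies.
    """
    chi2 = 2 * math.log(10) * lod
    z = math.sqrt(chi2)
    # One-sided standard normal tail: P(Z > z) = 0.5 * erfc(z / sqrt(2)).
    return 0.5 * math.erfc(z / math.sqrt(2))


if __name__ == "__main__":
    # LOD scores quoted in the text for the leprosy linkage scans.
    for lod in (2.62, 3.23, 4.31, 5.78):
        print(f"LOD {lod:.2f} -> approximate P = {lod_to_p(lod):.1e}")
```

For the Brazilian HLA signal (LOD score, 3.23) the approximation gives a value of the same order as the reported P = 5.8 × 10^−5, which suggests that the quoted LOD scores and P values are broadly consistent with one another.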
Leprosy is caused by Mycobacterium leprae and is considered a genetic model for susceptibility to common infectious disease (11). There is a spectrum of clinical phenotypes that span tuberculoid through borderline to lepromatous forms of disease (142). These relate to individual differences in the immune response to the pathogen that were initially shown to follow the T helper 1/T helper 2 (Th1/Th2) paradigm (124, 126). Polar tuberculoid or paucibacillary leprosy patients have a limited number of hypopigmented and anesthetic skin lesions with few bacilli associated with gamma interferon/interleukin-2 (IL-2) CD4+ Th1 responses. Polar lepromatous or multibacillary patients have numerous sensitive or anesthetic skin lesions with high bacterial loads associated with IL-4/IL-10 CD4+ Th2 immune responses. Borderline forms of the disease, including borderline tuberculoid, borderline, and borderline lepromatous, present with intermediate clinical phenotypes with mixed Th1/Th2 immune responses that are not stable, and patients often move within the clinical spectrum through time. These characteristics of leprosy suggest that associations with classical HLA class II molecules could be the major genetic determinants of disease phenotype. This is consistent with the large number of studies that have shown associations between DR and/or DQ alleles and different clinical subtypes of leprosy or with leprosy per se (Table 1; see Table S4 in the supplemental material). On the other hand, the prior hypothesis that this may be the case may simply have led to more studies of classical HLA-DR/DQ class II molecules being undertaken. Historically, studies were likely to be undertaken using the HLA typing techniques available to researchers at the time.
This is the most likely reason for many early studies examining HLA class I polymorphisms (see Table S4 in the supplemental material), rather than being driven by a strong hypothesis that presentation of antigen to CD8+ T cells could be important. Nevertheless, positive associations with a variety of HLA class I alleles have been observed in a number of studies (Table 1; see Table S4 in the supplemental material), perhaps reflecting a role for CD8+ T cells in gamma interferon production in response to leprosy antigens (51, 148). Both CD8+ and CD4+ T cells are present in skin lesions of tuberculoid patients (126), with CD4+ T cells among the aggregates of mononuclear phagocytes and CD8+ T cells predominantly in the lymphocytic mantle surrounding these cell aggregates. Intriguingly, the ratio of CD8+ to CD4+ T cells in the lesions of lepromatous leprosy patients is 2:1 (125, 126), which does not fit with the concept of a role for CD8+ T cells in producing gamma interferon in these patients. Instead, this observation was originally interpreted as a bias toward suppressor/cytotoxic T cells compared to helper T cells in multibacillary lepromatous patients (125, 126). This could, in turn, relate to recent studies suggesting an association of different combinations of activating and inhibitory KIR alleles and HLA class I ligands with lepromatous versus tuberculoid disease (58), although further work is required to tease out the functional roles of NK cells compared to γδ T-cell or αβ CD8+ T-cell populations. Associations have also been observed between tuberculoid leprosy and the highly variable HLA/MHC class I chain-related genes A and B (MICA and MICB), which lie between HLA-B and the TNF locus (165). MICA and MICB are also recognized by γδ T cells, αβ CD8+ T cells, and NK cells.
The ratio of CD8+ to CD4+ T cells is reversed in lepromatous patients with erythema nodosum leprosum, an immune-mediated complication usually following drug therapy that is associated with inflammatory skin nodules, fever, malaise, and inflammation in many organs, leading to iritis, arthritis, neuritis, and lymphadenitis (83). A Th1 proinflammatory cytokine profile then predominates (83), characterized by a high TNF response that can be therapeutically reduced by treatment with thalidomide. The strong LD between the gene (TNF) encoding TNF, the gene (LTA) encoding lymphotoxin-α, the MICA and MICB genes, and HLA class I genes makes it difficult to interpret the many associations made between polymorphisms at these genes and different forms of leprosy (see Table S4 in the supplemental material), and there are variable reports as to whether these associations are independent (4, 165) or not (10, 145) from associations with each other or with class II loci. Among these studies, strong support has been found in Vietnamese and Brazilian studies for a direct functional role for polymorphisms at LTA in early-onset leprosy (4, 11). In summary, although a strong case can be made for the importance of classical and nonclassical MHC genes, a much larger study, such as that undertaken for T1D (130), is required to definitively characterize the contribution of polymorphisms at MHC genes to the heritable risk for different clinical outcomes in leprosy.
Tuberculosis is caused by Mycobacterium tuberculosis, which normally enters the body via the mucosal surface of the lung. Pulmonary disease is the most common clinical outcome, but disseminated or miliary tuberculosis and tuberculosis meningitis are also significant clinical problems, with high mortality. CD4+ T cells are important in mounting a protective immune response following primary infection and also in avoiding reactivation of chronic infection. The main effectors are Th1 CD4+ T cells producing gamma interferon, especially in the early stages of infection. Functional studies using knockout mice (43, 57), as well as studies of human families with rare genetic lesions at genes encoding gamma interferon or its receptor (10, 82, 132), demonstrate the crucial role of this cytokine in defense against tuberculosis and atypical mycobacteria. CD8+ T cells, γδ T cells, NK cells, and CD1-restricted T cells have all been shown to be important gamma interferon producers in protection against M. tuberculosis (94). In mice, deficiency in MHC class II impairs the response to acute infection, while deficiency in MHC class I has less influence in the acute phase of infection but is crucial during chronic infection (95, 144, 153). A strong role for MHC class II molecules is again borne out by the larger number of studies demonstrating association with polymorphisms at HLA-DR and -DQ compared to classical class I molecules (Table 1; see Table S5 in the supplemental material). Fewer studies support a role for TNF/LTA class III loci in tuberculosis compared to leprosy. The results of large-scale GWASs for tuberculosis are pending, and it will be of interest to see whether HLA associations are sufficiently strong to attain significance in that context. The likelihood is that still larger studies, with improved definition of clinical phenotype, will be required to tease out the importance of HLA and non-HLA genes in determining heritable risk for M. tuberculosis-related diseases.
The association between polymorphism at the hemoglobin S (HbS) locus and susceptibility to severe clinical forms of malaria provides the archetypal example of the selective force of infection in human evolution (7-9, 135, 175). Malaria is a complex disease, with different clinical outcomes, and a large number of small genetic effects have been documented (see, for example, web Table 3 in reference 27). Based on the clinical observation of extremely high levels of TNF in cerebral malaria (116), many studies have examined associations between malaria and polymorphisms at the class III region TNF locus (Table 1; see Table S6 in the supplemental material). The first large-scale GWAS was recently undertaken by the WTCCC and MalariaGEN Consortium using the 500K Affymetrix SNP chip to compare 1,000 patients with severe malaria from The Gambia against 1,500 ethnically matched controls (175b). The HbS polymorphism stood out as the most significant association observed, with P = 3.9 × 10^−7 by allelic trend test and an OR of 0.09 (95% CI = 0.05 to 0.16; P = 1.3 × 10^−28) for heterozygous advantage. Against this, SNPs in the chromosome 6p21/MHC region were not among the top 18 associated regions observed at a P value of <10^−5. Examining the data specifically for MHC associations, SNPs at HLA-B showed significance at a P value of 0.002 and SNPs at TNF/LTA did so at a P value of 0.038. One caveat in interpreting these data is the uneven distribution and unequal coverage of the genome for SNPs on the chip. Hence, smaller (more significant) P values would be expected if the functional SNP itself were present on the chip, as indeed was the case for the functional variant rs334 for the HbS gene (P = 10^−28) compared to the best marker on the GWAS chip (P = 3.9 × 10^−7), where there was no marker in high LD with rs334 (175b). Similarly, much smaller P values have been achieved in previous studies of the TNF locus in the same Gambian population (91, 116).
Nevertheless, the results of this GWAS using a more powerful sample suggest that MHC associations are not major players in determining direct heritable risk for the severe clinical malaria phenotype defined for this study. Given the complexity of the disease, larger samples and more detailed investigation of clinical phenotypes may reveal more subtle effects. For example, TNF haplotypes have recently been shown to be associated with iron deficiency anemia in West Africa (16). TNF is thought to inhibit intestinal iron absorption and macrophage iron release. In a cohort of 780 children, the prevalence of iron deficiency anemia increased over the malaria season (P < 0.0001). Homozygosity for the minor allele at bp −308 (now redefined as bp −307) of the TNF gene, previously found to correlate with high TNF levels, was associated with an increased risk of iron deficiency (OR, 8.1; P = 0.001) and iron deficiency anemia (OR, 5.1; P = 0.01) at the end of a malaria season. No genotypes were associated with iron deficiency anemia prior to the malaria season. The authors concluded that TNF is a risk factor for iron deficiency and iron deficiency anemia in children in an environment where malaria is endemic and that this could be due to a TNF-induced block in iron absorption.
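The odds ratios (ORs) and 95% confidence intervals quoted in association studies such as those above are typically derived from a 2 × 2 table of genotype by disease status. As a minimal sketch, assuming the simple Woolf (log-based) approximation rather than whatever method the cited studies actually used, the calculation might look as follows in Python. The function name and the counts are hypothetical and purely illustrative; they are not data from any study discussed here.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and approximate 95% CI (Woolf method) for a 2 x 2 table.

    a: cases with the genotype      b: controls with the genotype
    c: cases without the genotype   d: controls without the genotype
    All four counts must be positive for the log-based interval to exist.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of ln(OR) under the Woolf approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, for illustration only (not taken from any cited study).
or_, lo, hi = odds_ratio_ci(a=20, b=80, c=10, d=90)
print(f"OR = {or_:.2f} (95% CI = {lo:.2f} to {hi:.2f})")
```

With these illustrative counts the point estimate is above 1 but the interval still includes 1, which is the typical situation in a small, underpowered candidate gene study.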
Different species of Leishmania cause a spectrum of clinical diseases in humans, including localized, mucosal, diffuse, and disseminated forms of cutaneous disease, as well as visceral leishmaniasis that can be followed posttreatment by post-Kala-azar dermal leishmaniasis. Early studies of Leishmania donovani in mice showed dramatic differences in visceral disease in livers and spleens in congenic mice with different H-2 haplotypes (21). Genetic analysis using recombinant congenic mice (22) and functional analysis blocking IA or IE (corresponding to DQ and DR, respectively) molecules in vivo with monoclonal antibodies (23) showed that noncuring and curing responses mapped to the class II molecules. Mice carrying specific IA alleles in the absence of an IE molecule were able to self-cure, while the presence of IE alleles was associated with a noncuring response (86). A role for class II molecules is consistent with the requirement for presentation of antigen to CD4+ cells that produce gamma interferon to mediate self-curing infections, although the dichotomy between Th1/Th2 responses observed for L. major infection in mice (102) does not apply to visceral leishmaniasis caused by L. donovani (87). An important role for CD8+ T cells has also been demonstrated in L. donovani infection in mice (138, 154). Hence, there is strong a priori support for the candidacy of polymorphism at HLA class II and class I in determining susceptibility to human visceral leishmaniasis. However, four independent GWLSs undertaken in Sudan (25, 120) and Brazil (80, 81) have failed to observe genome-wide significance for linkage at the MHC/HLA region. One case-control study of visceral leishmaniasis carried out in Tunisia demonstrated associations with DR/DQ class II genes but not with TNF/LTA or HSP70 class III loci (118). 
No association was observed with DR/DQ or TNF/LTA for clinical visceral leishmaniasis in Sudan (26, 127) or Brazil (85, 136), but associations were observed between delayed-type hypersensitivity-positive asymptomatic individuals and TNF alleles (85). A large-scale GWAS for clinical visceral leishmaniasis cases and controls from India, as well as visceral leishmaniasis cases in families from Sudan and Brazil (where the delayed-type hypersensitivity quantitative trait has been measured in all individuals), is currently under way as part of phase 2 of the WTCCC (J.B. is a member of WTCCC2). Data will be available during 2009. A number of small case-control studies have reported positive associations between various clinical phenotypes for cutaneous leishmaniasis and polymorphisms at class I (HLA-A, -B, and -C), class II (DR/DQ), and class III (TNF/LTA) loci (Table 1; see Table S7 in the supplemental material). The role of TNF polymorphisms in mucosal leishmaniasis (31) is of particular interest given the very high levels of TNF associated with this disease (35) and the effectiveness of treatment with the drug pentoxifylline, which suppresses TNF responses (98).
A number of studies (reviewed in reference 141) have demonstrated that host genetics is an important determinant of susceptibility to human helminth infections. Heritability ranges from 21% to 44% for eggs per gram of feces for Ascaris lumbricoides, Trichuris trichiura, Necator americanus, and Schistosoma mansoni and is 44% and 37% for measures of worm burden and worm biomass, respectively, for A. lumbricoides. Immunologically, helminth infections are associated with polarized CD4+ Th2 responses and with high immunoglobulin E levels. HLA class II molecules are therefore considered functionally important, and the genes encoding them are potential susceptibility loci. Interestingly, however, no linkages at the HLA region were observed in GWLSs using either Ascaris (176) or S. mansoni (109) burden as a clinical phenotype. Results for case-control association studies are equivocal (reviewed in reference 141). The greatest attention has focused on association between HLA and pathologies associated with schistosomiasis, which is examined below in greater detail.
Schistosomiasis, also known as bilharzia, is a snail-transmitted, waterborne parasitic infection. The three major species are Schistosoma mansoni, S. haematobium, and the S. japonicum complex. S. haematobium is found in the Middle East and Africa. S. mansoni occurs in the Arabian peninsula, most African countries north of the equator, Brazil, some Caribbean islands, Suriname, and Venezuela. S. japonicum is endemic in China, Indonesia, and the Philippines. Mortality is associated with the severe consequences of infection, including bladder cancer or renal failure (S. haematobium) and liver fibrosis and portal hypertension (S. mansoni). Morbidity is a more serious issue. Schistosomiasis is a chronic illness that can damage internal organs and, in children, impair growth and cognitive development. For example, a survey of disease-specific mortality in 682 million people (http://www.who.int/vaccine_research/diseases/soa_parasitic/en/index5.html) in sub-Saharan Africa reported that 70 million individuals infected with S. haematobium had experienced hematuria, 32 million had experienced dysuria, 18 million suffered bladder wall pathology, and 10 million had hydronephrosis. Of those infected with S. mansoni, the parasite was estimated to cause diarrhea in 0.78 million individuals, blood in stool in 4.4 million, and hepatomegaly in 8.5 million. Severe pathology results from the immune response to eggs deposited in host tissue. Importantly, severe disease such as periportal fibrosis in schistosomiasis caused by S. mansoni develops in only a proportion of infected individuals (37), suggesting differences between individuals that could be influenced by host genotype. Immunologically, severe fibrosis is associated with increased levels of the proinflammatory cytokine TNF and possibly gamma interferon. 
A number of studies (Table 1; see Table S8 in the supplemental material) have looked for associations between HLA class I and class II gene polymorphisms and various clinical phenotypes (hepatosplenomegaly, fibrosis, and cirrhosis) that reflect schistosomiasis pathology. Associations with class I polymorphisms have been observed in a number of Egyptian studies but not in one study from Brazil (see Table S8 in the supplemental material). A majority of studies also report associations between hepatic disease and both class I and class II genes in China (Table 1; see Table S8 in the supplemental material). These associations suggest that T-cell-mediated mechanisms are important in the pathology associated with schistosomiasis, although the usual caveat of the close LD between the class I, class II, and other immune response genes in the MHC applies. In this respect it is of interest that studies in Sudan failed to find an association between SNPs in the TNF locus and hepatic fibrosis associated with S. mansoni infection (see Table S8 in the supplemental material). As has been the case throughout this review, larger studies with more comprehensive coverage of the MHC region are required before firm conclusions can be drawn about the role of HLA polymorphisms in determining the heritable risk of severe pathology associated with schistosomiasis.
Although much larger studies will be required to validate statistical associations between polymorphisms at HLA class I and class II genes and susceptibility to infectious diseases, there are other ways in which our knowledge of the function of these molecules in triggering appropriate immune responses can be translated into prophylactic or therapeutic benefits. These include vaccine design, the immune response to crude or recombinant vaccines, and the response to drug therapy.
Genetic variants that are associated with resistance to infection or with less severe clinical phenotypes are potentially powerful tools for the design of vaccines, especially when the pathogen epitope(s) that is inducing protective responses is unknown. Defining protective HLA allelic associations potentially allows the identification of pathogen epitopes that are restricted by the specific HLA alleles. These epitopes may then be incorporated into vaccine design in the expectation that the natural resistance can be replicated by immunization, especially if a key mechanism of immunological protection is an HLA-restricted CTL response. This experimental approach, termed “reverse immunogenetics,” has been attempted for a number of infectious diseases where protective HLA associations have been described (46).
In Plasmodium falciparum malaria, HLA class I and II variants that are common in West Africa but relatively rare in other populations have been associated with protection from severe clinical disease (67). The class I protective association between HLA-B53 and severe manifestations of malaria has been investigated by sequencing parasite-derived peptides eluted from the specific HLA molecule. In those resistant to severe malaria, HLA-B53-restricted CTLs recognized a conserved peptide from P. falciparum liver stage antigen-1. Further studies identified malaria CTL epitopes derived from a number of malaria proteins restricted by other class I HLA antigens that are found in many African and non-African populations (3). These data highlighted the diversity of P. falciparum antigens that may be recognized by CD8+ T cells and suggested that several antigens should be included in CTL-inducing malaria vaccines (3). Subsequent malaria vaccines, based on the reverse immunogenetics approach and using DNA priming and recombinant modified vaccinia virus Ankara to boost T-cell responses, have been developed (68). Although these vaccines are immunogenic and partially protective in adults (53, 54), clinical trial results have been largely disappointing to date (129).
An analogous approach, using HLA-based associations to define the immunodominant epitopes to which protective CTLs would respond, is being directed toward infections for which there is an urgent need for effective vaccines—HIV (173), tuberculosis (36), and hepatitis C (77)—but the results so far are also disappointing. For example, in a recent trial of a CTL-based HIV vaccine, infection rates were actually increased in partially vaccinated individuals compared to those receiving placebo (76). It has been suggested that this might reflect the importance of occult infection at mucosal surfaces, where HIV is transmitted and where CTLs are relatively scarce (168).
Although the reverse immunogenetic approach is a logical and attractive translational development of HLA-infectious disease association data, the relative failure of this approach to yield a clinically effective vaccine thus far highlights the complexity and dynamic nature of both pathogen and host genetic variation. Given the extent of HLA polymorphism and the genetic variation in many pathogens (itself partly driven by HLA restriction, at least in HIV), it may not be possible in the near future to develop vaccines that carry sufficiently diverse epitopes to induce protective responses in a sufficient proportion of the population at risk. However, conceptual and technological advances, for example, compressing frequent immune targets of the cellular immune response into a shorter immunogen sequence (177) and vaccinating individuals only with epitopes against which they mount a CTL response (“personalized peptide vaccination”) (178), suggest that further developments in vaccine design are likely in this rapidly evolving field.
The widespread introduction of immunization against infectious diseases is one of the most significant public health interventions. Effective vaccines are those that protect against the clinical infection, usually by the production of functionally effective antibody, and that induce long-lasting immunological memory. There is marked variation in the response to vaccination, in terms of absolute efficacy (protection from infection versus vaccine failure), relative antibody levels, antibody avidity, and the duration of immunological memory.
The differential responses following immunization have a significant genetic basis. In The Gambia, West Africa, twin studies indicate that both HLA and non-HLA loci are responsible for the substantial genetic component in vaccine responses early in life (131), with environmental factors important in determining the subsequent development of antibody persistence, at least against tetanus toxoid (107). Twin studies from other populations demonstrate that the heritability of antibody responses is significant, but the degree of variation that can be attributed to genetic determinants varies with different vaccines (97, 158). The frequency with which immunization fails to result in a protective response also depends on the specific vaccine. Vaccine failure is rare following administration of tetanus toxoid but occurs in 2 to 10% of those receiving measles vaccine and in 5 to 20% following hepatitis B vaccination (90).
The development of a protective immune response following vaccination depends on both initial innate responses (e.g., propagation of live vaccines in host cells and antigen recognition) and adaptive responses (e.g., antigen uptake and presentation or regulation of lymphocyte function) (90). The majority of clinically important vaccine responses arise from the presentation of vaccine epitopes, predominantly by HLA class II molecules, to CD4+ T cells, which differentiate into Th1 and/or Th2 phenotypes. The Th2 CD4+ cells, characterized by production of IL-4, IL-5, IL-10, and IL-13, promote antibody production. Th2 cells also induce proliferation and differentiation of B cells into plasma cells, which are capable of producing immunoglobulin M. In addition, activated T and B cells may differentiate into memory cells.
The central role of HLA class II in the presentation of vaccine epitopes is reflected in the large body of data that associates HLA class II polymorphisms with specific vaccine responses, usually defined by serum vaccine-specific antibody levels. These data are likely to be of greatest clinical relevance for vaccines that show considerable interindividual variation in protective efficacy, as they may identify vaccine epitopes that are most likely to result in protective responses and aid the development of more protective vaccines. This is most clearly illustrated by responses to hepatitis B and measles immunization, although the strength of the associations is often modest, as expected for a complex trait. Findings from The Gambia highlight that the relative contribution of HLA to vaccine responses may vary with age (131), which has (largely unexplored) implications for both the design and interpretation of genetic association studies of vaccine responses (90).
The antibody response to the hepatitis B virus envelope protein vaccine is insufficient to provide protection against infection in up to 10% of individuals. Antibody levels in response to hepatitis B vaccine are highly heritable in both children (131) and adults (70), and family studies also indicate a genetic basis to nonresponsiveness (92). A number of studies have demonstrated significant associations between HLA class II alleles, particularly of DRB1, and the response to primary hepatitis B vaccination (172). HLA associations are also reported for the response to revaccination later in life, when environmental factors such as substance abuse (e.g., cigarette smoke and betel nut) may modify the genetic effects (101). Given the high levels of LD across the HLA class II region, it is difficult to tease out the precise genetic association, and there is evidence that non-HLA MHC genes, such as C4, may be important in hepatitis B vaccine responsiveness (70).
Studies of vaccine failure following measles vaccination illustrate the potential clinical relevance of HLA. Primary vaccine failure (that is, failure to produce antibody following vaccination) occurred in approximately 10% of children during a large measles outbreak in the United States, whereas secondary vaccine failure (susceptibility to wild-type infection despite a previous adequate response to the vaccine) was much less common (78). Measles vaccine failures clustered in families, and subsequent twin studies showed considerable heritability for primary measles vaccine failure. HLA class II associations with both high and low measles antibody responses have been described, both for measles vaccine given alone (79, 133) and for measles vaccine administered as part of the combined measles-mumps-rubella vaccine (134). Interestingly, antibody responses to measles vaccine show a significant heterozygote advantage (157), a phenomenon described in infections such as HIV, tuberculosis, and hepatitis B (29). The HLA associations with measles vaccine responses, including heterozygote advantage, are lost when two doses of the vaccine are given, presumably because there is sufficient antigenic stimulus to overcome genetic determinants, although heterozygote advantage for mumps vaccine responses may persist (156).
Overall there are relatively convincing data linking HLA with responses to vaccines, especially for hepatitis B and measles vaccines. Other, non-HLA genes are also important, and there is increasing interest in innate immune genes, such as those encoding factors involved in Toll-like receptor pathways (50), which may lead to the development of novel adjuvants that improve vaccine responses in high-risk groups, such as newborns (99, 180).
The commonly used anti-HIV drug abacavir, a nucleoside reverse transcriptase inhibitor, is highly effective as part of combination therapy. However, ~5% of those receiving abacavir develop a reversible hypersensitivity reaction (typically a combination of malaise, fever, rash, and gastrointestinal symptoms) that is more severe if abacavir therapy is restarted and may be life-threatening (66). Abacavir hypersensitivity is less common in certain ethnic groups and may occur in familial clusters (45). Detailed analysis of the MHC in the abacavir hypersensitivity reaction revealed a highly significant association with alleles carried on the MHC 57.1 ancestral haplotype (105). This association has subsequently been mapped to the HLA-B*5701 allele. The strength of the association is such that screening for this allele prior to commencing abacavir therapy has very high sensitivity and specificity, is cost-effective, and is applicable across different ethnic groups (106, 146). HLA-B*5701 screening has now become part of standard HIV clinical care in resource-rich countries and probably represents the most convincing translational success of genetic association data in infectious diseases (146, 147). Similar but less dramatic associations with adverse reactions to other antiretrovirals have also been described, but the modest strength of the associations has not resulted in changes to clinical practice (110).
This review has reexamined studies that have looked for associations between classical HLA class I and class II molecules and susceptibility to infectious diseases, with the goal to relate these to immune processes. While Tables S1 to S8 in the supplemental material provide a comprehensive list of results obtained for a subset of globally important diseases, these results need to be interpreted with caution. One crucial flaw is that few studies have looked comprehensively at polymorphisms across the MHC within any single disease in any single ethnically defined population. This is a problem because genes encoding the classical class I and class II molecules occur within a cluster of highly correlated MHC genes with a broader range of functions in innate and acquired immunity. Usually a plausible case can be made for the candidacy of a large number of the ~160 protein-coding MHC genes in determining susceptibility to any complex infectious disease. Selectively genotyping a favorite candidate gene within the cluster and concluding that a positive association relates to that gene is foolhardy and unreliable. This is especially true since the sample sizes employed in most studies carried out to date have been underpowered to detect a single main effect, not to mention more complex models in which multiple MHC genes might be expected to contribute to disease. Probably the most crucial methodological issue in taking the field forward is sample size and power, in conjunction with better definition of clinical phenotypes. Large consortia are needed to agree upon clinical classifications and achieve the sample sizes required to provide definitive data for HLA/MHC loci. If a large GWAS, including minimally 1,000 cases and 1,000 controls, achieves a significant association in this region, then more in-depth analysis is warranted. 
This could be followed by SNP chip analysis of a larger number of SNPs across the region using commercially available MHC SNP chips, followed by (or in conjunction with) sequence-based genotyping of the classical HLA class I and II genes themselves. Although the currently available 500,000 to 600,000 genome-wide SNP chips do not provide the best coverage of functional SNPs within the MHC, it is almost certainly not worthwhile undertaking in-depth analysis with MHC SNP chips unless a significant association is seen in a large GWAS. Even when a major role for HLA genes is demonstrated, as, for example, in T1D (130), it is clear that a very large study and sophisticated analysis are required to tease out the influence of specific HLA alleles.
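The point about sample size and power can be made concrete with a rough calculation. The following Python sketch uses a standard normal approximation for a two-sided allelic (two-proportion) test; it is an illustration under stated assumptions, not the power calculation used by any study cited here. The control allele frequency (0.2), per-allele odds ratio (1.3), and significance threshold (5 × 10^−8, a commonly used genome-wide threshold) are hypothetical inputs chosen only for illustration, as are the function names.

```python
from statistics import NormalDist


def case_allele_freq(control_freq, odds_ratio):
    """Allele frequency in cases implied by a per-allele odds ratio."""
    odds = odds_ratio * control_freq / (1 - control_freq)
    return odds / (1 + odds)


def allelic_test_power(n_cases, n_controls, control_freq, odds_ratio, alpha):
    """Approximate power of a two-sided allelic (two-proportion) z test."""
    p0 = control_freq
    p1 = case_allele_freq(control_freq, odds_ratio)
    # Each individual contributes two alleles to the comparison.
    se = (p0 * (1 - p0) / (2 * n_controls) + p1 * (1 - p1) / (2 * n_cases)) ** 0.5
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    z_effect = abs(p1 - p0) / se
    # The negligible chance of rejecting in the wrong direction is ignored.
    return NormalDist().cdf(z_effect - z_crit)


# Hypothetical scenario: modest effect, common allele, genome-wide threshold.
for n in (1000, 5000):
    power = allelic_test_power(n, n, control_freq=0.2, odds_ratio=1.3, alpha=5e-8)
    print(f"{n} cases / {n} controls: power = {power:.3f}")
```

Under these assumed inputs, 1,000 cases and 1,000 controls give low power for a modest effect, whereas a fivefold larger study gives high power, which is the quantitative reason for the emphasis on large consortia.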
Another methodological issue is that of selection of study population and sampling strategy. Most candidate gene studies use a case-control approach, and unrecognized ethnic differences between the groups are likely to be reflected in spurious genetic associations that are unrelated to the disease of interest. For example, there is probably a strong association between the HLA region and chopstick use in Sydney. This does not imply that the HLA region necessarily influences manual dexterity, but more likely that the HLA type identifies an Asian subpopulation. This issue of “population stratification” is likely to be more confounding when small genetic effects are sought in large populations (108). It can be minimized if cases and controls are carefully matched, and its presence can also be quantified from the genetic data (especially genome-wide data) themselves (59), allowing stratified analyses if the sample size permits. A family-based approach, for example, using patient-parent trios, provides an alternative strategy to control for mixed ethnicity in a particular study population.
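The chopstick example can be reproduced numerically. In the Python sketch below, two subpopulations differ in both the frequency of a marker allele and the frequency of a trait, but within each subpopulation the marker and the trait are independent. All counts and names are hypothetical and constructed only to illustrate the argument; the Mantel-Haenszel estimator is used here as one simple form of stratified analysis, not as the method of any cited study.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for a 2 x 2 table.

    a: carriers with trait       b: carriers without trait
    c: noncarriers with trait    d: noncarriers without trait
    """
    return (a * d) / (b * c)


def pooled_table(strata):
    """Collapse per-stratum tables into one table, ignoring ancestry."""
    return tuple(sum(cells) for cells in zip(*strata))


def mantel_haenszel_or(strata):
    """Mantel-Haenszel odds ratio, combining within-stratum comparisons."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den


# Hypothetical counts: 1,000 individuals per subpopulation.
# Subpopulation 1: marker common (60%), trait common (80%), independent.
# Subpopulation 2: marker rare (10%), trait rare (10%), independent.
STRATA = [(480, 120, 320, 80), (10, 90, 90, 810)]

for table in STRATA:
    print("within-stratum OR:", odds_ratio(*table))
print("pooled OR:", round(odds_ratio(*pooled_table(STRATA)), 2))
print("Mantel-Haenszel OR:", mantel_haenszel_or(STRATA))
```

With these constructed counts, the odds ratio is 1 within each subpopulation and in the stratified analysis, yet pooling the two groups yields an odds ratio of about 5. The spurious association arises entirely from the marker tagging ancestry.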
In conclusion, many studies have demonstrated statistical associations between HLA class I and II molecules and susceptibility to a range of complex infectious diseases. Most of these studies are inconclusive. The major work that lies ahead is to provide conclusive genetic and functional data to support a role for HLA in infectious disease susceptibility. The tools are available, and the costs are coming down. The decade ahead should see a complete reevaluation of the true impact of HLA/MHC genes on susceptibility to infectious diseases.
Data for this review were identified by searches of PubMed, Medline, Current Contents, and references from relevant articles; numerous articles were identified through searches of the extensive files of the authors. Search terms included combinations of “susceptibility,” “genetic susceptibility,” “HLA,” “MHC,” “polymorphism,” “haplotype,” “genetic association,” “genetic linkage,” “infection,” and “infectious diseases,” as well as terms for specific infections (e.g., “malaria”) and genes and gene products (e.g., HLA-DR, TNF, LTA, MICA, MICB, C4, C4B, and HSP70) (for example, PUBMED Search Term = HIV AND polymorphism NOT drug; Field: Text Word, Limits: Humans). No date or language restrictions were set in these searches.
J.M.B. acknowledges long-term support from the Wellcome Trust to carry out studies of genetic susceptibility to infectious disease.
The authors have no conflicts of interest to declare.
Jenefer M. Blackwell (professor; B.Sc. First Class Honors Zoology, Ph.D. Population Genetics, University of Western Australia) has a long-standing interest in genetic epidemiology of infectious disease. She transitioned from postdoc to Reader at the London School of Hygiene and Tropical Medicine (1975 to 1991) before being recruited to the Glaxo Chair in Molecular Parasitology at Cambridge University. She raised funds to build, and was Founding Director of, the Cambridge Institute for Medical Research (CIMR). She returned to Western Australia in 2007 to establish the Division of Genetics and Health, Telethon Institute for Child Health Research (TICHR). She remains an Affiliated Principal Investigator at CIMR, and is a Principal Investigator on the Wellcome Trust Case Control Consortium to undertake genome-wide studies of visceral leishmaniasis from Brazil, India, and the Sudan. At TICHR, she works with Drs. Jamieson and Burgner to develop a resource for genome-wide association studies of otitis media in Western Australian children.
David Burgner (associate professor; B.Sc. [Hons.], M.B. Ch.B., University of Bristol; D.T.M.&H., Liverpool University; M.R.C.P., M.R.C.P.C.H., F.R.A.C.P., Ph.D., Oxford University) has trained in pediatric infectious diseases in the United Kingdom and Australia. He has a long-standing interest in susceptibility to childhood and neonatal infection and is based at the School of Paediatrics and Child Health, University of Western Australia. He leads genomic and epidemiological studies of Kawasaki disease, a pediatric vasculitis with an unknown infectious trigger, including the first genome-wide association study of an infectious disease, and he has been instrumental in establishing a global International Kawasaki Disease Genetics Consortium. He is also involved in studies of innate immunity and infectious disease susceptibility in newborn infants and, with the coauthors of this review, in genomic studies of otitis media, one of the commonest infections of childhood.
Sarra E. Jamieson (B.Sc. [Hons.] Genetics, University of Leeds; M.Sc. Medical Genetics, University of Newcastle; Ph.D. Human Genetics, University of Cambridge) has an ongoing interest in genetic epidemiology of infectious disease susceptibility. During postdoctoral work in the Genetics and Infection Laboratory at the Cambridge Institute for Medical Research, she undertook research into human genetic susceptibility to various infectious diseases, including tuberculosis, leprosy, and congenital toxoplasmosis. In 2007, she moved to the Telethon Institute for Child Health Research, where she is currently a Research Fellow within the Division of Genetics and Health and an Adjunct Lecturer at the University of Western Australia. Her research interests focus on the genetic and epigenetic mechanisms underlying common childhood infectious diseases, including congenital toxoplasmosis and, with Professors Blackwell and Burgner, on recurrent acute otitis media.
†Supplemental material for this article is available at http://cmr.asm.org/.