A genetic basis for atopic dermatitis (AD) has long been recognized. Historic documents allude to family history of disease as a risk factor. Prior to characterization of the human genome, heritability studies combined with family-based linkage studies supported the definition of AD as a complex trait, in that interactions between genes and environmental factors and the interplay between multiple genes contribute to disease manifestation. A summary of over 100 published reports on genetic association studies through mid-2009 implicates 81 genes, 46 of which have demonstrated at least one positive association with AD. Of these, the gene encoding filaggrin (FLG) has been most consistently replicated. Most candidate gene studies to date have focused on adaptive and innate immune response genes, but there is increasing interest in skin barrier dysfunction genes. This review examines the methods that have been used to identify susceptibility genes for AD, and how the underlying pathology of this disease has been used to select candidate genes. Current challenges and the potential impact of new technologies are discussed.
Over 2,500 years ago, Hippocrates described a condition of an undetermined cause characterized as “itching over [a patient’s] whole body”1. During the 18th and 19th centuries, what was believed to be this non-descript pruritic state was termed ‘eczema’, or “prurigo diathésique”2. The condition received increasing attention in the dermatologic literature in the late 19th and early 20th centuries, especially as a component of the allergic diathesis, as Coca noted that “every recognized category of allergic disease affects the skin”, suggesting that, “among the diseases of the skin the familial allergy (hay fever-asthma group) is represented by atopic eczema”3. The term was ultimately replaced with the descriptive term ‘atopic dermatitis’ by Coca, Sulzberger and Wise3.
A common theme from the most ancient descriptions of this syndrome to more recent observations is that allergy in general, and atopic dermatitis (AD) in particular, has a profound capacity to run in families. That is, AD, even more so than other atopic disorders, is highly ‘heritable’. For example, in the Munich Asthma and Allergy Study, it was demonstrated that the risk of a child developing AD if one or both parents have AD is higher (OR=3.4, 2.6–4.4) compared to the risk if one or both parents have asthma (OR=1.5, 1.0–2.2) or allergic rhinitis (OR=1.4, 1.1–1.8)4, supporting the notion that, although generic ‘atopy’ genes may be responsible for manifestation of AD, there are likely to be phenotype-specific genes as well. Early observations of a strong family history of allergy, eczema and asthma5, 6 contributed not only to an understanding of at least one of the underlying ‘causes’ of AD – that is, an allergic response to “environmental inhalant factors” in addition to “hereditary factors”5 – but also promoted the next stage of heritability investigations. A number of twin studies reported widely varying concordance rates, ranging from 0.23–0.86 for monozygotic twins and 0.15–0.5 for dizygotic twins7–11. These wide ranges can probably be explained in part by heterogeneity of the phenotype and phenotype definition across studies, but, as suggested by the relatively low concordance in some studies, even for identical twins, the environment also influences disease risk and manifestation. Parent-of-origin effects, or maternal heritability, have also been attributed to AD12, an observation which has subsequently been supported by genetic association studies (i.e., maternally transmitted alleles in the SPINK5 gene13).
Given the long-standing recognition of the role of heritability in the manifestation of AD and the characteristic early age of disease onset, AD is a trait highly amenable to the genomewide linkage approach for identifying novel genes. The genomewide linkage method is a family-based approach relying on a collection of affected individuals (probands) and their parents (i.e., case-parent trio design) or affected sibling pairs and their parents (and often additional family members), for which the inheritance pattern of a trait is compared with the inheritance pattern of chromosomal regions using highly polymorphic, genetic (‘microsatellite’) markers evenly spaced across all chromosomes (Figure 1, panel A). Genomewide linkage mapping has several advantages. It is ‘hypothesis-independent’, because the entire genome is scanned without regard to specific candidates. Owing to the highly polymorphic nature of microsatellite markers, it is also cost-effective, as it requires a relatively small number of markers (~350); as a result, a significant linkage peak requires a much less stringent correction of the p value (e.g., a lod score of 3.6, or P=2×10⁻⁵ (Ref. 14)) compared to study designs involving thousands or tens of thousands of markers. Because linkage cannot detect genes with minimal or modest disease effect, linkage peaks that reach statistical significance are generally indicative of a locus (loci) with substantial effect on disease risk.
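The correspondence between a lod score of 3.6 and a pointwise P value of about 2×10⁻⁵ can be checked with a short calculation. The sketch below is illustrative only and rests on a common approximation that is not spelled out in the text: under the null hypothesis of no linkage, 2·ln(10)·LOD is treated as a chi-square statistic with one degree of freedom, and the test is one-sided. The function name `lod_to_p` is hypothetical.

```python
import math


def lod_to_p(lod: float) -> float:
    """Approximate one-sided pointwise p-value for a LOD score.

    Assumption: 2*ln(10)*LOD follows a chi-square distribution with
    1 degree of freedom when there is no linkage.
    """
    chi2 = 2.0 * math.log(10.0) * lod
    # Upper tail of chi-square(1) is erfc(sqrt(x/2)); halve it for a one-sided test.
    return 0.5 * math.erfc(math.sqrt(chi2 / 2.0))


# A lod score of 3.6 maps to a p-value on the order of 2e-5.
print(f"LOD 3.6 -> p = {lod_to_p(3.6):.1e}")
```

Under these assumptions the conversion agrees with the quoted threshold to within rounding; other linkage statistics or two-sided conventions would give somewhat different values.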
To date, there have been five genomewide linkage studies performed on AD, plus a genomewide linkage screen originally designed for asthma with analyses repeated for the AD outcome (Figure 2). All but one of these screens were performed on families of European ancestry: (i) 199 German and Scandinavian15; (ii) 148 British16; (iii) 109 Swedish17; (iv) 100 Danish18; and (v) 295 French19 families, of which 62 affected sib-pairs for AD were available for re-analysis. The non-European study was performed on 77 Japanese families selected through 111 sib-pairs with AD (287 individuals) and relied on a linkage mapping panel of 5,861 single nucleotide polymorphisms (SNPs) rather than a microsatellite panel traditionally used for linkage screens20.
Underscoring the heterogeneity commonly observed in complex diseases, which, in addition to genetic heterogeneity, reflects heterogeneity of non-genetic factors, including differences in family ascertainment schemes, definition of the phenotype, and analytical approaches, there has been limited overlap of signals among these genomewide linkage studies. Only the 3p24 locus has truly replicated, with significant LOD scores observed for microsatellite markers in chromosome 3p24-p22 in the Swedish17 families and chromosome 3p26-p24 in the Danish18 families. Under a more relaxed threshold of a maximum distance of 25 centimorgans (cM) between linkage peaks, replication can be considered for the chromosome 3q13–q21 locus in the German/Scandinavian15 and Swedish17 samples, and chromosome 18q11–q21 in the Danish18 and Swedish17 samples.
In the French study, the linkage screen was originally designed to study asthma and allergic rhinitis in a sample of 295 families ascertained through asthmatic probands, but analyses were repeated for the outcome ‘eczema’, and demonstrated linkage at 5q13 and 11p14. Follow-up fine-mapping of eight markers in the 11p14 locus suggested a pleiotropic effect for the three allergic diseases, AD, asthma and allergic rhinitis. More recently, the Danish team followed up on their previous evidence for linkage at the 3p, 4q and 18q loci, in addition to previous evidence for linkage at 3q, in an independent sample of 130 AD sib-pair families, and concluded the strongest evidence for linkage was to 3p34, 3q21, and 4q2221. From these follow-ups, however, conclusions are speculative at best, given that each of the regions where significant evidence of linkage has been identified contains multiple candidate genes. The best evidence for linkage in any one of these regions tends to extend over relatively large portions of the chromosome, rendering pinpointing of any specific locus – or gene – very difficult.
Although the genomewide linkage approach represented one of the most sophisticated technologies in genetic epidemiology just over a decade ago, inherent challenges in this approach, including the considerable cost of follow-up genotyping of large, unwieldy chromosomal loci and the difficulties in ascertaining sufficient numbers of complete families, have been, in part, an impetus for pursuing alternative approaches in gene hunting. After an initial genomewide linkage analysis, positional cloning usually follows, whereby extra microsatellite markers at a density of 0.5–1.5 cM are genotyped over the linkage peaks that are suggestive or significant in the initial scan, until the precise locus contributing to linkage is identified. However, even at this level of fine mapping, with an aim of localizing a gene to a region <1 cM (1 cM ≈ 1 million base pairs), the region of peak linkage score may still include hundreds of genes. In none of the AD genomewide linkage studies illustrated in Figure 1 was a candidate gene identified using positional cloning.
With the publication of initial efforts in sequencing the human genome22 (http://www.ornl.gov/sci/techresources/Human_Genome/home.shtml), the opportunity to genotype markers directly in genes of interest was greatly expanded as polymorphisms were identified in the ~20,000–25,000 genes across the 3 billion chemical base pairs that make up human DNA. Relying upon one of the simplest of these polymorphisms, single nucleotide polymorphisms (SNPs), and relatively simple structural variants, such as insertions/deletions and repeats, this advancement allowed researchers to expand genetic studies beyond linkage toward the genetic association study design (Figure 1, panel B). Because many more biallelic markers are required compared to microsatellites to detect linkage and association across the genome, a candidate gene approach was adopted whereby investigators have focused on a specific gene or set of genes believed to be causally involved in the underlying pathology of a certain disease. An advantage of the candidate gene approach is that it is not limited to families, and can be applied to case-control study designs, which possess certain advantages over family-based studies. For example, although association studies based on case-control designs are sensitive to confounding due to population stratification (i.e., ethnic/racial admixture)23, they are generally considered to be more powerful in detecting true associations once the gene has been identified24. Moreover, it is considerably easier (requires less effort, lower costs in recruitment) to amass reasonably powered groups of AD patients and healthy controls without AD than it is to collect complete case-parent trios or nuclear families.
For these reasons, there has been considerable effort in conducting association studies on AD and related phenotypes in independent populations in the post-genomewide linkage era. In a search of the public database (http://www.ncbi.nlm.nih.gov/pubmed/) through June 2009, using keywords including ‘association’, ‘atopic dermatitis’, ‘eczema’, ‘gene’, ‘polymorphism’, ‘mutation’, and ‘variant’, 111 studies were identified for which results of tests for association for AD were reported on a candidate gene (Table E1 in this article’s Online Repository at www.jacionline.org). The major outcome was limited to AD as a qualitative trait, or AD severity. A significant association was defined liberally as any association at P<0.05 and without regard to effect size or weaknesses in the study design. Limitations in this exercise include the sometimes missing information on the precise variant for which association was (or was not) demonstrated, and the unfortunate reality that negative association studies are rarely published and are therefore underrepresented in Table E1. In most instances, the only information available on genes for which association was sought but for which the result was negative are those studies for which the negative results are included in manuscripts reporting positively associated genes/variants. In other words, there are undoubtedly more genes for which associations have been tested but failed but for which the information is not available in the public arena.
It is critical to note that there are several biases in this summary. First, there is the bias of reporting associations in studies for which either Type I error (false positive) or Type II error (false negative) is possible (and in many cases, likely) due to limited power in the original study design. Other biases are related to potential study design weaknesses, such as a failure to adjust for population stratification in the case-control studies and consideration of Hardy-Weinberg proportions. There is a general assumption that, for risk variants with common allele frequencies of greater than 20%, the odds ratios (ORs) will range from 1.1–1.5, and for rarer risk allele frequencies (i.e., <0.20), the ORs can be ~3.0. Based on these assumptions, there is a requirement of at least 1,000 cases and 1,000 controls to detect ORs~1.5 with at least 80% power, although the required sample size is dependent upon multiple, additional factors. Regardless, according to the summary in Table E1, in barely two dozen of the reported studies have sample sizes reached even 250 in each group. Another important consideration is the extreme heterogeneity of the phenotype definition from study to study, wherein some studies consider early onset AD versus studies that combine pediatric populations with adult populations, variations in means for determining disease severity, and so on. Variation in the phenotype renders replication of associations especially difficult.
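To make the dependence of power on sample size, allele frequency and effect size concrete, the following is a minimal sketch of an approximate power calculation. It assumes a simple allelic test (a two-proportion z-test on allele counts), Hardy-Weinberg proportions, a two-sided α of 0.05 and no correction for multiple testing; none of these choices is taken from the studies summarized here, and the function names are hypothetical.

```python
from statistics import NormalDist


def case_allele_freq(control_freq: float, odds_ratio: float) -> float:
    """Risk-allele frequency in cases implied by an allelic odds ratio."""
    odds = odds_ratio * control_freq / (1.0 - control_freq)
    return odds / (1.0 + odds)


def allelic_test_power(n_cases: int, n_controls: int, control_freq: float,
                       odds_ratio: float, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided two-proportion z-test on allele counts."""
    p0 = control_freq
    p1 = case_allele_freq(p0, odds_ratio)
    # Each individual contributes two alleles.
    se = (p1 * (1 - p1) / (2 * n_cases) + p0 * (1 - p0) / (2 * n_controls)) ** 0.5
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # The negligible contribution of the opposite tail is ignored.
    return NormalDist().cdf(abs(p1 - p0) / se - z_crit)


for n in (250, 1000):
    for odds_ratio in (1.2, 1.5):
        power = allelic_test_power(n, n, control_freq=0.20, odds_ratio=odds_ratio)
        print(f"n={n} per group, OR={odds_ratio}: power ~ {power:.2f}")
```

Because the result shifts markedly with the assumed allele frequency, α and genetic model, this is best read as a way to explore the trade-offs rather than as a reproduction of the sample-size figure quoted above.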
A critical bias in this exercise is that replication was considered at the level of the gene rather than the specific variant/mutation. Elsewhere there has been extensive discussion regarding the merit of this approach, including the comprehensive review of asthma genetics by Ober and Hoffjan23, in which the authors refer to advocacy of a gene-centric approach because the more conventional SNP-for-SNP approach is predicated on the assumption that allele frequencies and haplotype structure at a specific locus is identical in two (or more) populations, which is unlikely25. Elsewhere, the pitfalls of this “loose” replication have been cautioned26. However, a major outcome from the International Haplotype Map (HapMap) project27 is the observation that a large portion of the human genome is arranged into blocks of common polymorphisms (SNPs) in strong linkage disequilibrium (LD) with one another, and there is considerable diversity within these haplotype blocks. As much of this diversity is driven by mutation, and given our knowledge that the functional properties of protein products (e.g., candidate genes) can depend on specific combinations of multiple polymorphisms within a gene, or interactions with polymorphisms in other genes28, it is not surprising that a single-locus approach may fail to detect association a priori, much less fail to replicate across studies in different sets of populations. 
The recent report by Rogers and colleagues, wherein they combined SNP data from the Ober and Hoffjan review, supplemented by their own review of the asthma genetics literature, with SNP data from their genomewide association study (GWAS) on over 1,000 asthmatic children and family members in the CAMP study and >500,000 SNPs, underscores the unlikely success in replication at the SNP level, as the group only identified 10 significantly associated SNPs in six genes that were on the commercial SNP chip and that overlapped with 160 SNPs in 39 genes previously associated with asthma in the literature29.
Finally, another area of potential controversy has to do with common versus rare variants. A perspective on the scientific value of GWAS30 suggested that (assuming currently available commercial chips capture the bulk of common genetic variation) SNPs with large effects may have already been discovered (implying there is no need for additional genomewide scans), and the focus should therefore shift towards a detailed search for rare variants. Although the application of both conventional candidate gene studies and GWAS has led to the discovery of hundreds of loci conferring risk of common diseases, it has been noted that the risk variants identified by GWAS in particular explain only a small fraction of the disease-specific heritability. One explanation that is beginning to emerge is that much of the remaining heritable disease risk is associated with rare variants31 (usually defined as those with a frequency below 5%, or in stricter definitions 1% or lower). For example, rare alleles of at least 10 loci contribute to risk for breast cancer, and more than 40 loci underlie risk of type 1 diabetes. Importantly, in contrast to common variants, rare nonsynonymous variants in the human genome may be deleterious and therefore of significance because they influence protein function and/or phenotypic variation. Most of these variants are probably only mildly deleterious, segregate at low frequencies, and are not pathogenic. However, a subset of these variants have more modest effects and are rare, but could collectively explain a substantial fraction of the heritable variability of common disease phenotypes.
While rare coding variants may have a greater functional impact than common variants, their analysis must consider the low frequency of any single variant, since it reduces the power to infer statistical associations and therefore requires very large populations of cases and controls or affected families. This complication can be overcome by evaluating the collective frequency of rare, nonsynonymous variants within one or more genes, or within a pathway(s), together with the functional impact of the discovered variations. As described further below, however, with the exception of studies focused on mutations in the gene encoding filaggrin (FLG), few if any other studies in AD have specifically focused on rare variants or have been designed with sufficient power to detect them.
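One simple way to evaluate the collective frequency of rare variants is a collapsing (or burden) comparison: each individual is scored as a carrier if he or she has at least one rare allele anywhere in the gene, and carrier counts are then compared between cases and controls. The sketch below is a hypothetical illustration of that idea, not a method used in any of the studies reviewed; the genotype data are invented, and a one-sided Fisher exact test is assumed as the comparison.

```python
from math import comb


def count_carriers(genotypes):
    """Number of individuals carrying at least one rare allele at any site."""
    return sum(1 for person in genotypes if any(count > 0 for count in person))


def fisher_one_sided(a: int, b: int, c: int, d: int) -> float:
    """P(X >= a) for the 2x2 table [[a, b], [c, d]] under the hypergeometric null.

    a, b = carrier / non-carrier cases; c, d = carrier / non-carrier controls.
    """
    n_cases, n_controls, carriers = a + b, c + d, a + c
    total = comb(n_cases + n_controls, carriers)
    upper = min(n_cases, carriers)
    tail = sum(comb(n_cases, x) * comb(n_controls, carriers - x)
               for x in range(a, upper + 1))
    return tail / total


# Invented data: rows are individuals, columns are rare-allele counts at three sites.
cases = [[0, 1, 0], [1, 0, 0], [0, 0, 0], [0, 0, 1], [0, 1, 0], [0, 0, 0]]
controls = [[0, 0, 0], [0, 0, 0], [0, 0, 1], [0, 0, 0], [0, 0, 0], [0, 0, 0]]

a, c = count_carriers(cases), count_carriers(controls)
p = fisher_one_sided(a, len(cases) - a, c, len(controls) - c)
print(f"carriers: {a}/{len(cases)} cases vs {c}/{len(controls)} controls, p = {p:.3f}")
```

Collapsing in this way trades resolution for power: it cannot say which variant matters, and it implicitly assumes that the rare variants being pooled act in the same direction.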
Despite the biases and potential pitfalls described above, several interesting conclusions can be inferred from this comprehensive review of the literature on the genetics of AD. In the 14 years since the first published report on an association between a genetic variant and AD, more than a third of the studies have been reported in the past two and a half years, and already in the first half of 2009, more reports have been published than in all of 2008. From these 111 published studies, there are reports on 81 genes, of which more than half (46 genes) had at least one positive association study reported (Figure 3). Of these 46 genes, 15 studies failed to replicate associations, and 13 were positively associated in at least one other independent study. One of these genes, the gene encoding filaggrin (FLG), has been associated with AD in 20 different reports (details below). There are 35 additional genes studied for which there has been no evidence for a positive association to date.
The well-established co-morbidities of the other two allergic diseases that constitute the atopic triad (i.e., asthma, allergic rhinitis), in addition to the sensitization to common aeroallergens that is commonly observed in patients with AD, have guided researchers toward selection of candidate genes that fall within the broad category of dysregulation of the adaptive and innate immune response and a heightened IgE-mediated, systemic Th2 response, plus a combined Th1 and Th2 response in the skin. This selection bias is evident in Table E1.
For complex diseases such as AD that involve a dense network of immune response proteins, it is anticipated that many genes will be involved and multiple genetic variants will contribute to the alteration of gene function and expression. Because the effect of a single gene/polymorphism in a complex disease such as AD is anticipated to be relatively modest, it can be assumed that variants in multiple genes will cooperate in an additive or synergistic manner to impact disease risk, a phenomenon referred to as ‘epistasis’. A major difficulty in testing for epistasis is power: not only will fewer individuals in a sample possess both polymorphisms (or a set of polymorphisms) associated with risk of disease compared to the number of individuals with only one of the polymorphisms, but the correction factor for additional (multiple) comparisons (i.e., tests for association) is quite large. Consider the example of 100,000 genetic markers, wherein there are a total of 5×10⁹ two-locus combinations, requiring a (Bonferroni) corrected threshold of P=1×10⁻¹¹ for a GWAS significance level of 0.05 (Ref. 32).
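The arithmetic behind the 100,000-marker example can be reproduced in a few lines. The sketch below assumes that every unordered pair of markers is tested once and that a plain Bonferroni correction is applied to a nominal significance level of 0.05; the variable names are illustrative.

```python
import math

n_markers = 100_000
alpha = 0.05

# Number of unordered two-locus combinations: n * (n - 1) / 2.
n_pairs = math.comb(n_markers, 2)

# Bonferroni-corrected per-test threshold.
threshold = alpha / n_pairs

print(f"two-locus combinations: {n_pairs:.2e}")
print(f"corrected threshold:    {threshold:.2e}")
```

The pair count grows with the square of the number of markers, which is why exhaustive pairwise testing becomes so costly in power as marker panels get denser.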
One approach toward characterizing potential gene-gene interactions and systematically evaluating the role of candidate genes/polymorphisms in AD susceptibility for which there is compelling evidence for association is to implement the program Ingenuity Pathways Analysis (Ingenuity Systems: www.analysis.ingenuity.com). The Ingenuity Pathways knowledge base is a web-based entry tool developed by Ingenuity Systems Inc. to characterize genes according to the pre-defined canonical pathway(s) into which they fit, and also to investigate the extent to which genes are in shared networks and may cooperate in a synergistic or an additive manner to impact risk of disease. As a proof of concept, we evaluated the 81 genes summarized in Figure 3 using the Ingenuity Pathway Analysis. Slightly more than half (N= 48) of the 81 genes studied to date clustered into two major networks, both of which are associated with immune dysregulation, specifically the pathway associated with antigen presentation and cell-mediated and humoral immune response, and the pathway associated with cell signaling and interaction, cellular movement, and hematological system development and function. Six genes (CD14, GATA3, IL4, IL18, NOD1, TLR2) which have previously been significantly associated with AD were clustered into the antigen presentation and immune response pathway, and nine previously associated genes (BCL2A1, BDNF, CCL5, CSF2, GSTP1, IL5, IL12B, IL12RB1, SOCS3) were clustered into the cell signaling/movement pathway. Although the studies for which these candidate genes were evaluated did not specifically test for gene-gene interaction, this interrogation of the potential for interaction serves as an example of the power of this approach in selecting optimal candidates for genetic association studies.
In addition to immune dysregulation manifested as IgE-mediated sensitization to numerous allergens, AD is also characterized as a common, chronic pruritic, inflammatory skin disease complicated by recurrent bacterial and viral skin infections (i.e., Staphylococcus aureus and herpes simplex virus)33, 34. More seriously, patients with AD are at greater risk for developing severe and generalized viral infections caused by herpes simplex virus (e.g., eczema herpeticum), molluscum contagiosum virus (e.g., eczema molluscatum), and eczema vaccinatum, which occurs after exposure to the smallpox vaccine35.
Increased susceptibility to infections and cutaneous colonization implicates several of the immune function genes listed in panel A of Figure 4, specifically those genes associated with a dysfunctional host defense – or innate immune – response. These candidates include the pattern-recognition receptors (PRRs) type I transmembrane, toll-like receptors (TLRs)36, the nucleotide-binding oligomerization domain (NOD)-leucine rich containing protein family (NOD1, NOD2)37, 38, and CD1439–41. Antimicrobial peptides, including S100 proteins, human defensin-α and –β, and sphingosine exert potent antimicrobial activity by directly killing bacteria, fungi, and certain viruses42. Natural killer (NK) cells, a critically important population of lymphocytes for innate immune responses against virus infection43, are dependent upon transcription factors such as IFN regulatory factor-2 (IRF-2) for efficient cell development (for an extensive review on innate immunity in AD, see Refs. 44 and 45). Genetic association studies on AD to date in fact support a number of these candidates (i.e., TLR2, NOD1, NOD2, CD14, DEFB1; Figure 3). In addition to the direct effect of genetic modifications in innate immune response molecules and their role in susceptibility to infection, the attenuation of upregulation of the normal antimicrobial response to bacterial and viral stimuli because of an overabundance of Th2 cytokines in the skin appears to be especially relevant in AD46, 47. For example, it has been demonstrated that elevated Th2 cytokine levels can inhibit mobilization of potent innate immunity molecules, such as human beta-defensin-3, in epidermal keratinocytes48. Alternatively, persistent S. aureus infections can also mediate inflammatory cascades by staphylococcal toxins acting in a superantigen-driven fashion to activate T cells49 or by induction of a state of glucocorticoid resistance50. 
This observation further underscores the potential relevance of gene-gene interactions, as illustrated in panel A of Figure 4.
Notably, the candidates described above and summarized in Figure 4 as well as Table E1 are not all unique to AD; in fact, many AD candidate genes overlap not only with other atopic phenotypes (e.g., asthma, allergic rhinitis), but also with other diseases of inflammation and immune dysfunction. The common disease/common variant hypothesis41 has been put forth as one explanation for why many complex diseases are so common and why disease-associated variants occur at such high frequency in the population. The ‘common variant/multiple disease’ hypothesis, which is an extension of the common disease/common variant hypothesis, suggests that certain disease genes may not be disease specific, and may contribute to related clinical phenotypes51. One possibility is that the functional effects of certain alleles manifest in multiple disorders, presumably because they are involved in basic underlying immune regulatory pathways. Examples of genetic association of the same gene (or allele) with diverse but related disorders abound (see the Genetic Association Database at http://geneticassociationdb.nih.gov). For example, in addition to AD, genetic associations have been observed between DEFB1 and asthma52, COPD53, and infectious disease including HIV54 and sepsis55. NOD2 polymorphisms, in addition to AD, have also been associated with Crohn’s Disease56 and sarcoidosis57. Associations with RANTES polymorphisms are especially diverse, including phenotypes associated with immune response, infection, reproduction, and metabolic disorders (http://geneticassociationdb.nih.gov/). Given the prominent role that certain pathways, such as host defense, play in susceptibility to AD, it is likely that many more co-associations will be observed in the near future.
It is increasingly appreciated that both genetic and environmental factors that affect skin barrier function contribute to AD susceptibility58, and that barrier dysfunction is an essential feature of AD and allergic diseases in general59–63. A disrupted barrier would allow penetration of microbes and allergens and other environmental insults such as toxins, irritants and pollutants with consequences including inflammation, allergen sensitization and bacterial colonization. This may explain why 55%–90% of patients with AD are colonized with S. aureus compared to only 5% of individuals without AD64–68. Although the epidermis functions as the primary defense to the external environment, considerable barrier function is regulated by the stratum corneum (SC) and, below the SC, tight junctions (TJ), which reside at the level of the stratum granulosum. When the SC is compromised, either by reduced levels of SC lipids29–31, mechanical trauma resulting from extensive scratching that is precipitated by intensive itch (the hallmark of AD), or as a result of genetic defects in SC proteins (i.e., filaggrin), tight junctions are the next line of defense. Currently there is considerable interest in the more than 40 proteins that comprise TJ, which include the claudin family members, occludin, cingulin, tricellin, and the cytoplasmic plaque proteins69 (which bind to actin and myosin70), and their role in human disease in general71, and specifically AD.
As described earlier, linkage screens performed on AD to date have not elucidated specific candidate genes per se, but they have implicated loci harboring clusters of genes associated with skin barrier dysfunction. Specifically, one of the earliest screens indicated linkage at the epidermal differentiation complex (EDC) locus on chromosome 1q2116, which contains a very large and diverse family of genes associated with skin barrier dysfunction, including loricrin (LOR), involucrin (IVL), members of the S100 gene family, a large group of the late cornified envelope gene family, many of the small proline-rich proteins, peptidoglycan recognition proteins (i.e., PGLYRP3, PGLYRP4) and, most notably filaggrin (FLG)72. Linkage has also been reported in 17q2117, where the gene encoding one of the keratins (KRT16) is localized.
Association studies on genes related to the EDC cluster and other barrier dysfunction candidates have, to date, been largely restricted to the gene encoding filaggrin (FLG), also known as filament-aggregating protein, and within FLG, most associations have been limited to two null mutations (R501X and 2282del4). In fact, FLG is the most consistently associated gene with risk of AD73; as illustrated in Figure 3, by mid-2009, there were 20 positive reports on genetic associations between FLG mutations and AD. The gene encoding human filaggrin was first cloned in 1989, when it was found to contain numerous tandem filaggrin repeats, localized to chromosome 1q21, and, because of its tight regulation at the transcriptional level in terminally differentiating epidermis, it was postulated to be an important candidate for disorders of keratinization74. It was subsequently evaluated for its function in the formation of the stratum corneum, and found to be a critical protein involved in epidermal differentiation and in maintaining barrier function75. McLean and colleagues developed long-range PCR conditions for a 12-kb genomic fragment covering exon 3 of FLG, and identified a homozygous nonsense mutation (R501X) near the start of repeat 1 and a second mutation (2282del4) that similarly stops protein translation within the first filaggrin repeat in three patients with ichthyosis vulgaris76. They demonstrated that these relatively rare mutations (a combined allele frequency of ~4% in the European population studied) are semidominant, with heterozygotes exhibiting mild disease with incomplete penetrance. Shortly thereafter, the same group showed that these two loss-of-function variants were associated with AD77, and that they are ancestral European variants carried on conserved haplotypes 78.
Full sequencing of the FLG gene by the same team has revealed multiple, additional polymorphisms with varying frequency across ethnic groups78; however, with a combined allele frequency among AD patients of 18% and 48% for the R501X and 2282del4 mutations, respectively, the two null mutations represent the strongest and most compelling genetic risk factors for AD. In the largest meta-analysis performed so far on the R501X and 2282del4 mutations, Rodriguez and colleagues79 analyzed data from 24 independent studies, which included 6,448 cases, 26,787 controls, and 1,993 families (all selected for AD), and determined that the effect size for risk of eczema due to the two FLG null mutations is not dissimilar to previous reports, at an odds ratio of just over three.
The FLG mutations have also been consistently associated with risk of other atopic traits, including asthma, hay fever, rhinoconjunctivitis, and allergen-specific IgE79–86. Recently, Gao and colleagues (unpublished data) evaluated whether FLG polymorphisms contribute to the serious complication of AD resulting from disseminated cutaneous HSV infections, eczema herpeticum, and determined that the frequency of the R501X mutation was three times higher (24% vs. 8%, respectively) and the relative risk for disease nearly double for eczema herpeticum compared to AD without eczema herpeticum (odds ratio [OR]=11.8 vs. 6.2; P=0.0008). The authors speculate that the relationship between FLG null mutations and disease is most likely related to an increased propensity to disseminated viral skin infection resulting from skin barrier dysfunction, rather than disease severity per se, and provide the first insight into the genetic underpinnings of AD complications such as viral dissemination.
A major limitation of the candidate gene approach is that selection of candidates is often based on limited knowledge87 and, moreover, each potentially causal variant at each candidate gene can only make a modest contribution to overall heritability. Data from the HapMap Project, combined with more accurate approaches in selecting tagging markers sufficiently dense to capture most of the common variation in the human genome (Figure 1, panel C), have recently allowed genomewide association studies (GWAS) to replace candidate gene studies as an unbiased approach to search for genes controlling risk of complex diseases. At the time of this review, results have been published on one GWAS for AD, in which investigators in Germany genotyped a set of AD cases and controls and an independent set of 270 nuclear families on the Affymetrix Human Mapping 500K and 5.0 arrays (resulting in 342,303 successfully typed markers)88. Although none of the markers analyzed were significantly associated with AD at the genomewide significance level (P<1.46×10⁻⁷), the group genotyped 54 SNPs associated with AD at a modest P-value in an independent German group and observed replication in markers in chromosomal regions 1q21, 9p21 and 11q13, with further replication in an additional European group for a marker (rs7927894) in an intergenic region near C11orf30 on chromosome 11q13.5, which is the same marker that has been associated with Crohn’s disease89. An additional European-based GWAS is in the replication phase. It will be of interest to determine the extent to which the findings of this first GWAS, by Esparza-Gordillo et al., are replicated.
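The genomewide significance level quoted above is numerically consistent with a Bonferroni correction of a nominal 0.05 error rate across the 342,303 successfully typed markers. The text does not state how the threshold was derived, so the choice of Bonferroni correction and of α=0.05 in the sketch below are assumptions made for illustration; the function name `bonferroni_threshold` is likewise hypothetical.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold that keeps the family-wise error rate at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be a positive integer")
    return alpha / n_tests


if __name__ == "__main__":
    # Assumed nominal alpha of 0.05; marker count taken from the text
    threshold = bonferroni_threshold(0.05, 342_303)
    print(f"{threshold:.2e}")  # prints 1.46e-07
```

A correction of this kind treats all markers as independent tests, which is conservative when neighboring markers are in linkage disequilibrium; this may partly explain why no single marker reached the threshold in the discovery stage.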
Expression profiling of all known genes in the human genome is an ideal strategy for characterizing disease mechanisms and defining the transcriptome of complex diseases such as AD. Indeed, genomewide microarray technology has the potential to identify molecular signatures of clinical disease that are impossible to identify using a gene-by-gene approach, and there are many examples in which this technology is highly predictive of clinical outcome. Relatively few genomewide expression studies have been performed for AD, likely because of the difficulties in ascertaining sufficient numbers of selected cell types of the epidermis and other relevant tissue samples, limitations related to the size of biopsied tissue, and challenges in ensuring collection of representative samples. Nevertheless, it is anticipated that integration of this technology into gene discovery for AD will increase, especially as the cost of performing genomewide microarray studies continues to drop.
Sugiura and colleagues90 performed high-throughput expression profiling of biopsies from skin lesions of AD patients compared to healthy controls, and observed that several of the most significantly differentially expressed genes (i.e., S100A8 and S100A7, upregulated; loricrin and filaggrin, downregulated) were epidermal differentiation genes localized in chromosome 1q21, a region previously linked with AD (note: results were similar in comparisons between affected and unaffected skin among the AD patients). Similar findings have been observed by Beck and colleagues, who compared nonlesional skin of extrinsic AD subjects to the skin of nonatopic healthy controls (personal communication; see Figure 5B). The Sugiura group also observed that keratin 16, localized on chromosome 17q21, another linkage hotspot, was upregulated. In addition to validating genes, high-throughput expression studies have led to novel discoveries. One example is the work by Lu and colleagues91, where they relied upon primary cultured keratinocytes from AD patients and healthy controls to identify a number of novel candidates. Significant findings included two extracellular matrix-associated factors, MMP1 and MMP10, for which ELISA studies of intrinsic AD patients’ serum showed that both proteins were up-regulated nearly two-fold compared to healthy individuals and extrinsic AD subjects. In a differentiated keratinocyte model, filaggrin 2 (or ifapsoriasin), also localized in the EDC, was identified, as well as three novel lipase genes, suggesting that these genes may also play a key role in the skin barrier and are worthy of further study92. Thus, high-throughput expression profiling has already proved useful in supporting the hypothesis that defects in epidermal genes play a critical role in the development of AD, and it is also a tool for validating genetic findings and gene discovery.
In the same year that Watson and Crick reported their structure of the DNA molecule in the journal Nature93, Senior Registrar at the Manchester and Salford Hospital for Skin Diseases, Dr. J.K. Morgan, made the following observation: “In theory, in every case in which there is an eczematous reaction in the skin, there are two prime factors involved. The first, and the more important, is a constitutional or internal factor, relating to the individual himself. The second is an external factor. It is upon the relative balance and interplay between these two that the appellation of the disease is based”94. In the years of heritability studies that followed (described above), the consistent observation that concordance rates for AD among monozygotic twins raised together are higher than among dizygotic twins supported the role of genetic etiology, but the relatively low concordance rates in both groups also supported the earliest suppositions that differences in exposure to certain environmental triggers may account for a considerable proportion of disease expression.
The challenge in the genetic epidemiology of AD in terms of interrogating gene-environment interactions is determining precisely which environmental factors should be considered. For example, at the core of the innate immune response are the ubiquitous fragments of bacterial LPS, or endotoxin, and many studies focusing on endotoxin exposure early in life suggest a ‘protective’ effect in the development of allergic disease in general95, 96 by skewing the Th profile toward Th1, as purported by the ‘Hygiene Hypothesis’97. Interestingly, there is some support for the role of the Hygiene Hypothesis in susceptibility to AD98. Alternatively, the impact of exogenous substances, such as irritants (e.g., soap and detergents), allergens (e.g., exogenous proteases derived from house dust mites), and drugs (e.g., topical corticosteroids) on AD patients with genetic alterations in skin barrier genes has also been considered58.
Very few of the genetic association studies summarized in Table E1 (Online Repository at www.jacionline.org) have evaluated associations with genetic polymorphisms in the context of exposure to specific environmental factors. Perhaps the best example is the study by Bisgaard and colleagues99, in which it was hypothesized that a compromised skin barrier among individuals with AD who are filaggrin-deficient may enhance the impact of exposure to certain aeroallergens, such as house dust mite and pet allergen. The group confirmed previous observations that the risk of developing eczema was considerably higher among children with the FLG mutations (hazard ratio100 [HR]=2.26, P=0.0005), and found that the risk increased further if children were exposed to cat allergen at birth (HR=11.11, P<0.0001). Important considerations in future, similar studies will be the ability to detect such interactions for variants with smaller effect sizes, the temporal relationship between environmental exposure and risk of disease (e.g., perinatal exposure versus exposure later in childhood), and the generalizability of such findings across populations.
As described previously, failure to replicate associations between genetic markers and a complex trait such as AD in independent populations can be due to several factors, including chance, misclassification of the phenotype, environmental heterogeneity, inadequate sample sizes, and population stratification101–103. Population diversity is another important factor contributing to failure to replicate associations. For example, certain genetic markers may contribute to disease risk in one particular (e.g., ethnic or racial) population but not in others, either because of differences in frequencies of the risk allele(s), or because of specific gene-gene interactions. It is difficult to evaluate the impact of ethnicity on genetic associations of AD, however, because there is relatively little diversity in the populations that have been studied thus far. As illustrated in Table E1 (Online Repository at www.jacionline.org), the overwhelming majority (N=70) of association studies have been performed in populations of European descent, followed by 42 studies performed in Asian populations. A single study was performed in a Mexican cohort, and none of the reported studies have been performed in groups considered underserved minorities, such as African Americans. Sadly, this distribution does not reflect the actual prevalence of AD: African Americans and Asian/Pacific Islanders reportedly suffer more from AD than U.S. whites104.
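The study counts quoted above can be converted into approximate shares with the short sketch below. It assumes that the three population categories named in the text are mutually exclusive and together account for all of the association studies in Table E1; neither point is stated explicitly, so the resulting percentages should be read as rough indications only. The function name `population_shares` is hypothetical.

```python
from typing import Dict


def population_shares(counts: Dict[str, int]) -> Dict[str, float]:
    """Convert per-population study counts into fractions of the total."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("at least one study is required")
    return {population: n / total for population, n in counts.items()}


# Counts as quoted in the text; treating them as disjoint is an assumption.
STUDY_COUNTS = {"European descent": 70, "Asian": 42, "Mexican": 1}

if __name__ == "__main__":
    for population, share in population_shares(STUDY_COUNTS).items():
        print(f"{population}: {share:.1%}")
```

Under these assumptions, about 62% of the studies were performed in populations of European descent and about 37% in Asian populations, with less than 1% in any other group.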
Perhaps the best example of a candidate gene for which ethnicity likely influences the extent to which a polymorphism confers risk is FLG. Palmer and colleagues77 observed differences in the R501X and 2282del4 FLG null mutation frequencies in diverse cohorts and suggested that different populations will have different FLG mutation profiles. Indeed, this group and others have demonstrated that, in populations where the R501X and 2282del4 mutations are not present, other mutations are prevalent and confer risk of AD, for example, the 3321delA and S2554X mutations among Japanese patients105. In unpublished data, our group observed a complete absence of the R501X mutation and very low frequency (1%) of the 2282del4 mutation among healthy African Americans, compared to a frequency of 9% of the R501X mutation and 0% of the 2282del4 mutation among AD patients. The low frequency, and even absence, of this mutation is not novel; elsewhere the prevalence of the R501X mutation among individuals without AD has ranged from 0.8% to 3.0% among European populations, and has been found to be absent in southern European (e.g., Italian106) and Asian77, 105 groups. In the only summary data available on frequency of this mutation in African populations, it was absent in a cohort of 124 North Africans77. Collectively, the distribution of these alleles suggests a latitude-dependent distribution with a decreasing north-south gradient of frequency, and suggests that polymorphisms other than the relatively common R501X and 2282del4 mutations may be more important in non-Northern European groups.
To recapitulate, AD is a chronic inflammatory disease of the skin characterized by dysregulation of the adaptive and innate immune response and a heightened IgE-mediated, systemic Th2 response. Extreme Th2 polarization and primary defects in the innate immune response, including epithelial barrier defects, in conjunction with mechanical damage to the epidermis as a consequence of the intense pruritus that is the hallmark feature of AD, likely contribute to the more severe sequelae including chronic bacterial colonization (i.e., S. aureus infection) and viral dissemination (i.e., eczema herpeticum). This scenario can be summarized in Figure 5, whereby the damaged epidermal surface is penetrated by a host of exogenous substances, including allergens, irritants, microbes, pollutants, and even topical drugs. The ‘brick wall-like’ structure of the stratum corneum (SC), which normally creates a barrier that maintains water within the body and prevents the entrance of pathogens and allergens, is further compromised. The epidermal differentiation complex (EDC), the DNA region where a large number of genes encoding many of the cornified cell envelope precursor proteins, small proline-rich proteins, members of the S100 family, and intermediate filament-associated protein precursors (i.e., profilaggrin), are localized, is an important target of candidate genes associated with barrier dysfunction at the level of the SC. The last line of defense is the stratum granulosum, containing TJs, proteins that constitute the “gate” to the passage of water, ions and solutes through the paracellular pathway. Systemically, a dysfunctional immune response resulting in an imbalanced innate and adaptive milieu further aggravates the system. Early genomewide linkage studies, association studies, and high-throughput expression profiling studies have supported the role of skin barrier dysfunction candidate genes in conjunction with innate and adaptive immune response genes. 
A comprehensive evaluation of all candidate gene studies published to date on AD illustrates the importance of both sets of genes, and a pathway analysis of the genes studied so far supports a more thorough approach toward gene-gene and gene-environment interactions.
There are considerable challenges in the field: expanded analyses of skin barrier dysfunction genes, to include not only those residing in the stratum corneum, but also tight junction (TJ) genes at the level of the stratum granulosum; an expanded focus on rare, in addition to common, variants; expansion of population studies to include more ethnically diverse groups that are adequately powered; and a better integration of higher-throughput technology, in addition to analyses that consider the interactive effects of common environmental factors. Each of these efforts will require greatly expanded sample sizes of carefully phenotyped patients and rigorous statistical approaches. None of these goals are achievable in the absence of a multidisciplinary approach, which will require the equal contributions of expert clinicians, geneticists, statistical analysts, and molecular biologists.
The author wishes to thank Drs. Li Gao, Peisong Gao, and Candelaria Vergara, Nicholas Rafaels, and Pat Oldewurtel for technical assistance, and Mr. Boyd Jacobson for his artistic contributions, as well as Drs. Donald Leung and Stephan Weidinger for invaluable discussions and comments. A special thanks to Dr. Lisa A. Beck, who shared critical preliminary data and contributed to important discussions. The author gratefully acknowledges the contributions of the Atopic Dermatitis Vaccinia Network (ADVN) in generating much of the data used in this review.
Funding Sources: This study was supported by the National Institutes of Health (NIAID: HSN266200400029C).