Rationale: Replication of gene-disease associations has become a requirement in complex trait genetics.
Objectives: In studies of childhood asthma from two different ethnic groups, we attempted to replicate associations with five potential asthma susceptibility genes previously identified by positional cloning.
Methods: We analyzed two family-based samples ascertained through an asthmatic proband: 497 European-American children from the Childhood Asthma Management Program and 439 Hispanic children from the Central Valley of Costa Rica. We genotyped 98 linkage disequilibrium–tagging single-nucleotide polymorphisms (SNPs) in five genes: ADAM33, DPP10, GPR154 (HUGO name: NPSR1), HLA-G, and the PHF11 locus (includes genes SETDB2 and RCBTB1). SNPs were tested for association with asthma and two intermediate phenotypes: airway hyperresponsiveness and total serum immunoglobulin E levels.
Measurements and Main Results: Despite differing ancestries, linkage disequilibrium patterns were similar in both cohorts. Of the five evaluated genes, SNP-level replication was found only for GPR154 (NPSR1). In this gene, three SNPs were associated with asthma in both cohorts, although opposite alleles were associated in the two studies. Weak evidence for locus-level replication with asthma was found in the PHF11 locus, although there was no overlap in the associated SNPs across the two cohorts. No consistent associations were observed for the three other genes.
Conclusions: These results provide some further support for the role of genetic variation in GPR154 (NPSR1) and PHF11 in asthma susceptibility and also highlight the challenges of replicating genetic associations in complex traits such as asthma, even for genes identified by linkage analysis.
Asthma candidate genes have been identified by positional cloning, although these results have not been consistently replicated, even in the initial reports.
This study provides further support for the role of GPR154 in asthma susceptibility and the first independent replication for the PHF11 locus.
Replication of the results of a gene–disease association study in independent samples has emerged as a standard for demonstrating the relevance of a candidate gene for a complex trait (1–3). Although there are now many examples of reproducible associations in which a specific genetic variant is consistently associated with a specific phenotype, these are greatly outnumbered by studies in which true replication is claimed, yet the associations observed do not exactly replicate the initial report, as evidenced by differences in the particular variants or phenotypes studied. Replication of the association with the specific polymorphism is the strongest evidence; however, positive association with other variants at the locus may still point to the importance of the particular candidate gene. Phenotypic heterogeneity (e.g., childhood vs. adult asthma) is an issue in many complex diseases and may be especially problematic in diseases such as asthma that lack an explicit diagnostic test (4). Different population origins and linkage disequilibrium (LD) patterns may be an additional cause of nonreplication.
Asthma is the most common obstructive lung disease, affecting 12% of children in the United States (5) and accounting for substantial morbidity in children and adults worldwide. Family studies strongly support an important genetic contribution to asthma susceptibility (6). Genomewide linkage studies have identified no fewer than 20 loci with evidence suggestive of linkage, although few of these linkages have met stringent criteria for significance, even in well-powered samples (7). These data suggest that asthma is a complex polygenic disease, with multiple genes of modest effect interacting with each other and with environmental factors. Starting with the identification of ADAM metallopeptidase domain 33 (ADAM33) on chromosome 20p by Van Eerdewegh and colleagues (8), six asthma susceptibility genes have been identified by positional cloning, including PHD finger protein 11 (PHF11), at a locus on chromosome 13 that includes two additional genes, regulator of chromosome condensation (RCC1) and BTB (POZ) domain–containing protein 1 (RCBTB1) and SET domain, bifurcated 2 (SETDB2) (9); dipeptidyl-peptidase 10 (DPP10, on chromosome 2q) (10); neuropeptide S receptor 1 (NPSR1, previously GPR154 or GPRA, on chromosome 7p) (11); HLA-G histocompatibility antigen, class I, G (HLA-G, on chromosome 6p) (12); and cytoplasmic FMR1-interacting protein 2 (CYFIP2, on chromosome 5q) (13).
Because linkage analyses are typically powered to detect risk loci with large effects, disease genes identified through linkage might be expected to exert a larger effect on the phenotype, and therefore be more amenable to replication. For the positionally cloned asthma genes presented in Table 1, replication at the marker level was not always observed, even in the initial reports. In follow-up studies (Table 2), stringent replication at the level of the single-nucleotide polymorphism (SNP) or haplotype described in the first report has not been common. Given the lack of definitive functional data for these genes, it remains unclear which of these findings are valid. We hypothesized that the genetic associations for positionally cloned asthma genes would not all be consistently replicated across additional populations. To test this hypothesis, we examined the first five of the six reported positional asthma genes (ADAM33, PHF11, DPP10, GPR154 [NPSR1], and HLA-G) in two family-based samples ascertained through an asthmatic proband: a cohort of non-Hispanic white North American children from the Childhood Asthma Management Program (CAMP) and a cohort of Hispanic children from the Central Valley of Costa Rica. Results from this study have been previously reported as an abstract (14).
See the online supplement for detailed methods.
CAMP is a multicenter North American clinical trial designed to investigate the long-term effects of inhaled antiinflammatory medications in children with mild to moderate asthma (15, 16). This analysis includes nuclear families of 497 non-Hispanic white children. A diagnosis of asthma was based on methacholine hyperresponsiveness (provocative concentration of methacholine causing a 20% fall in FEV1 [PC20] ≤ 12.5 mg/ml) and one or more of the following criteria for at least 6 months in the year before recruitment: (1) asthma symptoms at least two times per week, (2) at least two uses per week of an inhaled bronchodilator, and (3) use of daily asthma medication (15). The Institutional Review Board of the Brigham and Women's Hospital (Boston, MA), as well as those of the other CAMP study centers, approved this study. Informed assent and consent were obtained from the study participants and their parents to collect DNA for genetic studies.
Schoolchildren aged 6–14 years with asthma were recruited through 95 schools in the Central Valley of Costa Rica; subject enrollment and phenotyping protocols have been previously described (17). The Central Valley is a relatively genetically isolated population (18), with extensive genealogical records that can be used to trace ancestry back to approximately 4,000 founding individuals in the 1697 census (19). Inclusion criteria included physician-diagnosed asthma, at least two respiratory symptoms (cough, wheeze, or dyspnea) or asthma attacks in the previous year, and a high probability of having at least six great-grandparents born in the Central Valley of Costa Rica. Parents provided written informed consent for themselves and for their children, who also gave written assent. The study was approved by the Institutional Review Boards of Brigham and Women's Hospital and the Hospital Nacional de Niños (San José, Costa Rica).
Using CEPH (Centre d'Etude du Polymorphisme Humain) genotype data from the International HapMap project (Utah residents with northern and western European ancestry) (20), we applied an LD tagging algorithm to capture common variation (r2 > 0.8, minor allele frequency > 0.1) across the genes studied (21). Results for 16 SNPs in ADAM33 have been previously published for CAMP (22). SNPs were genotyped in a highly multiplexed allele-specific hybridization assay with a BeadStation 500G (Illumina, San Diego, CA). Mendelian transmission was tested with PedCheck (23), and inconsistent SNPs and individuals were removed from analysis. See Table E1 in the online supplement for a list of SNPs successfully genotyped in both cohorts.
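The tagging step described above can be illustrated with a minimal sketch of greedy pairwise tagging. This is not the algorithm of reference 21; it only assumes a precomputed pairwise r2 matrix and a list of minor allele frequencies, and the function name and example values are hypothetical.

```python
def greedy_tag_snps(r2, maf, r2_min=0.8, maf_min=0.1):
    """Pick tag SNPs so that every common SNP is a tag or has r2 > r2_min with a tag.

    r2  : square matrix (list of lists) of pairwise r2 values
    maf : list of minor allele frequencies, one per SNP
    Illustrative greedy rule only; thresholds follow the text (r2 > 0.8, MAF > 0.1).
    """
    # SNPs below the frequency threshold are not targeted for tagging.
    untagged = {i for i in range(len(maf)) if maf[i] > maf_min}
    tags = []
    while untagged:
        def captured(i):
            # A candidate captures itself plus every untagged SNP in strong LD with it.
            return {j for j in untagged if j == i or r2[i][j] > r2_min}
        # Choose the SNP capturing the most untagged SNPs (lowest index wins ties).
        best = max(sorted(untagged), key=lambda i: len(captured(i)))
        tags.append(best)
        untagged -= captured(best)
    return tags


# Hypothetical example: SNP 1 is in strong LD with SNPs 0 and 2; SNP 3 is rare.
r2_example = [
    [1.00, 0.90, 0.50, 0.20],
    [0.90, 1.00, 0.85, 0.10],
    [0.50, 0.85, 1.00, 0.30],
    [0.20, 0.10, 0.30, 1.00],
]
maf_example = [0.30, 0.25, 0.40, 0.05]
print(greedy_tag_snps(r2_example, maf_example))
```

In this toy input a single tag (SNP 1) captures all three common SNPs, which is why pairwise r2 values above the threshold are expected to be rare among the SNPs actually genotyped.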
Pairwise LD was expressed as both D′ and r2, calculated using Haploview (24). To ensure phenotype comparability to the CAMP enrollment criteria, a strict definition of asthma was used in the Costa Rica study, which included methacholine hyperresponsiveness (provocative dose of methacholine causing a 20% fall in FEV1 [PD20] ≤ 16.81 μmol) or bronchodilator responsiveness plus the recruitment criteria above. In addition to asthma, two intermediate phenotypes were analyzed: (1) airway hyperresponsiveness (AHR), measured as log10-transformed dose–response slope to methacholine (25); and (2) log10-transformed total serum immunoglobulin E levels (26). Haplotype blocks were defined using the Gabriel algorithm (27), and haplotype-tagging SNPs were identified in Haploview. Family-based association testing and power calculations were performed with PBAT software (28). Additional statistical analyses were performed in SAS version 9.1 (SAS Institute, Cary, NC) and in R. Because these genes have all been associated with asthma phenotypes in prior studies, we used P < 0.05 in both samples to define statistical significance in the setting of multiple tests, instead of an adjusted P value (29).
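For readers less familiar with the two LD measures named above, the following sketch computes D′ and r2 for a pair of biallelic SNPs from haplotype and allele frequencies. It assumes the haplotype frequency is already known; Haploview estimates it from unphased genotypes, which is not reproduced here. The function name and the example frequencies are illustrative.

```python
def pairwise_ld(p_ab, p_a, p_b):
    """Return (D', r2) for two biallelic SNPs.

    p_ab : frequency of the haplotype carrying allele A at SNP 1 and allele B at SNP 2
    p_a  : frequency of allele A at SNP 1 (0 < p_a < 1)
    p_b  : frequency of allele B at SNP 2 (0 < p_b < 1)
    """
    # Raw disequilibrium coefficient: observed minus expected haplotype frequency.
    d = p_ab - p_a * p_b
    # D' scales |D| by its maximum possible value given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    # r2 is the squared correlation between the two loci.
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d_prime, r2


# Hypothetical frequencies: allele A occurs only on B-carrying haplotypes,
# so D' is at its maximum while r2 stays well below 1.
d_prime, r2 = pairwise_ld(p_ab=0.3, p_a=0.3, p_b=0.5)
print(round(d_prime, 3), round(r2, 3))
```

The example shows why both measures are reported: D′ reflects the absence of historical recombination between two sites, whereas r2 also depends on how well the allele frequencies match, which is what determines whether one SNP can serve as a tag for the other.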
Characteristics of children with asthma from Costa Rica and non-Hispanic white CAMP participants are shown in Table 3. Participant ages were similar in both studies, which had a predominance of boys consistent with the demographics of childhood asthma. The majority of children in both cohorts were atopic and the distributions of total serum IgE levels were similar. AHR, as measured by methacholine challenge, was also similar, with the vast majority of children in both studies demonstrating a high degree of responsiveness. However, in CAMP PC20 distributions were right censored at 12.5 mg/ml as a result of enrollment criteria.
Ninety-eight SNPs mapping to the five positionally cloned asthma candidates were genotyped in both study populations (see Table E1). Genotype distributions for all SNPs in both cohorts were consistent with Hardy-Weinberg equilibrium (at a threshold P < 0.01), and parent–child genotype inconsistencies were rare: five in Costa Rica and three instances in the new CAMP data (quality control for ADAM33 SNPs has been previously reported). Genotyping completion rates averaged 99.4% in Costa Rica and 97.5% in CAMP.
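As an illustration of the Hardy-Weinberg check mentioned above, the sketch below applies a one-degree-of-freedom chi-square goodness-of-fit test to genotype counts. The text does not state which test was used (chi-square or exact) or in which individuals it was applied, so this is an assumption; the function name and counts are hypothetical.

```python
import math


def hwe_chi2(n_aa, n_ab, n_bb):
    """Chi-square test (1 df) of Hardy-Weinberg proportions from genotype counts."""
    n = n_aa + n_ab + n_bb
    # Allele frequency of A estimated by gene counting.
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper tail of a chi-square with 1 df equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts with a marked heterozygote deficit.
chi2, p_value = hwe_chi2(40, 20, 40)
print(chi2, p_value < 0.01)
```

A SNP would be flagged under the threshold in the text when the returned P value falls below 0.01.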
The linear correlations in the plots of pairwise r2 values (four of five genes studied are presented in Figure 1) in the CAMP and Costa Rica cohorts for the candidate gene SNPs suggest that despite the known differences in ancestry between our two populations, regional LD at these five loci is similar. In general, there was strong similarity in LD patterns between the two populations, particularly at ADAM33 and GPR154 (NPSR1). Variability in r2 was slightly greater in DPP10 and PHF11, particularly for SNP pairs with intermediate degrees of LD (r2 = 0.3–0.7). Only two SNPs were genotyped in HLA-G, so the r2 plot is not presented; for the two HLA-G SNPs, r2 = 0.31 in CAMP and r2 = 0.41 in Costa Rica. Only a small proportion of pairwise comparisons demonstrated r2 > 0.8, because most SNPs were selected to tag LD at this threshold.
Figure 2 compares the LD patterns in the populations by plotting pairwise D′ in physical order. In the regions encompassing DPP10, GPR154 (NPSR1), PHF11, and HLA-G (data not shown), side-by-side comparison of the LD plots demonstrates that the LD patterns appear strikingly similar, suggesting shared ancestral mutation and recombination events at these loci in these two populations. Interestingly, at the ADAM33 locus, LD extends further in CAMP than in the relatively isolated population from Costa Rica. In general, the LD and haplotype patterns (data not shown) observed are similar at all five loci.
Family-based tests of association for asthma with P ≤ 0.05 are presented in Table 4. Of the five candidate genes evaluated, GPR154 (NPSR1) was the only one that demonstrated association with asthma in both populations, with three associated SNPs. The most significant associations were observed with SNP rs1379928 (P = 0.003 and 0.0006 in Costa Rica and CAMP, respectively), which is located 5′ to the risk haplotype described in the initial report (11). However, the directions of these SNP effects are opposite in the two samples, with all three SNPs showing undertransmission of the minor alleles in Costa Rica and overtransmission in CAMP. Of the four other candidate genes tested, only PHF11 demonstrated evidence of association with asthma, with different polymorphisms in each cohort demonstrating allelic transmission distortion (P values, 0.03–0.05). SNP rs9316454 in PHF11 showed a trend toward overtransmission of the minor allele in both cohorts. However, the strength of this association was weak and not statistically significant in either cohort separately. No associations with asthma were observed in either cohort for variants in ADAM33, DPP10, or HLA-G.
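The notions of over- and undertransmission used above can be made concrete with the classic transmission disequilibrium test (TDT), the simplest family-based test. The analyses in this study were run in PBAT, which implements a more general family-based framework; the sketch below is a stand-in for illustration only, and the transmission counts are hypothetical, not study data.

```python
import math


def tdt(transmitted, untransmitted):
    """Transmission disequilibrium test for one allele.

    transmitted / untransmitted: number of times heterozygous parents did or
    did not pass the minor allele to an affected child.
    Returns (chi-square with 1 df, P value, direction of distortion).
    """
    total = transmitted + untransmitted
    if total == 0:
        raise ValueError("no informative (heterozygous) parental transmissions")
    # McNemar-type statistic: under no association, transmissions are 50:50.
    chi2 = (transmitted - untransmitted) ** 2 / total
    p_value = math.erfc(math.sqrt(chi2 / 2))
    if transmitted > untransmitted:
        direction = "over"
    elif transmitted < untransmitted:
        direction = "under"
    else:
        direction = "none"
    return chi2, p_value, direction


# Hypothetical "flip-flop": identical evidence of distortion, opposite directions.
print(tdt(60, 40))
print(tdt(40, 60))
```

Because the test statistic is symmetric in the two counts, two cohorts can both reach P < 0.05 for the same SNP while implicating opposite alleles, which is the pattern reported here for GPR154 (NPSR1).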
In previous reports, associations with AHR have been observed with SNPs in ADAM33 (30, 31), GPR154 (NPSR1) (30–33), and HLA-G (12). In our cohorts, associations with AHR were observed with SNPs in DPP10, GPR154 (NPSR1), and PHF11 (Table 5), although none of these SNP associations overlap in both cohorts. Using PD20 (Costa Rica) or PC20 (CAMP) to measure AHR, the associations with GPR154 rs323917 were strengthened, with P < 0.05 in each cohort (Costa Rica, P = 0.006; CAMP, P = 0.03); the minor allele led to increased AHR in each study. Significant linkages and subsequent associations with total serum IgE levels were the initial findings that resulted in the identification of DPP10, GPR154 (NPSR1), and PHF11 as asthma susceptibility genes, and associations with IgE have also been reported with ADAM33. However, in our cohorts there was evidence supporting an association with total serum IgE levels for only one SNP in GPR154 (NPSR1) in the Costa Rica cohort, and one SNP in PHF11 in CAMP (see Table E2).
Family-based analysis of haplotypes within blocks did not show any consistent association with asthma, AHR, or IgE levels across the two cohorts, nor did analysis of SNPs in GPR154 (NPSR1) defining the risk haplotypes in the initial report (data not shown) (11).
The publication of the first genomewide linkage analysis to asthma-related traits generated interest in the potential of positional cloning to identify genetic variants that underlie asthma pathogenesis (34). Using this approach, six asthma candidate genes have been identified, further intensifying hopes that such findings would translate to clinical applications. However, follow-up studies (Tables 1 and 2) for several of these genes have yielded inconsistent results, dampening enthusiasm for the initial findings. Multiple reasons have been proposed to explain nonreplication in genetic association studies, including small effect sizes and subsequent lack of power; differences in population ascertainment, phenotype definitions, and trait distributions; differences in LD patterns; and population stratification (35, 36). In the present study, we set out to reproduce associations for five of these genes in two family-based cohorts by eliminating many of these potential problems in replication analyses. In both studies, enrollment criteria, phenotyping protocols, and phenotype definitions were similar. As a consequence, the age, sex, and trait distributions were comparable. The family-based design eliminated the potential effects of population stratification. Despite these strategies, replication results were largely negative, with the exception of GPR154 (NPSR1), where three SNPs replicated across both cohorts. Although the effects of these three SNPs on asthma phenotypes were mostly opposite across the two samples, SNP rs323917 was consistently associated with increased AHR. Lin and coworkers have demonstrated that “flip–flop” associations may be due to different LD patterns between populations or failure to consider gene–gene and/or gene–environment interactions (37).
Gene-level replication was noted for the PHF11 locus, yet the associated SNPs were different in the two samples. Although our results are the first to provide any evidence of independent replication of this locus, the inconsistency in the SNPs associated and the lack of association with serum IgE levels (the quantitative trait that was initially used to map the gene) raise doubts about the significance of the findings. For the remaining three genes, no consistent evidence of association was observed in either cohort. Taken together, these results provide additional support for the relevance of GPR154 (NPSR1) (and perhaps the PHF11 locus) in asthma pathogenesis, but also illustrate the substantial challenges in the replication of genetic associations in asthma, especially among candidate genes identified by linkage analysis and subsequent positional cloning. Genes identified in this manner are likely to be novel and therefore have an unknown relationship to asthma pathobiology. Replication may be easier if the gene is known to be associated with asthma, as these genes may have a known biological function. We have successfully replicated associations with IL-13, a known asthma candidate gene, using the same study design and populations that we have used for the positionally cloned genes in the present study (38).
Inadequate statistical power due to small genetic effects and inadequate sample size is perhaps the most common reason for lack of replication but is an unlikely explanation for our findings. The CAMP and Costa Rican cohorts are each larger than any of those used in the original reports of association; a larger sample size is necessary for a replication study, because the initial effect estimate is often inflated (35). For a disease state such as asthma, with 5% prevalence in the general population, we had 80% power to detect an odds ratio of 1.66 for association with an SNP with 10% minor allele frequency and an odds ratio of 1.50 for an SNP with 20% MAF in the Costa Rica trios; we had power to detect even lower odds ratios in the CAMP study, given the larger number of subjects. For a quantitative trait, there was 80% power to find an association with a 10% MAF SNP that explained 1.2% of the trait variation in Costa Rica. Lower heritability was detectable for more common SNPs and for SNPs in CAMP. We had ample power to detect association in both cohorts for all of these genes, including those for which no association was detected in either cohort.
Among the most important differences between our cohorts is their distinct ancestral histories and resultant genetic architecture. Whereas the children in the CAMP study represent a relatively heterogeneous sample of non-Hispanic whites, those from Costa Rica are descended from the relatively genetically isolated population of mixed Spanish and Amerindian ancestry that populated the Central Valley (39). Although genome-wide comparisons suggest that LD in this population extends over larger distances than in more heterogeneous populations of European ancestry (18), it is likely that regional similarities exist. In fact, LD was largely similar across the five loci studied, although differences were evident, particularly for PHF11 and DPP10. Although GPR154 (NPSR1) had the most similar LD structure across our two samples, the observed “flip–flop” associations may imply specific LD differences with the (untyped) functional locus (see Figure E1 in the online supplement). Regional differences in LD may also explain the lack of replication for the other loci. To date, functional effects have not been demonstrated for the asthma-associated variants in any of the five positional candidate genes. The SNPs evaluated here are unlikely to be causal, but rather in LD with the unidentified functional variants. Replication efforts using these markers should be considered as indirect tests of replication. In this context, even slight differences in LD patterns between populations can affect genotype–phenotype associations, making stringent replication more difficult to achieve. Although several studies have demonstrated portability of LD-tagging SNPs derived from Western European–American genotype data to populations of Spanish (40), Amerindian (41), and U.S. Hispanic origin (42), our data illustrate that this is not always the case, and that the role of genetic heterogeneity should be assessed on a gene-by-gene basis (40–42).
In mapping complex genetic traits in human populations, independent corroboration of findings is among the most important criteria for their validation and acceptance and is now a requirement for publication in many leading journals. In the field of asthma, it has been difficult to meet this rigorous standard, with only the initial report of PHF11 by Zhang and coworkers (9) demonstrating internally consistent replication with respect to the phenotype, marker, and direction of effect in at least one other population. The inherent problems of precise replication for complex traits have been detailed (2, 43, 44). The central assumptions underlying the replication standard are that each cohort is representative of the population from which it is sampled and that all cohorts are largely homogeneous with respect to the collection of genetic and environmental factors that interact, leading to the clinical phenotypes. Although this assumption may hold for monogenic or oligogenic traits with few environmental determinants, this seems unrealistic in complex traits such as asthma, hypertension, and heart disease, in which diverse epistatic and environmental factors play a central role. In these circumstances, only a fraction of risk alleles would be expected to replicate across studies, namely those with high allele frequency whose effects are not heavily influenced by interactions with environmental factors or other genetic loci. Although consistent replication of association provides strong evidence of a gene's importance, the lack of replication does not necessarily render the initial associations invalid. Studies in a single ethnic group may demonstrate the relevance of a candidate gene in that group only, whereas replication in multiple ethnic groups may determine the generalizability of findings across populations in different environments or with different patterns of epistasis.
In our study of five genes and three asthma phenotypes, one must consider the possibility of spurious results due to multiple testing. However, in the statistical genetics literature, there is no clear consensus regarding the optimal methodology to adjust for multiple testing (45). Moreover, most of the available methods are more appropriate for detecting novel associations than for confirming previously reported candidate genes, especially genes identified through positional cloning, which may be expected to have a higher probability of true association (46). Instead of applying a correction factor (e.g., Bonferroni or false discovery rate), we have relied on replication in two samples to protect against spurious results. The use of P < 0.05 to confirm previously identified candidate genes has been applied in a genomewide association study of type 2 diabetes (29). Even with this liberal threshold, we did not find strong evidence of replication. To limit the number of tests, we considered only the three asthma phenotypes most commonly reported in previous studies (Tables 1 and 2) and did not examine gene–gene and gene–environment interactions. Examination of additional phenotypes and consideration of interaction effects are possible avenues for future studies.
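The protection offered by requiring P < 0.05 in both samples can be gauged with simple arithmetic. The sketch below is a back-of-the-envelope calculation, not an analysis from this study: it assumes all 98 SNPs and 3 phenotypes are independent tests and that the two cohorts are independent, neither of which holds exactly because of LD between SNPs and correlation between phenotypes. The function name is hypothetical.

```python
def expected_false_positives(n_tests, alpha=0.05, n_samples=1):
    """Expected number of null tests passing P < alpha in every one of n_samples cohorts."""
    return n_tests * alpha ** n_samples


n_snps = 98        # SNPs genotyped in both cohorts
n_phenotypes = 3   # asthma, AHR, total serum IgE
n_tests = n_snps * n_phenotypes

# Expected chance findings per cohort when each is analyzed on its own.
single_cohort = expected_false_positives(n_tests, n_samples=1)
# Expected chance findings passing P < 0.05 in both cohorts (direction ignored).
both_cohorts = expected_false_positives(n_tests, n_samples=2)
# Per-test threshold a Bonferroni correction would impose in a single cohort.
bonferroni_alpha = 0.05 / n_tests

print(n_tests, round(single_cohort, 2), round(both_cohorts, 3), round(bonferroni_alpha, 5))
```

Under these assumptions, fewer than one SNP–phenotype pair would be expected to pass in both cohorts by chance, compared with about 15 in a single cohort. Requiring the same direction of effect in both samples would halve the two-cohort figure again, which is relevant when interpreting associations with opposite alleles.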
In summary, we have evaluated five asthma susceptibility genes identified by positional cloning for evidence of association in two childhood asthma cohorts, and provide evidence of replication for GPR154 (NPSR1), although the associated alleles were mostly opposite across the two study samples. Our data and prior studies suggest that common genetic variation in GPR154 (NPSR1) influences asthma phenotypes in populations of differing ethnicity. Although the functional variants for this gene have not been conclusively identified, further investigations into the precise role of GPR154 (NPSR1) in asthma are warranted. In addition, our work highlights the challenges of replicating genetic associations in complex traits such as asthma, especially for genes identified by positional cloning. Innovative strategies, such as integration of gene expression data and consideration of epistatic interaction in gene–gene networks, are likely to increase the statistical power of genetic association studies and thus may aid in the future identification of novel asthma candidate genes.
The authors thank the participating families from the Genetic Epidemiology of Asthma in Costa Rica Study and from the CAMP Genetics Ancillary Study for their enthusiastic cooperation. We also acknowledge the CAMP investigators and research team, supported by the NHLBI, for collection of CAMP Genetic Ancillary Study data, and the members of the field team in Costa Rica. The authors also thank Ankur Patel for assistance with genotyping. All work on data collected from the CAMP Genetic Ancillary Study was conducted at the Channing Laboratory of the Brigham and Women's Hospital under appropriate CAMP policies and human subjects protections.
Supported by NIH grants HL66289, HL04370, and HL074193 (Genetic Epidemiology of Asthma in Costa Rica Study) and by NHLBI grant N01-HR-16049 (CAMP Genetics Ancillary Study). Additional support for this research came from grants U01HL065899, P50HL67664, and T32HL07427. C.P.H. and B.A.R. are recipients of Mentored Clinical Scientist Development Awards (K08 HL080242 and HL074193, respectively); C.P.H. is also supported by a grant from the Alpha-1 Foundation.
This article has an online supplement, which is accessible from this issue's table of contents at www.atsjournals.org
Originally Published in Press as DOI: 10.1164/rccm.200704-592OC on August 16, 2007
Conflict of Interest Statement: C.P.H. does not have a financial relationship with a commercial entity that has an interest in the subject of this manuscript. B.A.R. does not have a financial relationship with a commercial entity that has an interest in the subject of this manuscript. M.E.S.-Q. does not have a financial relationship with a commercial entity that has an interest in the subject of this manuscript. A.J.M. does not have a financial relationship with a commercial entity that has an interest in the subject of this manuscript. L.A. does not have a financial relationship with a commercial entity that has an interest in the subject of this manuscript. J.L.-S. does not have a financial relationship with a commercial entity that has an interest in the subject of this manuscript. J.S.S. does not have a financial relationship with a commercial entity that has an interest in the subject of this manuscript. B.J.K. does not have a financial relationship with a commercial entity that has an interest in the subject of this manuscript. C.L. does not have a financial relationship with a commercial entity that has an interest in the subject of this manuscript. S.T.W. was a consultant for Pfizer from 2002 to 2003, Schering Plough from 1999 to 2000, Variagenics in 2002, Genome Therapeutics in 2003, and Merck Frost in 2002. S.T.W. received a $500,000 grant for genomic equipment from Roche Pharmaceuticals from 2002 to 2003 and was a consultant for Roche Pharmaceuticals in 2000, but did not receive any financial compensation for this consultancy. He received $10,000 from 2005 to 2006 for serving as an advisor and chair of the advisory board to the TENOR Study for Genentech, and a $900,065 grant from AstraZeneca from 1997 to 2003 for the Asthma Policy Modeling Study. S.T.W. was a coinvestigator on a grant from Boehringer Ingelheim to investigate a COPD natural history model from 2003. J.C.C. does not have a financial relationship with a commercial entity that has an interest in the subject of this manuscript.