Genome-wide genotyping of a cohort using pools rather than individual samples has long been proposed as a cost-saving alternative for performing genome-wide association (GWA) studies. However, successful disease gene mapping using pooled genotyping has thus far been limited to detecting common variants with large effect sizes, which tend not to exist for many complex common diseases or traits. Therefore, for DNA pooling to be a viable strategy for conducting GWA studies, it is important to determine whether commonly used genome-wide SNP array platforms such as the Affymetrix 6.0 array can reliably detect common variants of small effect sizes using pooled DNA. Taking obesity and age at menarche as examples of human complex traits, we assessed the feasibility of genome-wide genotyping of pooled DNA as a single-stage design for phenotype association. By individually genotyping the top associations identified by pooling, we obtained a 14- to 16-fold enrichment of SNPs nominally associated with the phenotype, but we likely missed the top true associations. In addition, we assessed whether genotyping pooled DNA can serve as an inexpensive screen as the second stage of a multi-stage design with a large number of samples by comparing the most cost-effective 3-stage designs with 80% power to detect common variants with genotypic relative risk of 1.1, with and without pooling. Given the current state of the specific technology we employed and the associated genotyping costs, we showed through simulation that a design involving pooling would be 1.07 times more expensive than a design without pooling. Thus, while a significant amount of information exists within the data from pooled DNA, our analysis does not support genotyping pooled DNA as a means to efficiently identify common variants contributing small effects to phenotypes of interest. 
While our conclusions were based on the specific technology and study design we employed, the approach presented here will be useful for evaluating the utility of other or future genome-wide genotyping platforms in pooled DNA studies.
With the advent of arrays designed to genotype 500,000 or more SNPs per array, genome-wide association (GWA) studies have identified many SNPs reproducibly associated with various complex traits such as diabetes, height, and obesity (Hindorff et al. 2009; McCarthy et al. 2008). However, most reproducible associations identified through GWA studies represent SNPs with modest effect sizes (odds ratios typically ranging from 1.1 to 1.3 (Hindorff et al. 2009; Iles 2008)), and thus a study design comprising tens of thousands of individuals is required to have sufficient power to detect such associations. Therefore, although genotyping technology has now advanced to the point where it is possible to genotype up to one million SNPs on a single array, the cost of a well-powered GWA study is often prohibitively expensive given the current cost of roughly $500–700 per array. In situations with limited funding (e.g., when investigating an understudied disease), one possible way to reduce the cost of initial association screening is to genotype pools of DNA from multiple individuals rather than to genotype each individual separately.
Genome-wide genotyping of pooled DNA has previously been used to resolve individuals contributing trace amounts of DNA to a pool (Homer et al. 2008a) and to estimate global ancestry and identify ancestry informative markers (Chiang et al. 2010). For the purpose of conducting GWA studies, numerous analytical tools, approaches, and statistical considerations have been proposed (Craig et al. 2009; Docherty et al. 2007; Homer et al. 2008b; Macgregor et al. 2008; Meaburn et al. 2006; Pearson et al. 2007; Visscher and Le Hellard 2003; Yang et al. 2008; Zhang et al. 2008). However, using pooled genotypes to identify common variants associated with disease, the original goal of genotyping pooled DNA, has had mixed success. Known variants associated with Alzheimer’s disease, progressive supranuclear palsy, eye color, pseudoexfoliation syndrome, and age-related macular degeneration have been detected by pooling only several hundred samples in proof-of-principle experiments (Craig et al. 2009; Pearson et al. 2007). Novel variants as well as variant sets that are likely enriched for potential true associations have also been identified using pooled DNA for melanoma susceptibility (Brown et al. 2008), memory performance (Papassotiropoulos et al. 2006), schizophrenia (Shifman et al. 2008b), otosclerosis (Schrauwen et al. 2009), mild mental impairment (Butcher et al. 2005), general cognitive ability (Butcher et al. 2008), mathematical ability (Docherty et al. 2010), and reading ability (Meaburn et al. 2008). Despite these success stories, the common theme among them is that the identified variants all have large effect sizes, with odds ratios ranging from ~1.7 to >15. On the other hand, variants with small effect sizes may not be amenable to identification by pooled genotyping, since the pooling-specific error could have a greater impact (see Pearson et al. 2007 for discussion). 
Because of these concerns, genotyping pooled DNA has been proposed as a quick and inexpensive screen of understudied diseases for common variants with large effect sizes (Chiang et al. 2010). However, for DNA pooling to serve as a viable cost-saving strategy for conducting GWA studies, the efficacy of genotyping pooled DNA to identify variants of small effect sizes needs to be evaluated.
We performed our evaluation in two ways. First, we evaluated the efficacy of genotyping pooled DNA to identify common variants of small effect in the setting of a single-stage association study design; second, we assessed whether genotyping pooled DNA can be used as an inexpensive screening stage of a multi-stage design with a large number of samples.
To explore the utility in a single-stage design, we used obesity (as measured by body mass index, BMI) and age at menarche as examples of complex traits for which no common variants with large effect size have been reported (Gajdos et al. 2010; Kang et al. 2010; Ong et al. 2009; Speliotes et al. 2010). We individually genotyped the top associated SNPs identified by pooling to assess the enrichment of validated associations using a pooled design. We note that our focus is on evaluating whether we could identify SNPs with real between-group allele frequency (AF) differences among the actual individuals pooled in our study, as a means to examine the efficacy of the methodology. If the analysis of pooled GWA data is technically robust to small differences in AF, the top associated SNPs from the pooled analysis should show AF differences when genotyped in the same individual samples. Our results should not be interpreted as generalizable findings of associations, as apparent differences in AF between the case and control pools of the size described here are much more likely to reflect sampling variation (and potentially uncorrected error due to pooling as well) than reproducible associations with the phenotype of interest.
To determine if the current state of the specific genotyping technology we employed would generate pooled genotypes useful for a multi-stage GWA design searching for variants with small effects, we compared the most cost-efficient 3-stage designs assuming 50,000 individuals, with or without pooling a subset of the individuals in the second stage. Here we relied on a cohort of African-Americans for whom we have both individual and pooled genotype data to model the efficiency of pooling (Chiang et al. 2010).
We found that we could enrich for nominal associations in a single-stage design by focusing on the top SNPs associated with obesity and age at menarche in a pooled GWA study, although, as expected with an imprecise methodology, the evidence of association was over-estimated in the pooled data and the true top associations were likely missed. In multi-stage designs, the cost of the optimal 3-stage design involving pooled samples appears to be similar to, but not lower than, that of a 3-stage design not involving pooled samples. Together, these results suggest that pooled genotyping, as used here, may not be more cost-efficient for identifying variants with small effect sizes, although future changes in genotyping costs or further improvements in genotyping technology may make pooled genotyping a useful component of future GWA study designs.
We constructed four case pools (N = 101, 136, 137, and 137) and two control pools (N = 79 and 98) from 688 individuals recruited as part of a study of gene by environment effects from Kingston, Jamaica (GxE), as well as two case pools (N = 129 and 135) and two control pools (N = 105 and 111) from 480 individuals from Spanish Town, Jamaica (SPT). Pools were constructed according to high (case) and low (control) BMI, stratified by gender (see “Methods”). Each pool was genotyped in triplicate using the Affymetrix 6.0 SNP array. We estimated pooled AF and applied four SNP quality control (QC) filters, as previously described (Chiang et al. 2010). The QC filters were effective in removing the vast majority of SNPs with apparent very large AF differences that are likely to be false positives in an association study (Fig. 1a, b). Moreover, because we were looking for SNPs with very small AF differences, we corrected a systematic bias in AF estimation of SNPs with relatively low AF in the case or control pools by fitting the AF distribution of each pool within a cohort to the overall distribution of AF (see “Methods”). This procedure allowed the distribution of AF in the case and control pools to be much more comparable and symmetric around the null expectation line of equal AF in cases and controls (Fig. 1c).
The Jamaican pools (GxE and SPT) were separately analyzed to identify variants associated with BMI (see “Methods”). SNPs that passed the QC filters in both cohorts were then meta-analyzed, weighted by the sample size of each cohort. In total, we assayed 252,647 SNPs and observed no obvious signals above the null (Fig. 2a). We divided the top 1,000 SNPs after meta-analysis into three categories, based on the association ranks assigned to the surrounding SNPs in LD with the SNP of interest: “encouraging” associations have other SNPs in LD showing good evidence of association; “discouraging” associations have other SNPs in LD showing poor evidence of association; and “inconclusive” associations have mixed signals or have no other SNPs in LD (see “Methods”). We successfully genotyped a total of 21 SNPs out of the top 36 “encouraging” (ENC) and the top 7 “inconclusive” SNPs (limiting the inconclusive SNPs to those with no SNPs in LD, INC-LD) in the individuals that comprised the GxE and SPT pools, and analyzed them using the same case/control study design. Fourteen of the 21 SNPs showed evidence of association with BMI with P <0.05 in meta-analysis, with an additional two SNPs with P <0.05 in one population that were not genotyped successfully in the other (Table 1). The 14-fold enrichment of SNPs with P <0.05 (16, when 1.15 were expected if SNPs were randomly selected) demonstrates that a GWA study using pooled samples may be an effective method to screen for potential associations in a staged design.
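As a hedged illustration of a sample-size-weighted meta-analysis, the sketch below combines per-cohort Z-scores using weights proportional to the square root of each cohort's sample size. This is a common convention and is an assumption here; the exact weighting used in this study is described in “Methods” and may differ. The function name and the example Z-scores are hypothetical; only the cohort sizes (688 and 480) come from the text.

```python
import math
from statistics import NormalDist


def weighted_z_meta(z_scores, sample_sizes):
    """Combine per-cohort Z-scores with sqrt(N) weights (assumed convention)."""
    weights = [math.sqrt(n) for n in sample_sizes]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = math.sqrt(sum(w * w for w in weights))
    z_meta = numerator / denominator
    # Two-sided P value from the standard normal distribution.
    p_two_sided = 2.0 * (1.0 - NormalDist().cdf(abs(z_meta)))
    return z_meta, p_two_sided


# Hypothetical Z-scores for one SNP in the GxE (N = 688) and SPT (N = 480) cohorts.
z_meta, p_meta = weighted_z_meta([2.0, 1.5], [688, 480])
print(f"meta Z = {z_meta:.3f}, two-sided P = {p_meta:.4f}")
```

With this weighting, the larger cohort contributes more to the combined statistic, and two cohorts pointing in the same direction yield a stronger combined signal than either alone.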
For identifying SNPs associated with age at menarche, we constructed five pools each of women with early and late age at menarche from the Multi-Ethnic Cohort (MEC) of Los Angeles and Hawai’i (see “Methods”), for which we have some past genotype data on individual participants useful for validation of the pooled analysis. In total, 745 women with early menarche (N = 168, 132, 91, 169, and 185) and 729 women with late menarche (N = 153, 120, 111, 163, and 182) were pooled, stratified by racial/ethnic cohort. The MEC pools were analyzed similarly to the Jamaican pools. Association analyses were conducted within cohort before meta-analysis, and only SNPs passing QC filters in at least three of the five cohorts were meta-analyzed. To account for different AF across diverse panels during meta-analysis, each cohort was weighted by its effective population size, which was calculated based on pooled AF estimates (Gajdos et al. 2008). In total, we assayed 390,175 SNPs and observed no obvious deviation from the null expectation (Fig. 2b). Among the top 2,000 associations, 25 SNPs had been previously individually genotyped in our laboratory (Table 2). Among these 25 SNPs, 15 would have been categorized as ENC or INC-LD (see “Methods”), and 12 out of the 15 SNPs achieved a P <0.05. This represents a 16-fold enrichment with P <0.05 (12, when only 0.75 was expected if SNPs were randomly selected), comparable to that observed for the analysis of BMI above. Of the remaining 10 SNPs in the “inconclusive” (INC, because SNPs in LD showed marginal evidence of association in pooling) and “discouraging” (DIS) categories, half of the SNPs in each category reached a nominal P <0.05 (Table 2). Though limited in number, this suggests that the evidence of association of neighboring SNPs in LD can provide additional useful information for discerning real associations from false positives.
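The fold-enrichment figures quoted for BMI and age at menarche can be reproduced with simple arithmetic. The sketch below is illustrative only: the helper name is hypothetical, and the count of 23 BMI SNPs is inferred from the stated expectation of 1.15 at a threshold of 0.05 rather than given directly in the text.

```python
def fold_enrichment(n_significant, n_tested, alpha=0.05):
    """Observed count of SNPs with P < alpha divided by the count expected by chance."""
    expected = n_tested * alpha
    return n_significant / expected


# BMI: 16 SNPs with P < 0.05; 23 tested is inferred from the expected 1.15.
bmi_fold = fold_enrichment(16, 23)
# Age at menarche: 12 of 15 ENC or INC-LD SNPs with P < 0.05 (0.75 expected).
menarche_fold = fold_enrichment(12, 15)
print(f"BMI: {bmi_fold:.1f}-fold; menarche: {menarche_fold:.1f}-fold")
```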
While we were able to enrich for nominal associations by analyzing pooled DNA samples, we tended to over-estimate the evidence of association by at least an order of magnitude (Tables 1, 2), and likely missed a number of the SNPs most differentiated between case and control pools due to both errors associated with pooling and loss of SNP coverage due to QC. For example, given that ~252,000 SNPs were assayed for BMI, the best P value should be on the order of ~4 × 10^-6 under the null hypothesis, and ~9 to 10 SNPs should have had P values more significant than our best associated signal (rs4793952, P = 4.1 × 10^-5, Table 1), but were not successfully detected among the top SNPs we individually validated. This implies that the pooling approach may not be adequate to serve as a 1-stage design or a first-stage screen. As the most efficient design for GWA studies is typically a multi-stage design (Skol et al. 2006), the power and potential cost reductions of various 2-stage designs involving pooled genotype data have also been investigated (Macgregor et al. 2008; Zuo et al. 2008, 2006). Similarly, we assessed the utility of genotyping pooled DNA as a screening stage in a multi-stage design, so as to enable fewer putatively associated variants to be genotyped in subsequent stages without substantial loss of power. We conducted our assessment in two ways: first, by empirically determining whether information from pooled cohorts would allow better prioritization of GWA signals for follow-up; second, by simulation to compare the optimal 3-stage study designs with or without pooling.
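The null-expectation figures above follow from treating P values as uniformly distributed across the assayed SNPs. The sketch below shows that arithmetic; it assumes independent tests (ignoring LD between SNPs), which is a simplification, and the variable names are illustrative.

```python
# Number of SNPs assayed for BMI in the meta-analysis (from the text).
n_snps = 252_647
# Best individually validated signal (rs4793952, from the text).
best_observed_p = 4.1e-5

# Under a uniform null, the smallest of n P values is on the order of 1/n.
expected_best_p = 1.0 / n_snps
# Expected number of null SNPs with P below the best validated signal.
expected_more_significant = n_snps * best_observed_p

print(f"expected best P ~ {expected_best_p:.1e}")
print(f"expected SNPs with P < {best_observed_p:g}: ~{expected_more_significant:.0f}")
```

Under these assumptions the expected best P value is about 4 × 10^-6 and roughly ten SNPs would be expected to beat the validated signal by chance, in line with the figures quoted above.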
Specifically, we first considered a scenario in which a GWA study using individual genotyping had already been done with a discovery cohort, and the researchers have additional samples available for replication but with only limited funding for genotyping. In this case, we asked if having pooled information would allow one to prioritize which SNPs to follow up in replication (Fig. 3a). To test this possibility empirically, we used data from a previously conducted GWA study of BMI in African-derived populations (Kang et al. 2010). In this study, all independent SNPs with P <1 × 10^-5 in the discovery stage were screened in additional panels for consistency in the direction of effect (screening stage). In the last stage (replication stage), seven SNPs satisfied the conditions of the first two stages and were genotyped in the same Jamaican populations that had been pooled here (Kang et al. 2010). To determine if the availability of pooling data would assist prioritization of SNPs to be followed up, we also individually genotyped 11 additional SNPs that were associated with BMI with 1 × 10^-5 <P <0.001 (in the discovery phase, i.e., in the African-derived cohorts) that had also achieved a one-tailed quantile-normalized P <0.1 in the pooled analysis (Fig. 3a). Among these 11 SNPs, 5 reached a one-tailed P <0.05, and one additional SNP achieved a one-tailed P <0.07 (Table 3). In comparison, none of the 7 top SNPs identified in the original GWA study achieved one-tailed P values <0.05 in GxE and SPT (Table 3), despite having stronger evidence of association in the discovery stage than the 11 SNPs identified with added pooling information and despite having passed the screening stage of that study design.
This suggests that while the substantial false-negative rate due to stringent filtering of the pooled data may make pooled genotyping less desirable as a first-stage design, it may be useful for supplementing the list of putative associations for follow-up in a multi-stage design, especially by providing additional association evidence (in pooling) for SNPs that only showed modest association in the discovery phase.
To further determine if incorporating information from pooled DNA would produce more cost-efficient designs of multi-staged association studies, we used simulations. We assumed a scenario in which a case–control cohort of 50,000 individuals is available for a 3-stage analysis designed to have 80% power to detect common variants (AF = 0.3) with a genotypic relative risk (GRR) of 1.1, a modest effect size consistent with the effect sizes seen among variants associated with common complex diseases (Hindorff et al. 2009; Iles 2008). If no pooling were involved, we assumed that three subsets of the 50,000 individuals would each be genotyped in different stages, first by a genome-wide SNP array and the latter two by individual custom genotyping. The most cost-effective design under this scheme is then compared to a design where all samples used in stage 2 were genotyped genome-wide as pooled DNA, while maintaining the overall 80% power and penalizing the increased rate of false positives that would arise with pooling (see “Methods”). Because of the introduction of pooling, the optimal number of individuals to genotype in each stage is different under the two schemes (N1A, N2A, and N3A with all-individual genotyping; N1B, N2B, N3B when subset of individuals were pooled (see “Methods”, Fig. 3b)). Note that our simulation only aims to model the cost of the genotyping itself, and thus does not take into account any alterations in the analytical cost due to pooled data or any potential reduction in speed associated with pooled genotyping.
Without pooling, the most efficient design involves genome-wide genotyping of 15,000 individuals in stage 1, followed by genotyping the top 0.1% of the SNPs in 8,000 additional individuals in stage 2. Those that achieved a one-tailed P <0.1 would then be genotyped in the remaining 27,000 individuals in stage 3. This study design has 80% power to detect a common variant (AF = 0.3) with GRR = 1.1 that passes through the first two stages and achieves a one-tailed P = 0.0025 in stage 3, when under the null only 0.2 SNPs would be expected (Table 4). If a pooling design replaced the stage 2 screen and pooled samples were genotyped by a genome-wide platform, the optimal design would involve genotyping 14,000 individuals in stage 1, and 28,000 individuals in pools of 150 in stage 2. SNPs in the top 0.5% of stage 1 that also achieved a quantile-normalized one-tailed P <0.35 in stage 2 would then be genotyped in the remaining 8,000 individuals in stage 3. A true association would then have 80% chance to reach a one-tailed P <0.1 in stage 3. Under the null, 175 such SNPs would exist, and we penalize this increased false positive rate by requiring all 175 SNPs to be individually genotyped in the 28,000 individuals that were assayed in the pools. With a cohort of 28,000 individuals, one would have >99.8% power to detect a SNP with a one-tailed P ≤ 0.003 (the top association among 175 SNPs under the null). Under this optimal design involving pooling, the cost of the entire study would be 1.07 times more expensive than without pooling (Table 4), suggesting that pooled genotyping may not be cost-saving using the platform and methods described here.
The cost-benefit simulation described here is based on current genotyping costs, as well as the efficiency of pooling using the Affymetrix 6.0 platform. We re-simulated the most cost-efficient design with and without pooling under varying array-to-individual genotyping cost ratios and pooling efficiencies (see “Methods”) to determine if improvements in technology or methodology would make a study design involving pooling more favorable. We found that in general a design with pooling would be more favorable if the cost of the genome-wide SNP array were to decrease, or if the efficiency of pooling were to improve. Conversely, the all-individual genotyping scheme would be favored if individual multiplex custom genotyping cost were to decrease (Fig. 4). Specifically, if the array-to-individual genotyping cost ratio were to fall to 1/4 or 1/2 of its current value, or perhaps more importantly if the pooling efficiency were to improve such that the same number of truly associated SNPs would attain pooling P values that are 1/2 or 1/3 of the current pooling P values, the cost of the entire study could be 0.91 times that of the design without pooling, assuming a GRR = 1.1, a variant AF of 0.3, and target power of 80% (Fig. 4).
In order for genome-wide genotyping of pooled samples to be a viable cost-saving alternative to individual-level GWA studies, which was the original goal of pooled genotyping, it is necessary to be able to identify associations with small effect sizes, especially given that many common variants irrefutably associated with complex traits have small effect sizes (Hindorff et al. 2009; Iles 2008). Failures of DNA pooling studies to identify variants reproducibly associated with a number of traits, such as neuroticism (Shifman et al. 2008a) and general cognitive ability (Davis et al. 2010), are consistent with the notion that pooled genotyping may be ineffective for identifying variants with small effect sizes, especially since thousands of individuals were pooled in these studies (although other factors such as study design or inadequate power may also be the cause). However, a systematic assessment of the utility of pooling for identifying such variants had not been conducted previously.
In the present study, we empirically evaluated the use of pooled genotype data obtained using the Affymetrix 6.0 platform to detect associations with obesity and age at menarche. We were able to obtain a 14- to 16-fold enrichment of SNPs nominally associated with our phenotypes (Tables 1, 2), even though the substantial false-negative rate likely prevents our strategy from being used in a single-stage design. Moreover, despite the fact that a great deal of information exists within pooled genotype data, a pooled GWA study design as implemented here does not appear to represent a savings in study cost as compared to a GWA study design employing solely individual genotyping (Table 3).
In order to reliably detect associated variants with small effect sizes, we stringently applied QC filters and also re-parameterized the AF distribution of our pooled data. Concurrent with stringent control of false positive rate is the loss of power due to loss of coverage. Our previous work has shown that the false-negative rate due only to the QC filters applied here is approximately 51% (Chiang et al. 2010). As a result, we were only able to assay ~300,000 to ~500,000 SNPs in each of the pools, and as few as ~250,000 SNPs in meta-analysis for phenotype association. Additionally, while we were able to enrich for nominal associations among the top associated signals by pooling, we tended to over-estimate the effect sizes and probably missed the most differentiated SNPs, likely reflecting pooling-specific errors that remain uncorrected or not modeled in our study design. Taken together, even though it is evident that a great deal of useful information exists in pooled genotype data, it may be difficult, under the conditions employed here, to use pooled genotyping for either a single stage or for the first stage of a multi-stage GWA study design. Instead, it may be useful to consider this information for obtaining additional variants to supplement the list of top putative associations (typically identified in the discovery phase of a GWA study) for follow-up genotyping (Table 3), particularly if the pooled data are readily available.
We also investigated the efficacy of pooled genotyping in a multi-stage design. While a previous report suggested that using pooled samples in a 2-stage design could achieve an approximately 20-fold reduction in study cost compared to all-individual genotyping (Macgregor et al. 2008), the simulation was based on a 30-fold increase in the per genotype cost of individual custom genotyping as compared to genome-wide array genotyping. However, the cost ratio today is closer to 200-fold due to a drop in the price of genome-wide array platforms. This cost reduction makes it more attractive to genotype samples individually in the first stage. We compared the two most cost-efficient 3-stage designs, with and without pooling. We focused on a 3-stage design for two reasons: first, the high false-negative rate as a result of stringent QC implies loss of coverage and power. Thus, we reasoned that a better use of pooled data is to perform little or no QC, and use pooling as a screen in the second stage followed by additional replication. Second, given the need for very large sample sizes to detect reproducible associations with small effects, we reasoned that for both all-individual genotyping and pooled genotyping the optimal cost scenario is likely a 3-stage design with successively fewer markers genotyped rather than a 2-stage design. Indeed, with a sample of 50,000 individuals, a 3-stage design is more cost-efficient than a 2-stage design (Supplemental Table 1). Furthermore, to model pooling efficiency, we relied on the empirical data generated from an African-American cohort pooled with respect to BMI. Because few SNPs with large effect sizes exist in our empirical data, we were prevented from simulating the optimal design using much higher GRR. 
Given these conditions, and using current estimates of $0.00065 per genotype for the Affymetrix 6.0 array and $0.13 per genotype for multiplex genotyping, we determined that a 3-stage design using pooling would not present a reduction in cost, but would in fact represent a 7% cost increase, when compared to a 3-stage design using only individual genotyping (Table 4).
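To make the cost comparison concrete, the sketch below tallies genotyping costs for the two 3-stage designs described in the Results, using the per-genotype prices stated above. Several inputs are assumptions rather than values from the text: an array content of 1,000,000 SNPs (chosen because it yields the 175 null SNPs reported for the pooling design), three array replicates per pool in the simulated design, and no costs other than genotyping. The function names are hypothetical, and Table 4 remains the authoritative source for the actual figures.

```python
import math

ARRAY_COST_PER_GENOTYPE = 0.00065   # USD, Affymetrix 6.0 (from the text)
CUSTOM_COST_PER_GENOTYPE = 0.13     # USD, multiplex genotyping (from the text)
SNPS_PER_ARRAY = 1_000_000          # assumption, not stated in the text
POOL_REPLICATES = 3                 # assumption: triplicate arrays per pool


def array_cost(n_arrays):
    """Cost of running n_arrays genome-wide arrays."""
    return n_arrays * SNPS_PER_ARRAY * ARRAY_COST_PER_GENOTYPE


def custom_cost(n_snps, n_individuals):
    """Cost of custom genotyping n_snps in n_individuals."""
    return n_snps * n_individuals * CUSTOM_COST_PER_GENOTYPE


def cost_without_pooling():
    stage1 = array_cost(15_000)
    snps_stage2 = SNPS_PER_ARRAY * 0.001          # top 0.1% of SNPs
    stage2 = custom_cost(snps_stage2, 8_000)
    snps_stage3 = snps_stage2 * 0.1               # null SNPs with one-tailed P < 0.1
    stage3 = custom_cost(snps_stage3, 27_000)
    return stage1 + stage2 + stage3


def cost_with_pooling():
    stage1 = array_cost(14_000)
    n_pools = math.ceil(28_000 / 150)             # pools of 150 individuals
    stage2 = array_cost(n_pools * POOL_REPLICATES)
    snps_stage3 = SNPS_PER_ARRAY * 0.005 * 0.35   # top 0.5%, then pooled P < 0.35
    stage3 = custom_cost(snps_stage3, 8_000)
    null_snps = snps_stage3 * 0.1                 # null SNPs with stage 3 P < 0.1
    penalty = custom_cost(null_snps, 28_000)      # re-genotype in pooled individuals
    return stage1 + stage2 + stage3 + penalty


cost_ratio = cost_with_pooling() / cost_without_pooling()
print(f"pooling / no pooling cost ratio: {cost_ratio:.2f}")
```

Under these assumptions the ratio comes out close to the 7% increase reported here, with most of the extra cost arising from the larger number of SNPs carried into stage 3 and the false-positive penalty.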
However, it should be noted that the results presented here are limited by the current state of the specific technology and methodology we employed, as well as by our experimental design. First, our power to detect associated variants may also be limited by the number of replicates genotyped to control for pooling-specific error. Increasing the number of replicates may improve pooling efficiency, although it would also increase the study cost. Moreover, advances in the genotyping platform and improvements in sample handling may enhance the overall data quality and AF estimates. For example, increased SNP density and redundancy on the chip should increase SNP coverage and lower the associated false-negative rate, thereby improving pooling efficiency. Indeed, a few of the reproducible associations identified using DNA pooling (Brown et al. 2008; Papassotiropoulos et al. 2006; Schrauwen et al. 2009) may partly be due to the investigators using a genotyping platform in which the reported pooling-specific error is approximately ten times lower than the platform used in this study (Craig et al. 2009; Macgregor et al. 2008). We have also speculated that a 3-stage design involving pooling would be more cost effective if either the cost of array genotyping were to decrease or pooling efficiency were to increase (Fig. 4). Therefore, a re-evaluation altering these different parameters of the study design may produce a different conclusion. The approach described here will be useful for assessing the effects of various parameters on pooling, as well as for evaluating the utility of other existing genome-wide genotyping platforms or of new, potentially more robust, SNP array platforms for conducting GWA studies using pooled DNA.
For the moment, it appears that genotyping pooled DNA remains most useful in scenarios in which either a large genetic effect is present or genome-wide level data are used concurrently to overcome errors in individual marker estimates, such as the case of estimating global admixture proportions (Chiang et al. 2010) or resolving individuals contributing trace amounts of DNA to a pool (Homer et al. 2008a). It should be noted that for many researchers, limited funding prevents a comprehensive GWA study using individual DNA samples from being conducted. In these cases, genotyping pooled DNA in a single-stage design does enrich for nominal associations and remains one of the affordable (albeit less efficient) options. Moreover, given a general lack of a priori expectation of the effect sizes for the variants associated with a previously understudied phenotype, conducting single-stage DNA pooling studies may be informative for the initial characterization of the genetic architecture of the disease as well as for the design of future association studies. For example, one may initially genotype a relatively small cohort in pools to screen for variants of large effects. If no such variants exist, a larger cohort could then be accrued and individually genotyped using a genome-wide array. The top associations from the GWA study discovery cohort could be combined with data from the preliminary pooled DNA study (to include other variants that may not have as strong evidence of association) to prioritize SNPs to be genotyped in a third, replication, cohort.
The cohorts used in this study include Jamaican individuals from Kingston (GxE) and Spanish Town (SPT), Jamaica, as described elsewhere (Chiang et al. 2010; Cooper et al. 1997; Kang et al. 2010), as well as African American (MEC-AA), Native Hawaiian (MEC-H), Japanese American (MEC-J), Latin American (MEC-L), and non-Latina white (MEC-W) women from the Multi-Ethnic Cohort (MEC) of Los Angeles and Hawai’i, as described elsewhere (Chiang et al. 2010; Gajdos et al. 2008; Kolonel et al. 2000). In total, 20 pools were constructed from these cohorts, stratified by phenotype and gender.
In the GxE panel, individuals with BMI <24 were designated as unaffected controls, and individuals with BMI >28 were designated as affected cases. Males and females were pooled separately, but using the same BMI cut-offs. In the SPT panels, individuals with BMI <25 were designated as unaffected controls and individuals with BMI >30 were designated as affected cases. Final sizes for the GxE pools were 101 case males, 79 control males, 136, 137, and 137 case females, and 98 control females. Final sizes for the SPT pools were 129 and 135 case females and 105 and 111 control females.
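The BMI cut-offs above can be expressed as a small lookup, sketched here for illustration. The function and dictionary names are hypothetical, and treating individuals between the two cut-offs as excluded from pooling is an assumption based on the designations given, not an explicit statement in the text.

```python
# (control upper bound, case lower bound) in BMI units, per panel (from the text).
BMI_CUTOFFS = {"GxE": (24.0, 28.0), "SPT": (25.0, 30.0)}


def assign_pool(panel, bmi):
    """Return 'control', 'case', or None for an individual in a given panel."""
    control_below, case_above = BMI_CUTOFFS[panel]
    if bmi < control_below:
        return "control"
    if bmi > case_above:
        return "case"
    return None  # assumption: intermediate BMI values are not pooled


print(assign_pool("GxE", 23.5), assign_pool("SPT", 31.2), assign_pool("SPT", 27.0))
```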
The MEC samples were pooled by early (<11 years of age) or late (>14 years of age) menarche. The final MEC pools consisted of 321 MEC-AA individuals (153 late menarche and 168 early menarche), 252 MEC-H individuals (120 late and 132 early), 202 MEC-J individuals (111 late and 91 early), 332 MEC-L individuals (163 late and 169 early), and 367 MEC-W individuals (182 late and 185 early).
Pool construction has been described previously (Chiang et al. 2010). Briefly, DNA samples were pooled in an equimolar fashion, and pooled DNA was genotyped with the Affymetrix Genome-Wide Human SNP 6.0 Array according to the manufacturer’s instructions (http://www.affymetrix.com/support/technical/byproduct.affx?product=genomewidesnp_6) in triplicate, along with independent individual DNA samples on the same genotyping plate. Pool replicates with excessively low intensity, low call rate, or high heterozygosity compared to the other replicates of the same pool were either re-genotyped or removed: one of the MEC-L late replicates and one of the MEC-J early replicates were each re-genotyped along with a companion replicate on a later plate; one replicate from each of three different pools (GxE control males, GxE case females, and SPT control females) was dropped from the study.
Pooled AF in the case and control pools of each cohort were estimated as described previously (Chiang et al. 2010). Briefly, the pooled samples were genotyped genome-wide on the same plate with other individual DNA samples. The Birdseed algorithm (Korn et al. 2008) was used to estimate AA, AB, and BB cluster means and covariances of probe intensities for the individual samples as well as to call the genotypes for these samples. Informed by the covariance matrix of the three genotype clusters, for each pool we estimated the pooled AF as the proportion of angular distance observed for the pooled sample relative to that observed for the individual samples on the same plate, averaged over all replicates. SNPs were then subject to QC filtering, and post-QC SNPs were further re-parameterized before analysis (see below).
We applied four QC filters to remove SNPs that tended to genotype poorly or inconsistently in pooled DNA samples when using the Affymetrix 6.0 platform. The four filters are the Fisher’s Linear Discriminant (FLD) filter, the radius of pool intensity (r) filter, the minor allele frequency (MAF) filter, and the historical data (hist) filter. Detailed descriptions of each filter can be found elsewhere (Chiang et al. 2010). The FLD filter uses the Fisher’s Linear Discriminant as a measure of separation between genotype clusters; a higher value indicates better confidence in the clustering of individuals and yields more accurate estimates of AF. The r-filter measures the radial component of the signal intensity of the pooled samples in polar coordinates, relative to the intensity measured for the individual samples; a low value may indicate poor DNA quality or low quantity. The MAF-filter uses the estimated pooled AF to remove rarer SNPs, because the pooled AF estimates tend to be less accurate for such SNPs. Finally, the hist-filter measures the variance in the difference between estimated AF and actual AF among replicate pools based on historical data (Chiang et al. 2010); a large variance may indicate a SNP probe on the array that is not robust to the pooling scheme. Each filter was previously trained by comparing pooled association results to those based on genome-wide individual genotyping to establish suitable cut-off parameters. The same cut-off parameters were employed for both the Jamaican pools and the MEC pools: FLD filter ≥20, r-filter ≥0.8 in at least 80% of the replicates, MAF-filter ≥0.05, and hist-filter ≤0.06. The number of SNPs passing QC ranged from 303,627 to 499,969.
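As an illustration of how the four cut-offs combine, the following minimal sketch applies them to a single SNP. The record type, its field names, and the example values are hypothetical and are not taken from the original pipeline; only the threshold values come from the text above.

```python
from dataclasses import dataclass
from typing import Sequence

# Cut-offs quoted in the text.
FLD_MIN = 20.0          # FLD filter: >= 20
R_MIN = 0.8             # r-filter: >= 0.8 ...
R_MIN_FRACTION = 0.8    # ... in at least 80% of the replicates
MAF_MIN = 0.05          # MAF-filter: >= 0.05
HIST_MAX = 0.06         # hist-filter: <= 0.06


@dataclass
class PooledSnp:
    """Hypothetical per-SNP summary; field names are illustrative only."""
    snp_id: str
    fld: float
    r_by_replicate: Sequence[float]
    pooled_af: float
    hist_value: float


def passes_qc(snp: PooledSnp) -> bool:
    """Return True if the SNP passes all four filters."""
    n_rep = len(snp.r_by_replicate)
    n_ok = sum(r >= R_MIN for r in snp.r_by_replicate)
    r_ok = n_rep > 0 and n_ok / n_rep >= R_MIN_FRACTION
    # Minor allele frequency derived from the pooled AF estimate.
    maf = min(snp.pooled_af, 1.0 - snp.pooled_af)
    return (
        snp.fld >= FLD_MIN
        and r_ok
        and maf >= MAF_MIN
        and snp.hist_value <= HIST_MAX
    )
```

Note that with three replicates, the 80% requirement means that all three must reach the r cut-off, since two of three is only 67%.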
To ensure similarity in the AF distributions of the case and control pools during disease association, we also adopted a non-parametric approach to estimate AF for all SNPs passing the initial QC filters. For each cohort, an overall distribution of AF estimates was constructed by combining the pooled AF estimates from each case and control pool, weighted by the sample size of each pool, for all post-QC SNPs. For each pool, post-QC SNPs were then ranked based on pooled AF, and assigned an AF of the corresponding rank from the overall distribution. This measure improved the genomic control inflation factor, lambda, from 1.38 (after QC filters only) to 1.08 in an analysis of African-Americans pooled by high and low BMI, as described previously (Chiang et al. 2010).
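The rank-based fitting described above can be sketched as follows. This is one plausible reading rather than the original implementation: it assumes the overall distribution is the sorted set of per-SNP, sample-size-weighted mean AF values across the pools of a cohort, and the function and argument names are hypothetical.

```python
def fit_to_overall_distribution(pool_afs, pool_sizes):
    """Rank-based refit of pooled AF estimates.

    pool_afs[p][i] is the pooled AF of SNP i in pool p (post-QC SNPs only);
    pool_sizes[p] is the number of individuals in pool p.
    Returns a list of the same shape with refitted AF values.
    """
    n_snps = len(pool_afs[0])
    total = sum(pool_sizes)
    # Assumed construction of the overall distribution: per-SNP weighted
    # mean across pools, then sorted from lowest to highest.
    overall = sorted(
        sum(af[i] * n for af, n in zip(pool_afs, pool_sizes)) / total
        for i in range(n_snps)
    )
    fitted = []
    for af in pool_afs:
        # Rank the SNPs within this pool by their pooled AF.
        order = sorted(range(n_snps), key=lambda i: af[i])
        out = [0.0] * n_snps
        for rank, i in enumerate(order):
            # Assign the overall-distribution value of the same rank.
            out[i] = overall[rank]
        fitted.append(out)
    return fitted
```

After this step every pool in a cohort shares exactly the same set of AF values, and only the assignment of values to SNPs differs between pools, which is what forces the case and control AF distributions to match.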
For identifying SNPs associated with BMI, we compared the pooled AF (after non-parametric fitting to the overall AF distribution) from the case pool to the control pool, and then deflated the resulting χ2 statistics to control for pooling-specific error (Chiang et al. 2010; Visscher and Le Hellard 2003). The association statistics were quantile-normalized within cohort before meta-analysis to remove any systematic bias between cohorts that might cause one cohort to dominate the meta-analyzed signal. Only SNPs that passed QC filters in both GxE and SPT were meta-analyzed using the meta-analysis tool Metal (February 2009 release, http://www.sph.umich.edu/csg/abecasis/Metal/index.html), weighted by population size. The top 1,000 SNPs from the meta-analyzed results were divided into encouraging, inconclusive, and discouraging categories based on the strength of association evidence in neighboring SNPs in LD (r2 >0.5 in HapMap YRI panel) with the SNP of interest (also see Chiang et al. 2010). Briefly, data from the meta-analysis of pre-QC-filtered dataset were used as indicators of the strength of association of SNPs in LD with one of the top SNPs: “Encouraging” (ENC) SNPs had at least one SNP in LD with a quantile-normalized P value <0.05 and had at least half of the SNPs in LD with quantile-normalized P values <0.1; “Discouraging” (DIS) SNPs had none of the SNPs in LD with quantile-normalized P values <0.1; the remaining SNPs were categorized as “Inconclusive” (INC), a category also encompassing SNPs with no other SNPs in LD (INC-LD). After assignment into the appropriate categories, the top encouraging and inconclusive SNPs were selected for validation by individual genotyping.
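The categorization rules for the top SNPs can be written compactly. The sketch below assumes that, for each top SNP, the quantile-normalized P values of its neighbors in LD (r2 >0.5) from the pre-QC meta-analysis have already been collected; the function name is hypothetical.

```python
def categorize(ld_pvalues):
    """Classify a top SNP from the P values of the SNPs in LD with it.

    ld_pvalues: quantile-normalized P values of neighboring SNPs in LD.
    Returns "ENC", "DIS", "INC", or "INC-LD".
    """
    if not ld_pvalues:
        # No neighboring SNPs in LD: inconclusive for lack of LD information.
        return "INC-LD"
    n_below_01 = sum(p < 0.1 for p in ld_pvalues)
    if n_below_01 == 0:
        return "DIS"
    has_below_005 = any(p < 0.05 for p in ld_pvalues)
    # Encouraging: at least one neighbor with P < 0.05 and at least half
    # of the neighbors with P < 0.1.
    if has_below_005 and 2 * n_below_01 >= len(ld_pvalues):
        return "ENC"
    return "INC"
```

For example, a SNP whose three neighbors have P values of 0.01, 0.08, and 0.4 is classified as encouraging, whereas one whose neighbors have P values of 0.08 and 0.09 is inconclusive because none falls below 0.05.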
For age at menarche, the analysis was conducted in the same way, with the following changes: first, each MEC population was weighted by its effective sample size during meta-analysis, which was calculated based on the pooled AF in each population, in order to account for population-specific differences in AF (Gajdos et al. 2008). Second, we meta-analyzed all SNPs that passed QC in at least three of the five MEC cohorts. Third, the LD pattern used for categorical assignment of top SNPs based on strength of association of neighboring SNPs in LD depended on the population that contributed most strongly to the meta-analyzed P value; the HapMap reference panel corresponding to the major ancestry of the population was used.
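For readers unfamiliar with sample-size-weighted meta-analysis, the following sketch shows the standard weighted Z-score combination, in which each cohort is weighted by the square root of its (effective) sample size. It is an illustration of the general technique and is assumed, not confirmed, to correspond to the scheme used in Metal; the function names are hypothetical. A SNP that passed QC in only some cohorts is handled by passing only the available cohorts.

```python
from math import sqrt
from statistics import NormalDist

_NORM = NormalDist()  # standard normal distribution


def signed_z(p_two_sided, direction):
    """Convert a two-sided P value and an effect direction (+1 or -1) to a Z score."""
    return direction * _NORM.inv_cdf(1.0 - p_two_sided / 2.0)


def meta_z(z_scores, sample_sizes):
    """Combine per-cohort Z scores with weights equal to sqrt(sample size)."""
    weights = [sqrt(n) for n in sample_sizes]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = sqrt(sum(w * w for w in weights))
    return numerator / denominator


def two_sided_p(z):
    """Two-sided P value for a Z score."""
    return 2.0 * (1.0 - _NORM.cdf(abs(z)))
```

With this weighting, two equally sized cohorts that each show Z = 2 in the same direction combine to Z = 2√2, while equal effects in opposite directions cancel.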
We compared two 3-stage designs, with and without pooling (Fig. 3b). Without pooling, we assumed samples were individually genome-wide genotyped in the first stage, and then individually custom-genotyped in the second and third stages. With pooling, we assumed that samples were individually genome-wide genotyped in the first stage, genome-wide genotyped in pools in the second stage, and then individually custom-genotyped in the third stage.
We first determined the most cost-efficient 3-stage design without pooling any samples. We assumed a cohort of 25,000 cases and 25,000 selected controls, a disease prevalence of 0.1, a disease variant frequency of 0.3, and a genotypic relative risk (GRR) of 1.1 under the multiplicative model. Using these parameters, we obtained the expected AF among the cases and the controls from the “case–control for discrete traits” module of the genetic power calculator (http://pngu.mgh.harvard.edu/~purcell/gpc/). We systematically iterated over different permutations of individuals apportioned to each of stages 1 and 2 (n1, n2), in increments of 1,000, with the remaining individuals assigned to stage 3 (n3). The case-to-control ratio was 1 in all stages. For each permutation of n1, n2, and n3, the power at each stage was computed for 17 different levels of alpha (a1, a2, and a3: 1 × 10^-7, 5 × 10^-7, 1 × 10^-6, 5 × 10^-6, 1 × 10^-5, 5 × 10^-5, 1 × 10^-4, 5 × 10^-4, 0.001, 0.005, 0.01, 0.05, 0.1, 0.2, 0.3, 0.4, and 0.5). Two-tailed power was calculated for stage 1; one-tailed power was calculated for stages 2 and 3. The total power of the 3-stage design is the product of the power at each of the three stages and was calculated for all permutations of alpha levels for each stage.
The pricing for genome-wide genotyping using the Affymetrix 6.0 SNP array and for custom genotyping using the Sequenom MassARRAY platform was obtained from the Children’s Hospital Boston Microarray Core and SNP Genotyping Core Facilities as of February 2010. The full price of the Affymetrix 6.0 array (~1 million SNPs) for outside investigators was $650 per sample; the custom genotyping cost was $1,100 per genotyping plate of 384 individuals plus $20 per SNP primer design (typically 28–35 SNPs are designed and assayed together). These figures translate to a cost of ~$0.00065 per genotype for the stage 1 simulation and ~$0.13 to $0.15 per genotype for custom genotyping in subsequent stages. For simplicity, we assumed a 200× difference between the per genotype cost of genome-wide genotyping and custom genotyping (i.e., $0.00065 and $0.13). This estimated pricing includes the labor costs, but not the analytical costs.
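The power calculation can be approximated as follows. This sketch is not the genetic power calculator: it derives the expected case and control AF directly from the multiplicative model under Hardy-Weinberg equilibrium and uses a normal approximation to the allelic test, both of which are assumptions made here for illustration. All function names are hypothetical, and each stage is described by its number of cases, with an equal number of controls.

```python
from statistics import NormalDist

_NORM = NormalDist()  # standard normal distribution


def _allele_freq(genotype_freqs):
    """Risk-allele frequency from frequencies of 0, 1, and 2 risk-allele copies."""
    return genotype_freqs[2] + 0.5 * genotype_freqs[1]


def case_control_af(prevalence, freq, grr):
    """Expected risk-allele frequency in cases and in unaffected controls."""
    q = 1.0 - freq
    geno = [q * q, 2.0 * freq * q, freq * freq]   # Hardy-Weinberg proportions
    rel = [1.0, grr, grr * grr]                   # multiplicative model
    baseline = prevalence / sum(g * r for g, r in zip(geno, rel))
    pen = [baseline * r for r in rel]             # penetrance per genotype
    cases = [g * f / prevalence for g, f in zip(geno, pen)]
    ctrls = [g * (1.0 - f) / (1.0 - prevalence) for g, f in zip(geno, pen)]
    return _allele_freq(cases), _allele_freq(ctrls)


def stage_power(af_case, af_ctrl, n_cases, n_ctrls, alpha, two_tailed):
    """Normal-approximation power of an allelic test at one stage."""
    pooled = (af_case * n_cases + af_ctrl * n_ctrls) / (n_cases + n_ctrls)
    se = (pooled * (1.0 - pooled)
          * (1.0 / (2 * n_cases) + 1.0 / (2 * n_ctrls))) ** 0.5
    shift = abs(af_case - af_ctrl) / se
    if two_tailed:
        crit = _NORM.inv_cdf(1.0 - alpha / 2.0)
        return _NORM.cdf(shift - crit) + _NORM.cdf(-shift - crit)
    crit = _NORM.inv_cdf(1.0 - alpha)
    return _NORM.cdf(shift - crit)


def total_power(af_case, af_ctrl, cases_per_stage, alphas):
    """Product of stage powers: two-tailed in stage 1, one-tailed afterwards."""
    power = 1.0
    for stage, (n, alpha) in enumerate(zip(cases_per_stage, alphas)):
        power *= stage_power(af_case, af_ctrl, n, n, alpha,
                             two_tailed=(stage == 0))
    return power
```

A full search would call `total_power` for every apportionment of individuals across the stages and every combination of the 17 alpha levels, retaining the designs that reach the target power.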
The total cost for each all-individual 3-stage design was based on genome-wide genotyping in stage 1, with custom genotyping in the individuals apportioned to subsequent stages only for SNPs that would surpass the alpha level in the previous stage. We considered the optimized design to be the one with the lowest cost that achieved at least 80% power with <1 false positive (the number of SNPs expected to reach the alpha levels for each stage under the null).
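The cost and false-positive bookkeeping for an all-individual design can be sketched as below. The sketch assumes that the array interrogates 1,000,000 SNPs, that essentially all SNPs are null so that the number carried forward from a stage is the number of SNPs entering it multiplied by that stage's alpha, and that primer design costs are absorbed into the per-genotype price; the names and the example design are hypothetical.

```python
N_SNPS = 1_000_000                  # approximate array content (assumption)
COST_GENOMEWIDE = 650.0 / N_SNPS    # ~$0.00065 per genotype
COST_CUSTOM = 200 * COST_GENOMEWIDE # ~$0.13 per genotype (200x difference)


def design_cost(individuals_per_stage, alphas,
                n_snps=N_SNPS,
                cost_gw=COST_GENOMEWIDE,
                cost_custom=COST_CUSTOM):
    """Return (total cost in dollars, expected false positives).

    individuals_per_stage: (n1, n2, n3), total individuals genotyped per stage.
    alphas: (a1, a2, a3), significance thresholds per stage.
    """
    n1, n2, n3 = individuals_per_stage
    a1, a2, a3 = alphas
    snps_stage2 = n_snps * a1        # SNPs expected to pass stage 1 under the null
    snps_stage3 = snps_stage2 * a2   # SNPs expected to pass stage 2 under the null
    cost = (n1 * n_snps * cost_gw
            + n2 * snps_stage2 * cost_custom
            + n3 * snps_stage3 * cost_custom)
    false_positives = snps_stage3 * a3
    return cost, false_positives
```

For a hypothetical design with 10,000, 20,000, and 20,000 individuals and alpha levels of 0.001, 0.01, and 0.001, this gives $6,500,000 for stage 1, $2,600,000 for stage 2, and $26,000 for stage 3, with 0.01 expected false positives.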
The most cost-efficient design involving pooled genotyping in stage 2 was determined similarly, with the following modifications. When determining the power of the second stage, we needed to consider the effect of pooling. Thus, we multiplied the power due to genotyping n2 individuals at alpha level a2 by the probability that the association statistic of a SNP reaching a2 would be ranked above a certain level of quantile-normalized P value in pooling (pq; 14 levels were considered: 0.0025, 0.005, 0.01, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, and 1). Pr(pq|a2) was modeled empirically using an African-American cohort previously individually genotyped and pooled according to high and low BMI (Chiang et al. 2010; Kang et al. 2010). At each level of pq and a2, Pr(pq|a2) was calculated as the number of SNPs with actual P value <a2 and one-tailed quantile-normalized P value <pq divided by all SNPs with actual P value <a2. All SNPs with one-tailed quantile-normalized P value <pq during stage 2 would be followed up in stage 3. When determining the cost of each 3-stage design, individuals in stage 2 were assumed to be pooled in groups of 150 and genotyped in triplicate using the genome-wide genotyping array. Furthermore, in order to maintain the same level of power in the 3-stage design involving pooling, often an increased number of SNPs would reach the specified alpha level under the null. To penalize the increased false positive rate, these SNPs would be individually genotyped in all individuals apportioned to stage 2, and the cost would be added to the overall cost of the design. The design achieving 80% total power with the lowest cost was then deemed the optimized 3-stage design involving pooling.
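The empirical estimate of Pr(pq|a2) and the array cost of a pooled second stage can be sketched as follows. The inputs are assumed to be paired per-SNP P values from individual genotyping and from pooling; the function names are hypothetical, and the assumption that cases and controls are pooled separately, with any partial pool counted as a full pool, is made here for illustration.

```python
from math import ceil


def prob_pool_given_alpha(actual_p, pooled_p, a2, pq):
    """Fraction of SNPs with actual P < a2 whose pooled P is also < pq.

    actual_p: P values from individual genotyping, one per SNP.
    pooled_p: one-tailed quantile-normalized P values from pooling, same order.
    Returns NaN if no SNP reaches a2.
    """
    reaching = [pp for ap, pp in zip(actual_p, pooled_p) if ap < a2]
    if not reaching:
        return float("nan")
    return sum(pp < pq for pp in reaching) / len(reaching)


def pooled_stage2_power(individual_power, pr_pq_given_a2):
    """Stage-2 power with pooling: individual-genotyping power times Pr(pq|a2)."""
    return individual_power * pr_pq_given_a2


def pooled_array_cost(n_cases, n_controls,
                      pool_size=150, replicates=3, array_price=650.0):
    """Array cost of genotyping stage-2 pools in replicate."""
    n_pools = ceil(n_cases / pool_size) + ceil(n_controls / pool_size)
    return n_pools * replicates * array_price
```

Under these assumptions, 3,000 cases and 3,000 controls form 40 pools, which in triplicate require 120 arrays at a cost of $78,000. The penalty for additional false positives described above would be added to this figure.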
In evaluating the cost-efficiency involving pooling under scenarios where genotyping costs have decreased, we assumed the per genotype cost using a genome-wide array to be $0.000325 and $0.000163, or the per genotype cost using custom genotyping to be $0.065 or $0.0325. To simulate scenarios with 2× and 3× improved pooling efficiency, we assumed that the same level of Pr(pq|a2) could be achieved with pq that is two times or three times smaller, respectively.
Twenty-one candidate SNPs associated with BMI (18 of the top 36 encouraging SNPs and three of the top seven inconclusive SNPs) were successfully genotyped in the individuals comprising the GxE and SPT pools. Genotyping was performed as described elsewhere (Gabriel et al. 2002), using the Sequenom MassARRAY platform (Tang et al. 1999) with the iPLEX protocol. The basic protocol involves a multiplex primer extension followed by matrix-assisted laser desorption ionization-time of flight mass spectroscopy detection (Tang et al. 1999). SNPs were considered working if they genotyped successfully in 90% or more of the samples and had no more than one consensus error among 32 triplicates in SPT and 34 triplicates in GxE (for a total of 96 and 102 internal comparisons per SNP, respectively). SNPs were evaluated by these parameters in the SPT and GxE cohorts independently. Association tests for candidate SNPs associated with BMI were conducted using PLINK v1.05 (http://pngu.mgh.harvard.edu/purcell/plink/; Purcell et al. 2007).
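The criteria for a working SNP can be expressed as a simple check. The sketch below assumes that a consensus error is a replicate call that disagrees with the most common genotype of its triplicate, which yields three comparisons per triplicate (96 for 32 triplicates); this definition, the data layout, and the function names are assumptions made for illustration.

```python
from collections import Counter


def consensus_errors(triplicates):
    """Count replicate calls that disagree with their triplicate's consensus.

    triplicates: iterable of 3-tuples of genotype calls (None for a no-call).
    """
    errors = 0
    for calls in triplicates:
        called = [g for g in calls if g is not None]
        if len(called) < 2:
            continue  # nothing to compare within this triplicate
        consensus, _ = Counter(called).most_common(1)[0]
        errors += sum(g != consensus for g in called)
    return errors


def is_working(genotypes, triplicates, min_call_rate=0.90, max_errors=1):
    """Apply the call-rate and consensus-error criteria to one SNP in one cohort."""
    call_rate = sum(g is not None for g in genotypes) / len(genotypes)
    return (call_rate >= min_call_rate
            and consensus_errors(triplicates) <= max_errors)
```

Because the SNPs were evaluated in the SPT and GxE cohorts independently, such a check would be run once per cohort for each SNP.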
The principal acknowledgement is to the participants who contributed their time, biological samples, and phenotype data to the different projects. The authors also thank P.I. de Bakker, P. Sklar, and past and present members of the Hirschhorn laboratory for comments, ideas, and discussions, and the past and present team members and investigators at the Tropical Medicine Research Institute and the Hawai’i and Los Angeles Multi-Ethnic Cohort for collecting samples and data. This work was supported by a graduate research fellowship from the National Science Foundation to C.W.K.C. and grants from the National Institutes of Health to J.N.H. (R01DK075787), to R.S.C. (R37HL45508 and R01HL53353), to X.Z. (R01HL074166), and to M.R.P. (R01HD048960). The funders had no role in study design, data collection, analysis, interpretation, decision to publish, or preparation of the manuscript.
Charleston W. K. Chiang, Department of Genetics, Harvard Medical School, Boston, MA 02115, USA. Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA. Program in Genomics and Divisions of Genetics and Endocrinology, Children’s Hospital, Boston, MA, USA.
Zofia K. Z. Gajdos, Department of Genetics, Harvard Medical School, Boston, MA 02115, USA. Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA. Program in Genomics and Divisions of Genetics and Endocrinology, Children’s Hospital, Boston, MA, USA.
Joshua M. Korn, Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA. Center for Human Genetic Research, Massachusetts General Hospital, Boston, MA, USA. Department of Molecular Biology, Massachusetts General Hospital, Boston, MA, USA.
Johannah L. Butler, Program in Genomics and Divisions of Genetics and Endocrinology, Children’s Hospital, Boston, MA, USA.
Rachel Hackett, Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA.
Candace Guiducci, Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA.
Thutrang T. Nguyen, Program in Genomics and Divisions of Genetics and Endocrinology, Children’s Hospital, Boston, MA, USA.
Rainford Wilks, Epidemiology Research Unit, Tropical Medicine Research Institute, University of the West Indies, Kingston, Jamaica.
Terrence Forrester, Tropical Metabolism Research Unit, Tropical Medicine Research Institute, University of the West Indies, Kingston, Jamaica.
Katherine D. Henderson, Division of Cancer Etiology, Department of Population Sciences, City of Hope National Medical Center, Duarte, CA, USA.
Loic Le Marchand, Epidemiology Program, Cancer Research Center of Hawaii, University of Hawaii, Honolulu, HI, USA.
Brian E. Henderson, Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA.
Christopher A. Haiman, Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA.
Richard S. Cooper, Department of Preventive Medicine and Epidemiology, Stritch School of Medicine, Loyola University Chicago, Maywood, IL, USA.
Helen N. Lyon, Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA. Program in Genomics and Divisions of Genetics and Endocrinology, Children’s Hospital, Boston, MA, USA.
Xiaofeng Zhu, Department of Biostatistics and Epidemiology, Case Western Reserve University, Cleveland, OH, USA.
Colin A. McKenzie, Tropical Metabolism Research Unit, Tropical Medicine Research Institute, University of the West Indies, Kingston, Jamaica.
Mark R. Palmer, Division of Endocrinology, The Hospital for Sick Children, Toronto, ON, Canada. Department of Pediatrics, University of Toronto, Toronto, ON, Canada.
Joel N. Hirschhorn, Department of Genetics, Harvard Medical School, Boston, MA 02115, USA. Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA. Program in Genomics and Divisions of Genetics and Endocrinology, Children’s Hospital, Boston, MA, USA. Children’s Hospital Boston, CLS 16065, 300 Longwood Avenue, Boston, MA 02115, USA.