The insulin-like growth factor (IGF) pathway has been implicated in prostate development and carcinogenesis. Using a resequencing and tagging single-nucleotide polymorphism (SNP) approach, we conducted a comprehensive analysis of the association of common genetic variation in the IGF1, IGF binding protein (BP) 1, and IGFBP3 genes with IGF-I and IGFBP-3 blood levels and with prostate cancer (PCa) risk among Caucasians in the NCI Breast and Prostate Cancer Cohort Consortium. We genotyped 14 IGF1 SNPs and 16 IGFBP1/IGFBP3 SNPs to capture common [minor allele frequency (MAF) ≥ 5%] variation among Caucasians. For each SNP, we assessed the geometric mean difference in IGF blood levels (N = 5684) across genotypes and the association with PCa risk (6012 PCa cases/6641 controls). We present two-sided statistical tests and correct for multiple comparisons. A non-synonymous IGFBP3 SNP in exon 1, rs2854746 (Gly32Ala), was associated with IGFBP-3 blood levels (Padj = 8.8 × 10−43) after adjusting for the previously established IGFBP3 promoter polymorphism A-202C (rs2854744); IGFBP-3 blood levels were 6.3% higher for each minor allele. For IGF1 SNP rs4764695, the risk estimate for overall PCa was 1.01 (99% CI: 0.90–1.14) among heterozygotes and 1.20 (99% CI: 1.06–1.37) among variant homozygotes. The corrected allelic P-value was 8.7 × 10−3. IGF-I levels were significantly associated with PCa risk (Ptrend = 0.02), with a 21% increase in PCa risk when comparing the highest quartile with the lowest quartile. We have identified SNPs significantly associated with IGFBP-3 blood levels, but none of these alter PCa risk; however, a novel IGF1 SNP, not associated with IGF-I blood levels, shows preliminary evidence for association with PCa risk among Caucasians.
The role of the insulin-like growth factor (IGF) pathway has been studied extensively in both normal and transformed cells. Both in vivo (1–3) and in vitro (4–6) studies demonstrate that IGF-I binding to the IGF type 1 receptor modulates cellular proliferation, differentiation and apoptosis—important characteristics in tumorigenesis (7–11). Circulating levels of IGF-I derive mostly from the liver; more than 90% is complexed with IGF-binding protein 3 (IGFBP-3) and an acid-labile subunit thus reducing bioavailability (12,13). However, many types of tissues, including certain neoplasms (10), are capable of producing IGF-I locally. Although the main effect of IGFBP-3 is thought to be inhibition of cell growth and proliferation due to sequestration of the IGF-I ligand, recent research suggests that IGFBP-3 has antiproliferative and proapoptotic effects independent of IGF-I (14,15).
Elevated blood levels of IGF-I have been associated with several cancers, most commonly with prostate cancer (PCa), although later studies have found weaker associations than initially reported (16–23). A recent meta-analysis of 12 prospective studies reported a 38% increased risk of developing PCa when comparing the highest to lowest quartile of IGF-I levels (24). Although circulating IGFBP-3 levels were inversely associated with PCa risk in earlier studies, recent findings have been mostly null (23–27).
Nutrition remains a key determinant of circulating IGF-I levels (28,29), but heritability studies have estimated that the proportion of variance explained by inherited genetic variation ranges from 38 to >80% for IGF-I and IGFBP-3 blood levels, respectively (30–34). The specific genetic variants that contribute to heritable risk are not well defined. Reported associations between an upstream IGF1 repeat sequence (CA)n and IGF-I blood levels have varied (35–38), and most studies, including a recently published meta-analysis (39), have reported a null association (40–44). In a case–control study, Johansson et al. (45) recently reported a marginally significant association (P = 0.02) between an IGF1 haplotype, previously reported to be associated with PCa risk, and IGF-I blood levels among controls in the Cancer Prostate in Sweden (CAPS) study (46); however, the authors were unable to replicate this haplotype–IGF-I association in a prospective study. In contrast, significantly decreased IGFBP-3 blood levels among carriers of the C allele of the IGFBP3 A-202C promoter polymorphism (rs2854744) have been observed across multiple studies of both men and women (39,41,43,47–54). Additionally, an in vitro transient transfection assay demonstrated that the C allele had 50% lower activity than the A allele (55).
Most previous genetic studies of the IGF1 locus with PCa risk have focused on an upstream IGF1 (CA)n repeat with equivocal results (56–60). Recently, two investigations comprehensively examined IGF1 genetic variation with PCa risk selecting single-nucleotide polymorphisms (SNPs) by public databases (i.e. HapMap), exonic resequencing or both. In the Multiethnic Cohort (MEC) study, Cheng et al. (61) identified an association with an upstream IGF1 SNP (rs7965399) and PCa, whereas in the CAPS study, Johansson et al. (45) used public databases to report a significant increased risk of PCa for single-copy carriers of an IGF1 haplotype spanning intron 2 through the 3′ UTR. The relationship between common genetic variation in the IGFBPs and PCa risk has only been thoroughly examined by Cheng et al. (62) in the MEC; no association was found between IGFBP1 and IGFBP3 polymorphisms with PCa risk.
We conducted a comprehensive haplotype tag-SNP analysis of the common genetic variation in IGF1, IGFBP1 and IGFBP3 in relation to IGF-I and IGFBP-3 blood levels and PCa risk among Caucasians in the NCI Breast and Prostate Cancer Cohort Consortium (BPC3), a pooled nested case–control study from seven cohorts (63). The large sample size of the BPC3 having 6012 prospective PCa cases and 6641 controls allows us to detect modest genetic effects and to assess effect modification. In addition, we are able to examine risk in clinically important subgroups of PCa defined by stage and Gleason score at diagnosis.
Characteristics of the studies within the BPC3 are presented in Table 1. Briefly, the cases and controls were comparable across cohorts with respect to demographic and other potentially PCa-related factors, with the exception of height where the ATBC Study (Finnish population) and EPIC cohorts (eight European countries) are shorter in stature. Family history was not available for the EPIC and PHS cohorts. PCa clinical information such as stage (63%) and Gleason score (56%) was available for more than half of the cases.
The IGF1 locus was characterized by four haplotype blocks (Fig. 1, bottom panel). The IGFBP1 and IGFBP3 loci are near each other, separated by 19 kb, and were characterized by three haplotype blocks defined by 12 htSNPs (Fig. 2, bottom panel). We included four additional IGFBP3 SNPs. The average genotyping success rate across cohorts was 94.7% (ranging from 84.4 to 99.8%). No deviation from Hardy–Weinberg equilibrium was observed for any SNP (at the P < 0.01 level) among controls across more than one study. Minor allele frequencies among controls were similar across the cohorts, although the ATBC (Finnish) cohort differed slightly for a few markers (IGF1: Supplementary Material, Table S1; IGFBP1 and IGFBP3: Supplementary Material, Table S2).
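The Hardy–Weinberg check mentioned above can be illustrated with a minimal sketch. It assumes a 1-degree-of-freedom Pearson chi-square test on genotype counts; the text does not state which test was actually used, and the function name and the genotype counts below are hypothetical.

```python
import math

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Return (chi-square statistic, P-value) for a 1-df Hardy-Weinberg test."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p                        # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 df:
    # P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical control genotype counts (not taken from the study).
chi2, p_value = hwe_chi_square(260, 500, 240)
deviates_from_hwe = p_value < 0.01  # threshold quoted in the text
```

With these illustrative counts the observed genotype distribution is very close to its Hardy–Weinberg expectation, so the marker would not be flagged at the P < 0.01 level.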
Individual cohort and pooled values of log-transformed IGF-I blood levels were similar as were the case and control levels within each cohort (Supplementary Material, Fig. S1). In a pooled analysis, 3 of 14 htSNPs were nominally associated with IGF-I blood levels (Fig. 1, green triangles): rs35767 (Block 1; Puncorr = 7.9 × 10−3), rs12821878 (Block 2; Puncorr = 1.0 × 10−3) and rs1549593 (Block 3; Puncorr = 3.3 × 10−3). As shown in Figure 1, the pair-wise linkage disequilibrium (LD) for these three markers is negligible except between rs12821878 and rs1549593 (r2 = 0.27 among PLCO controls). The geometric means and 95% CIs by SNP and haplotypes are presented in Supplementary Material, Tables S3 and S4. After correcting for multiple comparisons, none of the associations between IGF1 markers and IGF-I blood levels remained statistically significant (Pcorr > 0.07).
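For readers less familiar with the pair-wise LD measure r2 quoted above, the following sketch shows the standard definition for two biallelic markers. It assumes that haplotype and allele frequencies are already known; with unphased genotype data, as here, the haplotype frequency would first have to be estimated. The frequencies used are hypothetical.

```python
def ld_r_squared(p_ab, p_a, p_b):
    """Squared correlation between alleles A and B at two biallelic loci.

    p_ab: frequency of the haplotype carrying both A and B
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))

# Hypothetical frequencies (not taken from the study).
r2_example = ld_r_squared(p_ab=0.20, p_a=0.30, p_b=0.40)
r2_independent = ld_r_squared(p_ab=0.25, p_a=0.50, p_b=0.50)
r2_perfect = ld_r_squared(p_ab=0.50, p_a=0.50, p_b=0.50)
```

A value of 0 corresponds to independent markers and a value of 1 to markers that are perfect proxies for one another.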
The distributions of IGFBP-3 blood levels were not as uniform as IGF-I blood levels across the cohorts (Supplementary Material, Fig. S2). Within each cohort, case and control blood levels of IGFBP-3 were very similar. The geometric means and 95% CIs by SNP and haplotypes are presented in Supplementary Material, Tables S5 and S6.
Among the 12 htSNPs and 4 candidate SNPs in IGFBP1 and IGFBP3, six were significantly associated with blood levels before and after adjusting for multiple comparisons (Fig. 2, green triangles). For the most strongly associated htSNP, rs2854746 (Pcorr = 8.8 × 10−43), the mean IGFBP-3 blood level was 3046 ng/ml for wild-type homozygotes, 3263 ng/ml for heterozygotes and 3442 ng/ml for variant homozygotes. The extensively studied IGFBP3 promoter polymorphism rs2854744 (A-202C) was also strongly associated with IGFBP-3 blood levels in the univariate analysis (Pcorr = 8.1 × 10−34). However, after simultaneously including all six significant IGFBP3 SNPs in a multi-SNP linear regression, only rs2854746 remained statistically significant overall (Puncorr = 1.4 × 10−10) and among controls (Puncorr = 4.9 × 10−8), while rs2854744 (A-202C) was no longer associated with IGFBP-3 blood levels (P = 0.91). Stratified analysis of the IGFBP3 htSNPs rs2854746 and rs2854744 (A-202C) showed a consistent result such that across the three genotypes for rs2854746, the mean IGFBP-3 blood levels remained unchanged by rs2854744 (A-202C) genotypes (Table 2). However, the differences in mean IGFBP-3 blood levels were statistically significant across rs2854746 in the heterozygote (CA) and variant homozygote (AA) strata defined by rs2854744 (A-202C) genotypes (PCA = 2.7 × 10−6; PAA = 4.1 × 10−6).
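As a rough consistency check on the genotype-specific means reported above for rs2854746, a per-allele percent difference can be approximated by fitting a straight line to the log of the three means. This is only a sketch: it ignores genotype counts and the covariate adjustment described in the Methods, so it is not the authors' regression, and the function name is hypothetical.

```python
import math

# Mean IGFBP-3 levels (ng/ml) reported in the text, keyed by the
# number of minor alleles carried at rs2854746.
reported_means = {0: 3046.0, 1: 3263.0, 2: 3442.0}

def per_allele_percent_change(means_by_allele_count):
    """Unweighted least-squares slope of log(mean) on allele count,
    expressed as a percent change per additional allele."""
    xs = sorted(means_by_allele_count)
    ys = [math.log(means_by_allele_count[x]) for x in xs]
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return (math.exp(slope) - 1) * 100

percent_per_allele = per_allele_percent_change(reported_means)
```

Under these simplifying assumptions the result is approximately 6.3% per minor allele.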
The associations between the 14 IGF1 htSNPs and overall PCa risk among Caucasians are presented in the upper panel of Figure 1 (red circles). The tests for heterogeneity (Supplementary Material, Table S7) across cohorts were not statistically significant, thus we present the pooled results for the main effect analyses. We found nominally significant associations between IGF1 SNPs rs2373722 (Puncorr = 2.0 × 10−3) and rs4764695 (Puncorr = 1.2 × 10−4) with overall PCa risk (Supplementary Material, Table S8). After controlling for multiple comparisons, the corrected P-value for rs2373722 was no longer significant (Pcorr = 0.14), whereas rs4764695 remained statistically significant (Pcorr = 8.7 × 10−3). For rs4764695, the overall risk estimate was 1.01 (99% CI: 0.90–1.14) for heterozygotes and 1.20 (99% CI: 1.06–1.37) for variant homozygotes, with consistent point estimates for heterozygotes and variant homozygotes across the cohorts (Fig. 3). The results remained unchanged when previously published data from the MEC were excluded from the analysis (Puncorr = 2.8 × 10−4; Supplementary Material, Table S8). IGF1 marker rs4764695 was not significantly associated with IGF-I blood levels. Testing for effect modification by several variables of interest [family history of PCa, age at diagnosis, body mass index (BMI) and height] revealed no statistically significant heterogeneity in any of the subgroup analyses for the 14 htSNPs (data not shown).
We examined the risk association of each SNP stratified by stage at diagnosis (high stage: C or D; low stage: A or B) and Gleason score at diagnosis (high grade: ≥8; low grade: <8) when compared with controls (data not shown). Among the 14 IGF1 SNPs, rs4764695 was the only marker remaining statistically significant (Puncorr < 0.01) across all of the strata except for stage (Table 3). The risk estimate was slightly greater for high-grade cancer (Gleason score ≥8, OR = 1.43) than low-grade cancer (Gleason score 2–7, OR = 1.19) for variant homozygotes, but the test of heterogeneity was only marginally statistically significant (P = 0.065).
The cohort-specific haplotype frequencies for IGF1 among Caucasians in the MEC panel and other six cohorts are shown in Supplementary Material, Table S9, and the pooled and cohort-specific associations of these haplotypes and overall PCa risk are presented in Supplementary Material, Table S10. The tests for heterogeneity by cohort across the haplotype blocks were not statistically significant. We observed statistically significant global P-values for IGF1 block 3 (Puncorr = 0.002) and block 4 (Puncorr = 0.004). Since SNP rs2373722 resides within block 3 and rs4764695 resides within block 4, we removed both SNPs from their respective blocks and found that neither block remained statistically significant with PCa risk (data not shown), suggesting that the haplotype analysis did not provide any additional insight beyond the SNP analysis. None of the haplotype analyses were significant for effect modification by family history, age at diagnosis, BMI, height or PCa diagnostic variables (data not shown).
Figure 2 shows the results for overall PCa risk for the 12 htSNPs and 4 additional IGFBP1 and IGFBP3 SNPs (red circles). None of the 16 SNPs were nominally associated with PCa risk at the Puncorr < 0.01 level. The IGFBP3 markers significantly associated with blood levels were not associated with PCa risk. The cohort-specific results are presented in Supplementary Material, Table S11, and the IGFBP1 and IGFBP3 SNP analyses with and without the previously reported MEC samples are available in Supplementary Material, Table S12. None of the tests for heterogeneity (by cohort) or effect modification (family history, age at diagnosis, BMI and height in tertiles and quartiles) were statistically significant in any of the subgroup analyses (data not shown).
The haplotype frequencies within the MEC panel and by cohort are presented in Supplementary Material, Table S13. Similar to the IGFBP1 and IGFBP3 SNP analyses, neither the global tests for haplotype blocks nor the test for any individual haplotype were statistically significant (Supplementary Material, Table S14) in the pooled or cohort-specific analyses.
Substantial epidemiologic and experimental evidence exists implicating the IGF pathway in prostate carcinogenesis. Although many studies have demonstrated an increased risk of PCa, especially advanced disease, among men with high IGF-I blood levels (16–18,27), comprehensive genetic analyses of the IGF pathway with PCa risk are limited. In this large consortium (n = 6012 PCa cases) from seven prospective studies, we conducted a thorough analysis of common genetic variation of three primary loci in the IGF pathway (IGF1, IGFBP1 and IGFBP3) and observed among Caucasians an association between an IGF1 SNP (rs4764695, MAF = 0.49, ~34-kb downstream of IGF1 exon 4) and PCa risk following multiple testing corrections. Men carrying the homozygous variant had a 20% higher risk of developing PCa (OR = 1.20; 99% CI: 1.06–1.37; Padj = 8.69 × 10−3) compared with those with the wild-type genotype, controlling for age in 5-year intervals, cohort and country of residence for EPIC.
The IGF1 htSNPs presented here are a subset of the IGF1 htSNPs reported in a study by the MEC. The MEC authors did not find a significant association between PCa risk and rs4764695 (P > 0.30) but they had limited power among whites (23%) for this marker (61). They reported a marginally significant association between PCa and heterozygotes for an IGF1 htSNP (rs7965399) located in the 5′ region (ORhet = 1.25; 95% CI 1.09–1.43), whereas we observed a null association between rs7965399 and PCa with (P = 0.334) and without (P = 0.689) the MEC Caucasian participants. The majority of the MEC-nested PCa case–control study, however, is non-Caucasian (African American = 28%, Latino = 28%, Japanese = 20%, native Hawaiian = 3%) and the different risk associations among overlapping IGF1 htSNPs suggest racial/ethnic differences within this region, a causative SNP that remains unidentified, or false-positive findings. The IGF1 htSNP results for African Americans presented by Cheng et al. remained unchanged when we included an additional 105 African American PCa cases from PLCO (data not shown).
The Swedish CAPS case–control study (N = 2863 PCa cases, all Caucasian) also evaluated IGF1 genetic variation with PCa by genotyping SNPs identified in the HapMap Phase I data (45). A marginally significant association was reported for carriers of one copy of a haplotype spanning from intron 2 to the 3′ UTR of IGF1 with PCa risk (P = 0.02 adjusting for multiple testing) corresponding to a similar region covered by our IGF1 haplotype block 3. Although the three SNPs in the CAPS haplotype (rs2033178, rs7136446 and rs6220) were not included in our IGF1 htSNP panel, our IGF1 htSNP rs2373722 is in perfect LD (r2 = 1.00 in HapMap CEPH) with the CAPS SNP rs2033178 (45). Our block 3 haplotype association was mainly driven by the htSNP rs2373722, which was null after correcting for multiple testing (Padj = 0.14). Neither the SNP rs4764695 identified in our study nor any equivalent proxy was included in the CAPS study.
Although IGF1 SNP rs4764695 was significantly associated with prostate cancer risk and the main effect was slightly greater for high tumor grade (high-grade OR = 1.43; low-grade OR = 1.20; Phet = 0.065), the association with IGF-I blood levels was null. Among the six BPC3 studies with IGF blood levels in this report, elevated IGF-I levels were significantly associated with higher PCa risk (Ptrend = 0.02); the comparison between the highest IGF-I quartile with the lowest quartile yielded a 21% increased risk of PCa (P = 0.02). In contrast, the association between IGFBP-3 blood levels and PCa was null (Ptrend = 0.89). The IGF-I finding in the BPC3 is consistent with results of a recent meta-analysis of 42 studies (OR = 1.21; 95% CI 1.07–1.36) (64). In the same meta-analysis, data from 29 studies showed a significant inverse association for IGFBP-3 (OR 0.88; 95% CI 0.79–0.98) with substantial heterogeneity; the inverse association of IGFBP-3 with PCa risk was seen in retrospective studies, but not prospective studies (64).
Several reasons may explain the lack of association between rs4764695 and IGF-I blood levels. First, the IGF-I measurements reflect systemic levels measured at a single time point prior to PCa diagnosis rather than tissue-specific levels. Free IGF-I blood levels, unavailable across these studies, may be a more biologically relevant measure and impacted by rs4764695. Furthermore, the complexity of the IGF pathway is likely not entirely captured in these simple associations, and a more complete pathway analysis is warranted. As the rs4764695 marker lies ~34 kb downstream of IGF1, this marker may be involved in another pathway entirely. In addition, measurement error for both the genotypes and IGF-I blood levels would lead to non-differential misclassification and a bias towards the null. Lastly, the result for rs4764695 may be spurious, although we have taken steps by setting a stringent alpha level (0.01) and correcting for multiple comparisons.
We identified several IGFBP3 SNPs strongly associated with IGFBP-3 blood levels among Caucasians but not associated with PCa risk. Given that considerable LD exists within the 5′ region (Fig. 2) of IGFBP3, we tested the independent effect of six correlated SNPs using a multi-marker model and determined that the most significant SNP was rs2854746, a non-synonymous polymorphism in exon 1 (Gly32Ala). This is in contrast to the promoter polymorphism rs2854744 (A-202C) that has been extensively reported to be associated with IGFBP-3 levels (39,41,43,47–54). This observation has been alluded to in two previous reports (50,53), but could not be substantiated due to the limited sample sizes and strong LD between these two markers among Caucasians (r2 = 0.85 among Caucasian PLCO controls). Although each minor allele was associated with 6.3% higher IGFBP-3 blood levels on average, rs2854746 explains only 3.6% of the variation. The lack of an association between rs2854746 and PCa risk, despite its strong association with IGFBP-3 blood levels, supports a Mendelian randomization argument for no etiological effect of IGFBP-3 on incident risk of PCa (65,66). However, caution is needed because confounding or pleiotropic effects would negate this argument (67).
The major strength of this study is the utilization of a large cohort consortium and a comprehensive approach to examine the genetic variation across three genes in the IGF pathway, a strong candidate in prostate carcinogenesis. Specifically, we have the ability to look at the effects of SNPs on pre-diagnostic blood levels as well as risk in the same set of subjects. Although our data in Caucasians limits the generalizability to other ethnic/racial groups, our large sample size allows us to present the overall risk estimates using a 99% CI, reducing the chance of both false-positive and -negative results. In addition, we reduced the probability of a spurious association due to multiple hypothesis testing by applying a correction across all models, traits and genetic markers. The IGF pathway is a complex system and we have limited our study to three primary genes. Additional IGF pathway genes need to be investigated and a more comprehensive pathway analysis would be necessary.
While several genome-wide association scans (GWASs) have recently identified multiple susceptibility loci for PCa risk (68–76), only recently has an IGF variant (IGF2-rs7127900) been among them. The Cancer Genetic Markers of Susceptibility (CGEMS) GWAS is the only publicly available database against which to compare our IGF1 rs4764695 finding. Although rs4764695 was not present on the CGEMS platform, a proxy, rs1980236, showed a similar effect that was not statistically significant (P = 0.57). However, the PLCO study was the first stage of the CGEMS GWAS and more than 70% of this sample is included in this analysis, where the observed effect for rs4764695 was 1.11 (P = 0.15). The initial stages of the GWASs include fewer than 2000 PCa cases and have limited power, 67%, to detect an OR of 1.20 for an MAF of 0.50. In contrast, our large consortium study allows for focused and comprehensive evaluation of candidate genes among more than 6000 PCa cases, having the necessary power, 99%, to identify an OR of 1.20 for an MAF of 0.50. This is evidenced by multiple GWASs identifying common as well as different risk loci. Currently, only 6% of PCa genetic variation is explained by the known loci identified in GWASs (77). The missing heritability, aka ‘dark matter’, may reside undetected in the ‘lower Manhattan’ plots and be represented by multiple variants (78,79). For example, CGEMS replicated several variants, JAZF1 (GWAS rank 24 407; P = 0.04) and MSMB (GWAS rank 24 223; P = 0.04), with little evidence from the first stage. Furthermore, both in vivo (1–3) and in vitro (4–6) studies have demonstrated that the IGF pathway plays a role in tumorigenesis, thus making it a strong biological candidate.
In conclusion, a significant association between prostate cancer risk and an IGF1 SNP, rs4764695, was identified among Caucasians. Although this is a novel finding, the evidence is still preliminary and further confirmation is needed. The estimated population-attributable risk for homozygous variant carriers is ~5% due to the high frequency of the minor allele (G = 49%). This variant could be of greater importance due to the potential for a stronger association with high-tumor-grade PCa. The association of rs4764695 with cancer appears specific to prostate cancer, as the association of this marker with breast cancer risk has been reported as null in the NCI BPC3 study (54). Furthermore, we provide strong evidence for a novel association between IGFBP-3 blood levels and a non-synonymous IGFBP3 marker in contrast to the previously reported IGFBP3 promoter polymorphism. Additional studies, such as fine mapping to determine the causal variant in IGF1 and the examination of additional genes in the IGF axis, are needed. In summary, preliminary evidence implicates common genetic variation in the IGF1 locus with PCa risk.
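The order of magnitude of the population-attributable risk quoted above can be reproduced with a short sketch. It assumes Levin's formula, genotype frequencies at Hardy–Weinberg proportions for a minor allele frequency of 0.49, and the reported ORs (1.20 for variant homozygotes, 1.01 for heterozygotes) used as approximations of relative risk. The text does not state how the ~5% figure was derived, so the function name and this choice of formula are assumptions.

```python
def attributable_risk(exposures):
    """Levin-type population-attributable risk.

    exposures: iterable of (frequency, relative_risk) pairs, each
    relative to the same unexposed reference category.
    """
    excess = sum(f * (rr - 1.0) for f, rr in exposures)
    return excess / (1.0 + excess)

maf = 0.49                       # minor allele frequency from the text
freq_hom = maf ** 2              # variant homozygotes under HWE (assumed)
freq_het = 2 * maf * (1 - maf)   # heterozygotes under HWE (assumed)

# Variant homozygotes only, treating all other genotypes as unexposed.
par_hom_only = attributable_risk([(freq_hom, 1.20)])
# Both non-reference genotypes, using the reported point estimates.
par_both = attributable_risk([(freq_hom, 1.20), (freq_het, 1.01)])
```

Under these assumptions the two variants of the calculation give roughly 4.6% and 5.0%, in line with the approximate figure in the text.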
The BPC3 and member cohorts have been described in detail elsewhere (80). In brief, the consortium combines resources from seven well-established cohort studies: the American Cancer Society Cancer Prevention Study II (CPS-II) (81), the Alpha-Tocopherol, Beta-Carotene Cancer Prevention (ATBC) Study (82), the European Prospective Investigation into Cancer and Nutrition Cohort (EPIC—comprised of cohorts from Denmark, Great Britain, Germany, Greece, Italy, the Netherlands, Spain, and Sweden) (83), the Health Professionals Follow-up Study (HPFS) (84), the Hawaii/Los Angeles MEC (85), the Physicians' Health Study (PHS) (16), and the Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial (PLCO) (86). These cohorts collectively include over 248 000 men with a blood sample.
The current study was restricted to individuals who self-reported as Caucasian and consists of 6012 prospective PCa cases and 6641 controls. Cases from other ethnic groups were contributed mostly from the MEC and had been reported on previously (53,61,62); we analyzed the data for Caucasians with and without the contribution from the MEC (457 cases and 452 controls) to assess the impact on the overall results. Prospective PCa cases were identified through population-based cancer registries or self-reports confirmed by medical records, including pathology reports. The BPC3 data for PCa consist of a series of matched nested case–control studies within each cohort; controls were matched to cases on a number of potential confounding factors, including age, country and region of recruitment. For the current analysis, PCa cases were matched to available controls by age in 5-year intervals, cohort and country of residence for EPIC. Written informed consent was obtained from all subjects, and each study was approved by the Institutional Review Board at its respective institution.
Pre-diagnostic measurements of IGF-I and IGFBP-3 were available for six of the seven BPC3 cohort members (ATBC, EPIC, HPFS, MEC, PHS and PLCO; IGF-I: N = 6076; IGFBP-3: N = 6059) (16,19–23,53). Most blood samples in the CPS-II were collected post-diagnosis and therefore were only included in the genotyping analyses. Samples from three of the studies (ATBC, HPFS and PHS) were measured in the Pollak laboratory and the remaining three studies (EPIC, MEC and PLCO) were measured in the laboratory of the Hormones and Cancer Team at IARC; all used enzyme-linked immunosorbent assays (Diagnostic System Laboratories, Webster, TX, USA).
As previously described by Cheng et al. (61,62), a multi-stage approach was used to characterize genetic variation across IGF1, IGFBP1 and IGFBP3 loci. Most of the exons across the three genes were resequenced in 95 advanced PCa cases, and a multiethnic panel of 349 controls was utilized to determine the patterns of LD encompassing ~20 kilobases (kb) upstream and ~10 kb downstream of each gene. Haplotype-tagging SNPs (htSNPs) for each haplotype block, determined by the confidence interval method of Gabriel et al. (87,88), were chosen based on $R_h^2$, a measure of the correlation between observed and predicted haplotypes based on the htSNP genotypes (89), to select a minimum set of SNPs that would achieve a sufficiently high $R_h^2$ for all common haplotypes with an estimated frequency of ≥5% among Caucasians.
For genetic characterization of IGF1 (chromosome 12q22-q23), 154 SNPs were evaluated over a 156-kb region (one SNP for every 2.4 kb) in a multiethnic panel of subjects with no history of cancer (61). After removing markers that were monomorphic or had poor genotyping results, a panel of 64 SNPs remained, from which 14 htSNPs were selected to predict the common haplotypes among Caucasians. Of the 14 htSNPs, 11 SNPs are available in HapMap Phase II CEPH samples and capture 60% of the common genetic variation (MAF > 5%) of IGF1 with an r2 > 0.70.
For genetic characterization of IGFBP1 and IGFBP3 (which are located contiguously in a 35-kb region on chromosome 7p13-p12), 56 SNPs over a 71-kb region were evaluated (one SNP for every 2 kb) in the multiethnic panel (61,62). Twenty markers were removed due to being monomorphic or having poor genotyping results. The final selection included 12 htSNPs to predict the common haplotypes among Caucasians, 2 genic SNPs in IGFBP3 not part of a haplotype block (rs6670, rs2453839) and 2 additional IGFBP3 SNPs (rs2132570, rs2960436). Ten of these 16 SNPs are available in HapMap Phase II CEPH samples and capture 41% of the common genetic variation (MAF > 5%) of IGFBP1 and IGFBP3 with an r2 > 0.70.
Genotyping was conducted by five laboratories (University of Southern California, Los Angeles, CA, USA; University of Hawaii, Honolulu, HI, USA; Harvard School of Public Health, Boston, MA, USA; Core Genotyping Facility, National Cancer Institute, Bethesda, MD, USA; and Cambridge University, Cambridge, UK) using a fluorescent 5′ endonuclease assay and the ABI-PRISM 7900 for sequence detection (TaqMan; Applied Biosystems, Inc.). Assay information is available at the MEC Genetics Web site (http://uscnorris.com/mecgenetics/CohortGCKView.aspx). For each assay, the concordance rate was 100% for 102 samples from the SNP500 Cancer project (http://snp500cancer.nci.nih.gov) (90) and inter-laboratory completion and concordance rates were greater than 99%, based on cross-laboratory assessment of 30 SNPs on 94 samples from the Coriell Biorepository (Camden, NJ, USA). The internal quality of genotype data at each genotyping center was assessed by typing 5–10% blinded samples in duplicate or greater (depending on study).
All statistical tests presented are two-sided and were conducted in SAS 9.0 (SAS Institute). Figures and multiple testing corrections were generated in the statistical program R (http://cran.r-project.org/). To account for multiple hypothesis testing, we applied the method implemented in PACT, a flexible and efficient approach that adjusts for correlation between multiple traits, genetic markers and models (91). PACT utilizes less computational time while maintaining the accuracy of permutation or simulation-based tests. As multiple models and traits can be considered, the P-value corrections were computed simultaneously for blood levels and PCa risk across all 30 IGF1 and IGFBP1/IGFBP3 markers. A test of significance was set at the 0.01 level to minimize the chance of both false-positive and -negative results (92,93). We present the uncorrected P-values (Puncorr) for association in the figures and tables. A corrected P-value (Pcorr) is presented for significant SNPs correcting for multiple comparisons across all traits, genetic markers and statistical models among the 30 IGF genetic markers analyzed.
We identified cohort- and assay batch-specific statistical outliers based on the generalized extreme studentized deviate many-outlier detection approach (94) setting alpha to 0.05 for both IGF-I and IGFBP-3 blood levels. Based on this, we excluded 23 IGF-I samples (n = 14 cases and 9 controls) and 43 IGFBP-3 samples (n = 19 cases and 24 controls). The IGF-I and IGFBP-3 blood levels were log-transformed to provide approximate normal distributions. The geometric mean and 95% CI according to haplotypes or SNPs were calculated using linear regression analysis adjusting for age at blood draw, assay batch, cohort (including country for EPIC) and case–control status. Haplotype frequencies and subject-specific expected haplotype indicators were calculated among cases and controls combined by cohort (and country within EPIC) and then combined for the haplotype analyses. A global haplotype test was performed for each haplotype block by using an F-test to compare the sum of the squared residuals for a full model (all haplotypes within a block) and a nested model (without haplotypes within a block). A multi-SNP model was utilized by including all statistically significant SNPs identified from the univariate analysis to assess independent SNP effects within a gene.
The statistical methods used have been described previously (92,95). In brief, we used conditional logistic regression to estimate ORs and 99% CI for disease associated with genetic markers (SNP or haplotype). The matching factors in the conditional logistic regression were age (in 5-year intervals), cohort and country within EPIC. We estimated the genotypic ORs for disease by using the most common genotype as the referent group for the SNP analyses. We estimated haplotype-specific ORs using an expectation-substitution approach to account for haplotype uncertainty given unphased genotype data (96,97). To test the global null hypothesis of no association between IGF genetic variation and risk of PCa, we used a likelihood ratio test (LRT) comparing a model with additive effects for each common haplotype (treating the most common haplotype as the referent) to the intercept-only model. We considered haplotypes with greater than 5% frequency in at least one cohort to be ‘common’. All other haplotypes were excluded.
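To make the 99% CIs concrete, the sketch below shows how a Wald-type interval is formed on the log-odds scale from a regression coefficient and its standard error. The coefficient is the log of the OR of 1.20 reported for rs4764695 variant homozygotes; the standard error of 0.05 is a hypothetical value chosen for illustration, not an estimate from the study, and the conditional logistic regression itself is not reproduced here.

```python
import math
from statistics import NormalDist

def wald_ci(log_or, se, level=0.99):
    """Wald confidence interval for an odds ratio.

    log_or: estimated log odds ratio (regression coefficient)
    se: standard error of the coefficient
    level: confidence level (0.99 matches the CIs reported in the text)
    """
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # about 2.576 for 99%
    return math.exp(log_or - z * se), math.exp(log_or + z * se)

# OR from the text; the standard error is hypothetical.
lower, upper = wald_ci(math.log(1.20), se=0.05)
```

Using a 99% rather than a 95% level widens the interval by a factor of about 2.576/1.960 on the log scale, which is the price of the more stringent alpha of 0.01 adopted in this study.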
When assessing genetic effects (SNPs or haplotype), we tested for heterogeneity across cohort and several potential effect modifiers. We used an LRT by including an interaction term between the genetic effect (SNP or haplotype) and variable of interest in comparison to the null model. The tests for heterogeneity by cohort were not statistically significant (P > 0.01), and therefore we present the pooled results. We assessed for the presence of effect modification by family history (at least one first-degree relative or two second-degree relatives diagnosed with PCa), age at diagnosis (<65, ≥65), BMI at baseline (<25, 25–<30, >30) and height (tertiles and quartiles, cohort-specific cut-points among controls were used). Finally, we evaluated the association of gene variants with advanced stage PCa (stage C or D) and high-grade PCa (Gleason score ≥8 or poorly differentiated).
This work was funded by National Cancer Institute (NCI) grants U01 CA098216 (EPIC), U01 CA098233 (Harvard), U01 CA098758 (MEC) and U01 CA098710 (ACS). F.R.S. was supported by the National Research Service Award Training Program in Cancer Epidemiology (T32 CA009001-32). The Physicians' Health Study was supported by the NCI (CA-34944, CA-40360 and CA-097193) and the NHLBI (HL-26490 and HL-34595). The Multiethnic Cohort was supported by CA54281 and CA63464. The ATBC Study was supported by US Public Health Service contracts N01-CN-45165, N01-RC-45035 and N01-RC-37004 from the Department of Health and Human Services, National Cancer Institute.
We thank Hardeep Ranu and Patrice Soule of the Dana-Farber/Harvard Cancer Center High Throughput Polymorphism Detection Core.
Conflict of Interest statement. None declared.