SNPs at 11q13 were recently implicated in prostate cancer risk by two genome-wide association studies and were consistently replicated in multiple study populations. To explore prostate cancer association in the regions flanking these SNPs, we genotyped 31 tagging SNPs in a ~110 kb region at 11q13 in a Swedish case-control study (CAPS), including 2,899 cases and 1,722 controls. We found evidence of prostate cancer association for the previously implicated SNPs including rs10896449, which we termed locus 1. In addition, multiple SNPs on the centromeric side of the region, including rs12418451, were also significantly associated with prostate cancer risk (termed locus 2). The two groups of SNPs were separated by a recombination hotspot. We then evaluated these two representative SNPs in an additional ~4,000 cases and ~3,000 controls from three study populations and confirmed both loci at 11q13. In the combined allelic test of all four populations, P = 4.0 × 10−11 for rs10896449 at locus 1, and P = 1.2 × 10−6 for rs12418451 at locus 2, and both remained significant after adjusting for the other locus and study population. The prostate cancer association at these two 11q13 loci is unlikely to be confounded by PSA detection bias because neither SNP was associated with PSA levels in controls. Unlike locus 1, where no known gene is located, locus 2 has several putative mRNAs in close proximity. Additional confirmation studies at locus 2 and functional studies for both loci are needed to advance our knowledge of the etiology of prostate cancer.
A prostate cancer risk associated locus at 11q13 was identified from two genome-wide association studies (GWAS) (1–2). In a National Cancer Institute Cancer Genetic Markers of Susceptibility (CGEMS) GWAS, a SNP rs10896449 at 11q13 (68,751,243) was found to be significantly associated with prostate cancer risk (P = 1.8 × 10−9) among a total of 4,053 cases and 5,121 controls from five independent study populations (1). Similarly, among a total of 5,122 cases and 5,260 controls from three independent study populations in a GWAS from the UK and Australia, rs7931342, a SNP that is only 170 bp away from rs10896449, was significantly associated with prostate cancer risk (P = 1.7 × 10−12) (2). The association between rs7931342 and prostate cancer risk was later confirmed among 13 additional study populations (7,370 cases and 5,742 controls) in the PRACTICAL consortium, P = 3 × 10−11 (3). The prostate cancer association at 11q13 was also confirmed by our group in the CAPS, a Swedish population based case-control study (4). Both SNPs were significantly associated with prostate cancer risk; P = 0.002 for rs10896449 and P = 0.004 for rs7931342. SNPs rs10896449 and rs7931342 were in strong linkage disequilibrium (LD) among control subjects, with a pair-wise r2 of 0.95. To date, evidence for association between prostate cancer risk and variants at 11q13 has been consistently found in all of the published studies, although there are no known genes within a 100 kb genomic region flanking these two SNPs.
To assess whether there are additional independent prostate cancer risk variants in the flanking regions, we performed a fine mapping analysis of the 11q13 genomic region. This strategy was motivated by empirical examples of prostate cancer associations at 8q24 (5–10) and 17q12 (11–12) where additional independent loci were subsequently discovered in the flanking regions at both loci. We describe results of the fine mapping study in our Swedish study population and confirmation efforts in three additional study populations.
Four study populations were included in this study. The first was a population-based prostate cancer case-control study in Sweden named CAncer of the Prostate in Sweden (CAPS) that was used for the fine mapping study (13) (Supplementary Table 1). Prostate cancer patients in CAPS were identified and recruited from regional cancer registries in Sweden. The inclusion criterion for case subjects was pathological or cytological verified adenocarcinoma of the prostate, diagnosed between July, 2001 and October, 2003. DNA samples from blood and TNM stage, Gleason grade (biopsy), and PSA levels at diagnosis were available for 2,899 patients. Patients who met any of the following criteria were considered as having more aggressive disease: clinical stage T3/T4, N+, M+, Differential Grade III, Gleason Score ≥ 8, or pre-operative serum PSA ≥ 50 ng/mL. Control subjects were recruited concurrently with case subjects. They were randomly selected from the Swedish Population Registry, and matched according to the expected age distribution of cases (groups of five-year intervals) and geographical region. DNA samples from blood were available for 1,722 control subjects.
The second study population was a hospital-based case-control population at The Johns Hopkins Hospital (JHH) (10) (Supplementary Table 2). Prostate cancer cases were 1,527 men of European descent (by self report) who underwent radical prostatectomy for treatment of prostate cancer at JHH from January 1, 1999, through December 31, 2006. Each tumor was graded using the Gleason scoring system (14) and staged using the TNM (tumor–node–metastasis) system (15). Patients who met any of the following criteria were considered as having more aggressive disease: pathologic Gleason score of 7 or higher, stage pT3 or higher, N+, or M1. Men undergoing screening for prostate cancer at JHH and in the Baltimore metropolitan area during the same time period were asked to participate as control subjects. A total of 482 men of European descent (by self report) met our inclusion criteria as control subjects for this study: normal DRE, PSA levels ≤ 4.0 ng/mL, and age older than 55 years.
The third study population was selected from the American Cancer Society Cancer Prevention Study-II (CPS-II) Nutrition Cohort, a prospective study of cancer incidence (16). Approximately 184,000 US adults between the ages of 50 and 74 were enrolled in 1992 and sent follow-up questionnaires in 1997 and every two years afterwards. We identified 1,414 Caucasian men who had been diagnosed with prostate cancer between 1992 and 2003 and had no previous history of cancer. Cancer status was verified through medical records, linkage with state cancer registries, or death certificates. An equal number of controls were matched to the cases on age (±6 months), race and date of blood collection (±6 months) from men who were cancer-free at the time of cancer diagnosis of their matched case using risk set sampling.
The last study population was 1,172 prostate cancer case subjects and 1,157 control subjects who were selected from the Prostate, Lung, Colon and Ovarian (PLCO) Cancer Screening Trial. This was the study population used for the first stage CGEMS prostate cancer GWAS (1). Individual genotype data were obtained through an approved data request application. For both CPS-II and PLCO, patients with Gleason Score ≥ 7 or Stage ≥ III were considered as aggressive prostate cancer cases.
We identified a ~110 kb region of interest for the fine mapping study (68,665,000–68,775,000, Build 35) based on the CGEMS GWAS results at 11q13 where multiple SNPs in the region had P < 0.05 and two recombination hotspots at the boundaries of the region. A total of 31 tagging SNPs were identified to capture (r2 > 0.8) all the SNPs with minor allele frequencies of 1% or higher in the region of interest based on the HapMap Phase II data. These SNPs were selected for genotyping in CAPS. Polymerase chain reaction (PCR) and extension primers for these SNPs were designed using MassARRAY Assay Design 3.0 software. PCR and extension reactions were performed according to the manufacturer’s instructions, and extension product sizes were determined by mass spectrometry using the Sequenom iPLEX system. Duplicate and water samples, to which the technician was blinded, were included in each 96-well plate as PCR negative controls. The average genotype call rate for these SNPs was > 98% and the average concordance rate was 99.7% among 100 duplicate quality control samples.
A Hardy-Weinberg equilibrium test was performed using the Fisher’s exact test. Haplotype blocks were estimated using the Haploview (17) computer program, and a default Gabriel method (18) was used to define each haplotype block; i.e. a region in which all (or nearly all) pairs of markers are in “strong LD”, which is consistent with no historical recombination. SequenceLDhot was used to determine recombination hotspots (19). SequenceLDhot considers a grid of putative hotspot positions, and for each putative hotspot calculates a Likelihood Ratio (LR) statistic for the presence of a hotspot. Haplotype and background recombination rates generated from PHASE (version 2.1) were used as input files. We assumed the putative hotspots have a width of 2 kb and the program searches for a new hotspot every 1 kb.
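The Hardy-Weinberg check described above can be illustrated with a minimal sketch of an exact test. This is an assumption about the general form of such a test, not the authors' actual code: it enumerates every heterozygote count compatible with the observed allele counts and sums the probabilities of configurations that are no more likely than the observed one. The function name and the example genotype counts are hypothetical.

```python
import math

def hwe_exact_p(n_aa, n_ab, n_bb):
    """Exact Hardy-Weinberg p-value from genotype counts (AA, AB, BB).

    Conditions on the observed allele counts and sums the probabilities
    of all heterozygote counts that are as likely or less likely than
    the observed one.
    """
    n = n_aa + n_ab + n_bb            # number of individuals
    n_a = 2 * n_aa + n_ab             # copies of allele A
    n_b = 2 * n_bb + n_ab             # copies of allele B

    def log_prob(het):
        # log P(het heterozygotes | n, n_a) under Hardy-Weinberg equilibrium
        hom_a = (n_a - het) // 2
        hom_b = (n_b - het) // 2
        return (het * math.log(2)
                + math.lgamma(n + 1)
                - math.lgamma(hom_a + 1)
                - math.lgamma(het + 1)
                - math.lgamma(hom_b + 1)
                + math.lgamma(n_a + 1)
                + math.lgamma(n_b + 1)
                - math.lgamma(2 * n + 1))

    # Heterozygote counts must share the parity of the allele counts.
    hets = range(n_a % 2, min(n_a, n_b) + 1, 2)
    probs = {h: math.exp(log_prob(h)) for h in hets}
    p_obs = probs[n_ab]
    # Small relative tolerance guards against floating-point ties.
    total = sum(p for p in probs.values() if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)

# Hypothetical genotype counts, for illustration only.
p_balanced = hwe_exact_p(25, 50, 25)   # counts close to HWE proportions
p_skewed = hwe_exact_p(50, 0, 50)      # no heterozygotes at all
```

A SNP would be flagged for review when the p-value in control subjects falls below the chosen cutoff (0.05 in this study).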
We imputed all of the known SNPs in the genome based on the genotyped SNPs and haplotype information in the HapMap Phase II data (CEU) using a computer program, IMPUTE (20). A posterior probability of 0.9 was used as a threshold to call genotypes.
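As a rough illustration of the genotype-calling rule above, the sketch below applies a posterior-probability threshold of 0.9 to imputed genotype probabilities and computes the resulting call rate. The function names are hypothetical, and whether the original analysis treated the threshold as inclusive is an assumption.

```python
def call_genotype(posteriors, threshold=0.9):
    """Return the most probable genotype index (0, 1, or 2) if its
    posterior probability meets the threshold, otherwise None.

    posteriors: (P(AA), P(AB), P(BB)) for one individual at one SNP.
    Treating the threshold as inclusive (>=) is an assumption.
    """
    best = max(range(3), key=lambda g: posteriors[g])
    return best if posteriors[best] >= threshold else None

def call_rate(calls):
    """Fraction of individuals with a called (non-missing) genotype."""
    return sum(c is not None for c in calls) / len(calls)

# Hypothetical posterior probabilities, for illustration only.
example = [(0.95, 0.04, 0.01), (0.50, 0.45, 0.05), (0.02, 0.03, 0.95)]
calls = [call_genotype(p) for p in example]
```

Genotypes that fail the threshold are left missing, which is why each imputed SNP has its own call rate.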
Allele frequency differences between case patients and control subjects were tested for each SNP, using a chi-square test with 1 degree of freedom. The allelic odds ratio (OR) and 95% confidence interval (95% CI) were estimated based on a multiplicative model. Results from multiple case-control populations were combined using a Mantel-Haenszel model in which the populations were allowed to have different population frequencies for alleles but were assumed to have a common OR. The homogeneity of ORs among different study populations was tested using a Breslow-Day chi-square test and I2 method statistics by Higgins and Thompson (21). Independence of prostate cancer associations with SNPs at two loci at 11q13 was tested by including both SNPs (assuming an additive model at each SNP) in a logistic regression model among four populations and adjusted for study population and age. Multiplicative interactions between two SNPs were tested by including both SNPs (assuming a general model) and an interaction term (product of two main effects) in a logistic regression model.
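The allelic test and odds ratio described above can be sketched as follows. This is a minimal illustration, not the authors' code: it uses an uncorrected Pearson chi-square on a 2 × 2 table of allele counts, and the Woolf (log-based) standard error for the 95% CI is an assumption, since the text does not state which interval method was used. The allele counts in the example are hypothetical.

```python
import math

def allelic_test(case_a, case_b, ctrl_a, ctrl_b):
    """1-df chi-square test and allelic odds ratio from allele counts.

    case_a, case_b: counts of the risk allele and the other allele in cases
    ctrl_a, ctrl_b: the same counts in controls
    Returns (chi2, p_value, odds_ratio, ci_low, ci_high).
    """
    n = case_a + case_b + ctrl_a + ctrl_b
    # Pearson chi-square for a 2x2 table, no continuity correction.
    num = n * (case_a * ctrl_b - case_b * ctrl_a) ** 2
    den = ((case_a + case_b) * (ctrl_a + ctrl_b)
           * (case_a + ctrl_a) * (case_b + ctrl_b))
    chi2 = num / den
    # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    odds_ratio = (case_a * ctrl_b) / (case_b * ctrl_a)
    # Woolf standard error of log(OR); an assumed choice of CI method.
    se = math.sqrt(1 / case_a + 1 / case_b + 1 / ctrl_a + 1 / ctrl_b)
    ci_low = math.exp(math.log(odds_ratio) - 1.96 * se)
    ci_high = math.exp(math.log(odds_ratio) + 1.96 * se)
    return chi2, p_value, odds_ratio, ci_low, ci_high

# Hypothetical allele counts, for illustration only.
chi2, p_value, odds_ratio, ci_low, ci_high = allelic_test(600, 400, 500, 500)
```

Because each individual contributes two alleles, the table totals are twice the number of cases and controls.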
We tested the association of rs7931342 and rs10896449 with PSA levels in controls assuming a 2-df general model and adjusting for age using a multiple regression analysis. PSA levels were logarithm-transformed to best approximate the assumption of normality.
We genotyped 31 tagging SNPs in a ~110 kb region of interest at 11q13 in the CAPS study population. The genotype distributions for all 31 SNPs were consistent with Hardy-Weinberg expectations in control subjects (P > 0.05). We also imputed 53 SNPs (call rate > 90%) in the region based on the genotyped SNPs using the computer program IMPUTE (20). Allele frequency differences between cases and controls in CAPS were tested for these 84 SNPs using a chi-square test (Supplementary Table 3). Multiple SNPs in this 110 kb region were significantly associated with prostate cancer risk in allelic tests (Fig 1a, blue diamond). Specifically, many SNPs within a 37 kb region (68,731,000-68,768,000) flanking rs10896449 (68,751,243) and rs7931342 (68,751,073) were highly significant, and had P-values similar to that of these two SNPs. However, none of these SNPs was significant after adjusting for rs10896449, suggesting they are dependent (Fig 1a, pink diamond). These SNPs can be grouped into locus 1 and were in two consecutive haplotype blocks (Fig 1b).
Several SNPs (68,722,000-68,731) that were immediately centromeric to these haplotype blocks were not significantly associated with prostate cancer risk. Nevertheless, multiple SNPs further centromeric were found to be associated with prostate cancer risk. More importantly, most of these SNPs remained significant after adjusting for rs10896449 (Fig 1a, pink diamond), suggesting they are independent from rs10896449 at locus 1. They spanned four consecutive haplotype blocks and can be grouped into locus 2. We estimated the recombination rate across the region among control subjects using SequenceLDhot software (19) and found strong evidence for a recombination hotspot between the two loci at 68,720,000-68,730,000 (P = 1.24 × 10−15) (Fig 1c). The recombination hotspot is also reported in the HapMap data (68,720,001–68,728,001, Release 21, Phase I & II). This recombination hotspot separates these two prostate cancer loci at 11q13. Across the entire 110 kb region of interest at 11q13, rs12418451 (68,691,995) at locus 2 was the most significant SNP (P = 8.57 × 10−5, not adjusted for rs10896449). Unlike locus 1, where no known gene is located, there are two known mRNAs in locus 2 (AL137479 and BC043531) (Fig 1d).
As a confirmation effort, we examined these two candidate prostate cancer loci at 11q13 in three additional study populations, including Johns Hopkins Hospital (JHH) study, American Cancer Society Cancer Prevention Study-II (CPS-II) Nutrition Cohort, and the Prostate, Lung, Colon and Ovarian (PLCO) Cancer Screening Trial. One representative SNP in each locus (rs10896449 at locus 1 and rs12418451 at locus 2) was evaluated. Consistent with previous publications (1–4), a significant association was found for SNP rs10896449 at locus 1 in each population (Table 1). The overall P of the allelic test in these four populations was 1.57 × 10−11. Similarly, allele ‘A’ of rs12418451 at locus 2 was also consistently more common in cases than in controls in each population, and was statistically significant in these three independent populations (P = 0.002, not adjusted for rs10896449). Together with CAPS, the overall P of the allelic test in these four populations was 1.2 × 10−6 (not adjusted for rs10896449). For both SNPs, there was no evidence for heterogeneity in allelic associations among these study populations using a Breslow-Day test for homogeneity and I2.
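Combining allelic results across populations with a common odds ratio, as in the overall tests reported above, can be sketched with a Mantel-Haenszel estimator. The code below is an illustrative assumption about the general approach rather than the software actually used: each population contributes its own 2 × 2 allele-count table, so allele frequencies may differ by population while a single pooled OR is estimated. The chi-square shown is the Cochran-Mantel-Haenszel statistic without continuity correction, and all counts are hypothetical.

```python
import math

def mantel_haenszel(tables):
    """Pooled odds ratio and 1-df test across stratified 2x2 tables.

    tables: list of (case_a, case_b, ctrl_a, ctrl_b) allele counts,
            one tuple per study population.
    Returns (pooled_or, chi2, p_value).
    """
    or_num = or_den = 0.0
    diff = var = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        # Mantel-Haenszel weights for the pooled odds ratio.
        or_num += a * d / n
        or_den += b * c / n
        # Observed minus expected risk-allele count in cases, and its
        # hypergeometric variance, under no association in this stratum.
        diff += a - (a + b) * (a + c) / n
        var += (a + b) * (c + d) * (a + c) * (b + d) / (n * n * (n - 1))
    pooled_or = or_num / or_den
    chi2 = diff * diff / var
    p_value = math.erfc(math.sqrt(chi2 / 2))  # chi-square upper tail, 1 df
    return pooled_or, chi2, p_value

# Two hypothetical populations, for illustration only.
pooled_or, chi2, p_value = mantel_haenszel(
    [(600, 400, 500, 500), (300, 200, 250, 250)]
)
```

The pooled estimate is only meaningful when the per-population ORs are homogeneous, which is what the Breslow-Day test and I2 statistic are used to check.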
These two SNPs were in moderate LD in each study population (P < 0.05), with pair-wise r2 values of 0.10, 0.16, 0.12, and 0.13 in CAPS, JHH, CPS-II, and PLCO, respectively. However, when the independence of these two SNPs with prostate cancer risk was tested in all four populations by including both SNPs in a logistic regression model adjusted for study population and age, both SNPs remained significant: P = 0.0001 for rs10896449 at locus 1 and P = 0.01 for rs12418451 at locus 2. No significant multiplicative interaction between the two SNPs was found (P = 0.9).
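For readers unfamiliar with the pair-wise r2 measure quoted above, the sketch below computes it from two-SNP haplotype counts. It assumes phased haplotype counts are already available; with unphased genotype data, haplotype frequencies would first have to be estimated, a step not shown here. The function name and the counts are hypothetical.

```python
def ld_r_squared(n11, n10, n01, n00):
    """Pair-wise LD r^2 from two-SNP haplotype counts.

    n11: haplotypes carrying allele 1 at both SNPs
    n10: allele 1 at the first SNP, allele 0 at the second
    n01: allele 0 at the first SNP, allele 1 at the second
    n00: allele 0 at both SNPs
    """
    total = n11 + n10 + n01 + n00
    p1 = (n11 + n10) / total          # allele 1 frequency at first SNP
    q1 = (n11 + n01) / total          # allele 1 frequency at second SNP
    d = n11 / total - p1 * q1         # LD coefficient D
    return d * d / (p1 * (1 - p1) * q1 * (1 - q1))

# Hypothetical haplotype counts, for illustration only.
r2_example = ld_r_squared(40, 10, 10, 40)
```

Values near 0 indicate that one SNP carries little information about the other, while a value of 1 means the two SNPs are perfect proxies.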
The frequencies of risk alleles at these two SNPs were not significantly different between aggressive cases and non-aggressive cases in CAPS, JHH, CPS-II, or PLCO (Supplementary Table 4). These two SNPs were not significantly associated with plasma PSA levels among control subjects; using an additive model, P = 0.11 and 0.46 in CAPS and JHH, respectively, for rs10896449 at locus 1, and P = 0.24 and 0.34 in CAPS and JHH, respectively, for rs12418451 at locus 2. PSA levels in control subjects were not available to us for CPS-II and PLCO.
Using a fine mapping approach to study a newly discovered prostate cancer locus at 11q13, we found a second independent locus in the flanking region of the first 11q13 locus in our study populations and in the publicly available CGEMS (PLCO) study. Among a total of four study populations examined in this study, the overall P of the allelic test for this novel locus (rs12418451) was 1.2 × 10−6. Although the nominal P value was highly significant, it did not reach a genome-wide significance level of 2 × 10−8 that accounts for multiple tests of ~2 million SNPs in the genome. Evaluation of this novel locus in additional study populations is needed. This study further demonstrates the importance of sharing and utilizing all available data to identify risk factors associated with complex diseases.
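The comparison with a genome-wide significance level can be made concrete with a short calculation. The sketch assumes a Bonferroni correction at a family-wise error rate of 0.05, which the text does not state explicitly; over ~2 million tests this gives 2.5 × 10−8, the same order of magnitude as the 2 × 10−8 level quoted above.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level under a Bonferroni correction."""
    return alpha / n_tests

# alpha = 0.05 is an assumed family-wise error rate.
threshold = bonferroni_threshold(0.05, 2_000_000)

# Overall allelic P values reported in the text for the two loci.
locus1_reaches_gws = 4.0e-11 < threshold   # rs10896449 (abstract value)
locus2_reaches_gws = 1.2e-6 < threshold    # rs12418451
```

Under this threshold the locus 1 result clears the genome-wide level while the locus 2 result does not, which matches the caution expressed above about the novel locus.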
If this second locus at 11q13 is confirmed in additional study populations, it would be the third example, following 8q24 (5–10) and 17q12 (11–12), where additional independent loci are subsequently discovered in the flanking regions of an initial locus identified from GWAS. The molecular mechanism for this phenomenon is unknown; however, it suggests fine mapping studies in broad regions surrounding SNPs implicated from GWAS are needed for other prostate cancer risk associated regions reported in recent GWAS. Considering that the novel loci at 8q24, 17q12, and 11q13 are in different haplotype blocks from their respective initial locus, fine mapping studies should extend across several haplotype blocks in the flanking region.
The frequencies of risk alleles at these two loci at 11q13 were not significantly different between aggressive cases and non-aggressive cases. This null result with aggressiveness of prostate cancer is similar to the other prostate cancer risk variants recently identified from GWAS, including at 8q24, 17q12, 17q24, 3p12, 7q21, 10q11, and Xp11 (3,22). Lack of significant differences in risk allele frequencies between aggressive and non-aggressive cases is not unexpected considering that these prostate cancer risk variants were identified by comparing all types of prostate cancer cases with controls. Other study designs, including those comparing aggressive with non-aggressive prostate cancer may be more appropriate to discover risk variants for aggressive prostate cancer.
PSA detection bias was recently suggested as a potential confounder in genetic association studies of prostate cancer (23). A SNP at 19q13 (rs2735839) near the KLK3 gene (also known as the PSA gene) was found to be associated with prostate cancer risk in a GWAS (2). However, this SNP was also found to be significantly associated with PSA levels in controls (2,22,23). The PSA association, when combined with widely used PSA screening in some populations, may cause the alleles associated with higher PSA levels to be over-represented in cases and under-represented in controls. An artifactual association with prostate cancer may therefore occur for these SNPs regardless of whether the alleles are truly associated with prostate cancer risk per se. We examined the association of the two SNPs at 11q13 with PSA levels in controls and did not find statistical evidence for the association (Supplementary Table 5); therefore, the prostate cancer associations at 11q13 reported in this study are unlikely to be confounded by PSA detection bias.
Similar to 8q24 and other newly discovered prostate cancer risk associated loci, no obvious candidate genes are located within locus 1 at 11q13. However, there are several putative mRNAs that originate from regions in close proximity to locus 2: rs12418451 is within an intron of a spliced EST, BC043531, cloned from a human brain cDNA library, and it is within 6 kb of the 3′ end of AL137479, a variant transcript of TPCN2, a gene with coding SNPs recently associated with pigmentation variation in Europeans (24). The roles of these genes in the development of prostate cancer remain to be evaluated. Nevertheless, the discovery of two novel prostate cancer risk associated variants at 11q13 further demonstrates the advantage of systematic and objective evaluation of data from GWAS and fine mapping studies, and the potential to uncover new mechanisms for the etiology of prostate cancer.
The finding of additional prostate cancer risk variants in this study provides additional support for the polygenic nature of the disease. As more prostate cancer risk variants are identified, we may gradually improve our ability to predict individual prostate cancer risk using genetic markers (25–26).
The authors thank all the study subjects who participated in the CAPS study and urologists who included their patients in the CAPS study. We acknowledge the contribution of multiple physicians and researchers in designing and recruiting study subjects, including Dr. Hans-Olov Adami (for CAPS) and Drs. Bruce J. Trock, Alan W. Partin, and Patrick C. Walsh (for JHH).
The authors also thank the National Cancer Institute Cancer Genetic Markers of Susceptibility Initiative (CGEMS) for making the data publicly available.
The work was supported by National Cancer Institute [grant numbers CA129684, CA105055, CA106523, CA95052 to J.X., CA112517, CA58236 to W.B.I.]; Department of Defense [grant number PC051264 to J.X.]; Swedish Cancer Society (Cancerfonden) to H.G.; and Swedish Academy of Sciences (Vetenskapsrådet) to H.G. The support of Kevin P. Jaffe to W.B.I. is gratefully acknowledged.