Twin studies suggest a heritable component to circulating sex steroid hormones and sex hormone-binding globulin (SHBG). In the NCI-Breast and Prostate Cancer Cohort Consortium, 874 SNPs in 37 candidate genes in the sex steroid hormone pathway were examined in relation to circulating levels of SHBG (N = 4720), testosterone (N = 4678), 3α-androstanediol-glucuronide (N = 4767) and 17β-estradiol (N = 2014) in Caucasian men. rs1799941 in SHBG is highly significantly associated with circulating levels of SHBG (P = 4.52 × 10−21), consistent with previous studies, and testosterone (P = 7.54 × 10−15), with mean difference of 26.9 and 14.3%, respectively, comparing wild-type to homozygous variant carriers. Further noteworthy novel findings were observed between SNPs in ESR1 with testosterone levels (rs722208, mean difference = 8.8%, P = 7.37 × 10−6) and SRD5A2 with 3α-androstanediol-glucuronide (rs2208532, mean difference = 11.8%, P = 1.82 × 10−6). Genetic variation in genes in the sex steroid hormone pathway is associated with differences in circulating SHBG and sex steroid hormones.
Twin studies suggest a heritable component to circulating sex steroid hormones and sex hormone-binding globulin (SHBG). The sex steroids and SHBG have critical roles in human development, behavior (1) and potentially, prostate carcinogenesis (2). With the completion of the Human Genome (3) and the HapMap Projects (4) providing the genomic basis and the assembly of a large epidemiologic research resource in the NCI-Breast and Prostate Cancer Cohort Consortium (BPC3) (5), we were able to identify quantitative trait loci predicting sex steroid levels in Caucasian men, by pooling data from five independent cohorts in the BPC3.
The NCI-BPC3 is a large multicenter association study (5) that systematically explores the role of genetic variation in genes encoding for the sex steroid hormones in the etiology of breast and prostate cancer. It includes the American Cancer Society Cancer Prevention Study II (CPS-II), the Alpha-Tocopherol, Beta-Carotene Cancer Prevention (ATBC) Study, the European Prospective Investigation into Cancer and Nutrition (EPIC), the Health Professionals Follow-Up Study (HPFS), the Hawaii-Los Angeles Multi-ethnic Cohort (MEC) Study, the Physicians Health Study (PHS) and the Prostate, Lung, Colorectal and Ovarian (PLCO) Cancer Screening Trial. For 37 genes in the sex steroid hormone pathway (Fig. 1; Supplementary Material, Table S1 for the list of genes), we genotyped 874 SNPs and related them to sex steroid hormone levels, and imputed an additional 3191 SNPs for supplemental analyses. The relationships between SNPs and sex steroids (SHBG [nmol/l, N = 4720], testosterone [ng/ml, N = 4678], 3α-androstanediol-glucuronide [ng/ml, N = 4767] and 17β-estradiol [pg/ml, N = 2014]) were assessed using linear regression, after log transformation of hormone values, adjusting for body mass index (BMI), age, prostate cancer case–control status and cohort. Age and BMI were associated with SHBG, testosterone and 3α-androstanediol-glucuronide levels, consistent with previous studies (6,7), although case–control status was unrelated to these steroid levels (Fig. 2; inter-quartile ranges of sex steroids according to each covariate). There was some heterogeneity in the measured sex steroid levels between cohorts, because the steroid hormones for the cohorts were assayed in different laboratories at different points in time (Supplementary Material, Table S2; hormone assay methods).
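The adjusted regression described above can be illustrated with a minimal sketch. It is not the authors' analysis (which used SAS and PLINK): the data are simulated, the genotype is assumed to be coded as 0, 1 or 2 variant alleles (a log-additive model), only two hypothetical cohorts are represented, and the effect sizes are arbitrary. The helper names `solve_linear_system` and `fit_ols` are introduced here for illustration only.

```python
import math
import random


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(design, y):
    """Least-squares coefficients from the normal equations X'X b = X'y."""
    p = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Simulated (not real) data: 600 men from two hypothetical cohorts.
rng = random.Random(0)
TRUE_BETA = 0.12  # assumed per-allele effect on the natural-log scale
design, log_hormone = [], []
for _ in range(600):
    genotype = rng.choice([0, 1, 2])   # number of variant alleles
    age = rng.uniform(50, 75)
    bmi = rng.uniform(20, 35)
    is_case = rng.choice([0, 1])       # prostate cancer case-control status
    cohort_b = rng.choice([0, 1])      # indicator for the second cohort
    hormone = math.exp(3.5 + TRUE_BETA * genotype - 0.004 * age
                       - 0.02 * bmi + 0.1 * cohort_b + rng.gauss(0, 0.1))
    design.append([1.0, genotype, age, bmi, is_case, cohort_b])
    log_hormone.append(math.log(hormone))  # log transformation of hormone values

coefficients = fit_ols(design, log_hormone)
beta_snp = coefficients[1]
# Back-transform: percent difference between the two homozygote groups.
pct_diff_homozygotes = (math.exp(2 * beta_snp) - 1) * 100
print(f"per-allele beta = {beta_snp:.3f}, "
      f"homozygote difference = {pct_diff_homozygotes:.1f}%")
```

Because the outcome is log-transformed, the SNP coefficient is a difference in log geometric means, so exponentiating twice the per-allele coefficient gives the ratio between variant and wild-type homozygotes under the assumed additive coding.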
Several SNPs in SHBG on chromosome 17p13 were associated with circulating SHBG (Fig. 3A), most strongly marked by genotyped SNP rs1799941 (Fig. 4, blue circles), located 67 base pairs upstream from the transcription start site of SHBG and conferring a 26.9% mean difference in blood levels between variant and wild-type homozygotes (P = 4.52 × 10−21; geometric means for wild-type, heterozygote, homozygote = 42.63, 49.01 and 54.10 nmol/l, respectively; Table 1). The relationship of rs1799941 with circulating SHBG was consistent across strata of five independent cohorts (Table 1), age, BMI and case–control status (Supplementary Material, Table S3; all P for heterogeneity >0.05). Although eight other genotyped SNPs in SHBG were also strongly associated with circulating SHBG (Fig. 4; Supplementary Material, Table S4; P < 10−5), when these eight SNPs [which were not strongly correlated with rs1799941 (r2 < 0.7)] were simultaneously included in the multivariable model with rs1799941, three SNPs remained significantly associated with circulating SHBG: rs1799941 (P = 3.66 × 10−7), rs6257 (P = 0.005) and rs9913778 (P = 7.98 × 10−9). In a log-additive genetic model, rs1799941 and rs9913778 explain 2.0 and 1.0% of the variance in log-transformed circulating SHBG, respectively.
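As a worked check, the percent difference between homozygote groups can be recomputed from the geometric means quoted above. The short sketch below uses only those two reported values; the function name `percent_difference` is introduced here for illustration.

```python
import math


def percent_difference(reference_gm, comparison_gm):
    """Percent difference of a comparison geometric mean relative to a reference."""
    return (comparison_gm / reference_gm - 1.0) * 100.0


# Geometric means of SHBG (nmol/l) reported for rs1799941:
# wild-type homozygotes = 42.63, variant homozygotes = 54.10.
shbg_pct = percent_difference(42.63, 54.10)

# The same quantity expressed on the natural-log scale, which is the scale
# on which a regression of log-transformed hormone values operates.
shbg_log_diff = math.log(54.10) - math.log(42.63)

print(f"{shbg_pct:.1f}% difference; log-scale difference = {shbg_log_diff:.3f}")
```

The ratio of the two geometric means is about 1.269, that is, roughly a 26.9% higher SHBG level in variant homozygotes, in agreement with the figure given in the abstract.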
SHBG is a glycoprotein that regulates steroid bioavailability by binding with highest affinity to dihydrotestosterone (DHT), followed by testosterone and then estradiol (2). Interestingly, circulating SHBG was also predicted by rs10754396 (P = 7.12 × 10−6; Fig. 3A), a SNP located in the 3′ untranslated region of the HSD3B2 gene, which encodes for hydroxy-delta-5-steroid dehydrogenase, an enzyme irreversibly inactivating DHT. When rs10754396 (HSD3B2) and rs1799941 (SHBG) were simultaneously included in the multivariable model, both SNPs remained independently associated with circulating SHBG concentrations (P = 4.54 × 10−6 for rs10754396; P = 1.08 × 10−20 for rs1799941) and there was no evidence of interaction (Pinteraction = 0.45).
In addition to its association with circulating SHBG, rs1799941 in SHBG was also strongly associated with circulating levels of testosterone [P = 7.54 × 10−15; geometric means = 4.56, 5.00 and 5.21 ng/ml; Figs 3B and 4 (red diamonds), and Table 1]. Consistent with our findings, a recent genome-wide association study reported an association of rs1799941 with circulating SHBG levels (P = 3.08 × 10−7) (8) and a candidate gene study reported an association of rs1799941 with testosterone (P < 0.01), in addition to SHBG levels (9).
In our study, circulating SHBG and testosterone were correlated (Spearman partial correlation r = 0.48, P-value <0.001; age, BMI and study controlled) and, since the clearance and bioavailability of testosterone are related to SHBG levels (2), genetic determinants of SHBG could be expected to also exhibit an association with testosterone. To disentangle the SHBG and testosterone interrelationship, we examined associations of rs1799941 with SHBG-adjusted testosterone concentration, calculated as the residuals of testosterone regressed on SHBG; rs1799941 also tended to predict SHBG-adjusted testosterone levels (P = 1.33 × 10−3). Nonetheless, measurement error and misclassification from a single measurement of these analytes may have precluded complete adjustment for underlying physiologic (versus genetic) correlations.
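The residual approach described above can be sketched as follows. The hormone values are hypothetical, and working on the log scale is an assumption made here for consistency with the main regression models; the text does not state the scale used for this step. The function name `ols_residuals` is illustrative.

```python
import math


def ols_residuals(x, y):
    """Residuals of y after simple linear regression on x (with intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


# Hypothetical measurements for six men (not study data).
shbg = [30.0, 42.0, 55.0, 38.0, 61.0, 47.0]        # nmol/l
testosterone = [3.9, 4.6, 5.4, 4.1, 5.6, 5.0]      # ng/ml

log_shbg = [math.log(v) for v in shbg]
log_testosterone = [math.log(v) for v in testosterone]

# SHBG-adjusted testosterone: the part of testosterone not linearly
# explained by SHBG. These residuals would then be regressed on genotype.
adjusted_testosterone = ols_residuals(log_shbg, log_testosterone)
print([round(r, 3) for r in adjusted_testosterone])
```

By construction the residuals average zero and are uncorrelated with SHBG in the sample, so any remaining association with a SNP cannot be attributed to the linear SHBG relationship alone.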
We also observed notable novel associations of circulating testosterone with several SNPs in ESR1, a gene encoding for estrogen receptor-α (Fig. 3B), with the strongest association for the genotyped SNP rs722208 (P = 7.37 × 10−6), conferring −8.8% mean difference in testosterone between variant and wild-type homozygotes (Table 1). This association was consistent across studies (Table 1) and strata of age, BMI and case–control status (Supplementary Material, Table S4; all P for heterogeneity >0.05). Three other genotyped SNPs (rs3020411, rs926778 and rs2348078), located in the same linkage disequilibrium block (Fig. 5), spanning exons 5 and 7, which encode for the estrogen binding domain of estrogen receptor-α (10), were also strongly associated with circulating testosterone (Supplementary Material, Table S4; P-value <10−5); however, on simultaneous multivariable adjustment of these SNPs, none was more strongly associated than the others with testosterone, due to high linkage disequilibrium (r2 > 0.7). The statistical association of SNPs in ESR1 with testosterone may be due to physiologic correlations, as ESR1 null mutations (11) are thought to impact circulating testosterone through feedback regulation via the hypothalamic–pituitary–gonadal axis (12). Estrogen receptor-α is essential for prostate development (13). Also, an in vitro reporter gene assay suggests that an ESR1 splice variant missing exon 7, resulting from a SNP in intron 6 (rs2273207), confers decreased levels of transcription in the presence of estrogen compared with wild-type (14).
Since both rs722208 in ESR1 and rs1799941 in SHBG predicted circulating testosterone levels, we evaluated the joint effect of these two SNPs. Having a greater cumulative number of alleles of rs722208 (A) and rs1799941 (A) was associated with increasing testosterone levels (mean testosterone level = 4.22, 4.56, 4.73, 5.16 and 5.22 ng/ml for 0, 1, 2, 3 and 4 allele carriers, respectively; P = 8.11 × 10−6), with each SNP contributing additively to the association, as indicated by the test for multiplicative interaction (Pinteraction = 0.75).
SRD5A2 on chromosome 2p23 encodes for steroid-5-α-reductase, which converts testosterone to the androgenically more potent DHT primarily in the prostate (2). Because DHT has rapid turnover (2), its downstream metabolite 3α-androstanediol-glucuronide is considered a circulating biomarker for intraprostatic DHT synthesis. In our study, rs2208532, in intron 1 of SRD5A2, was not associated with testosterone (P = 0.94), but was associated with circulating 3α-androstanediol-glucuronide (P = 1.82 × 10−6; −11.8% mean difference; Table 1; Fig. 3C). A previous study reported the rs9282858 SNP in exon 1 (A49T) of SRD5A2 to be a predictor of 3α-androstanediol-glucuronide (P = 0.001) (15); this rare SNP (1% MAF in SNP500) was not included in our study.
rs727479, located in intron 2 of CYP19A1, which encodes aromatase, the enzyme that converts androgens to estrogens, predicted blood estradiol levels (P = 5.06 × 10−5; Table 1; Fig. 3D). Consistent with our data, this SNP and another nearby SNP (rs749292), spanning the coding and proximal 5′ untranslated region of CYP19A1, were shown to be significantly associated with a 10–20% decrease in estradiol levels in postmenopausal women (16).
To further explore the loci identified with hormone levels, we imputed the common SNPs in these gene regions from HapMap. Generally, similar associations were noted for imputed SNPs (Fig. 3A–D, shown as empty circles). Two imputed SNPs, rs7755185 in ESR1 and rs559555 in SRD5A2, showed marginally stronger associations than genotyped SNPs in relation to testosterone [P = 3.78 × 10−6 for rs7755185 (imputed) versus P = 7.37 × 10−6 for rs3020411 (genotyped) in ESR1] and 3α-androstanediol-glucuronide [P = 4.79 × 10−7 for rs559555 (imputed) versus P = 1.82 × 10−6 for rs2208532 (genotyped) in SRD5A2], respectively; however, the imputed SNPs were highly correlated with nearby genotyped SNPs (r2 = 0.91 for rs7755185 and rs3020411 in ESR1; r2 = 0.95 for rs559555 and rs2208532 in SRD5A2). Results for SNPs and serum sex steroid levels were also similar when we used cohort-specific standardized Z scores for sex steroids, instead of absolute values (data not shown).
In a large consortium-based study, we found a significant impact of genetic factors on circulating SHBG and sex steroid hormone levels, independent of age and BMI. We identified SNPs in SHBG associated with circulating SHBG and testosterone levels at high levels of statistical confidence. Notable novel associations were also identified relating SNPs in ESR1 and SRD5A2 with testosterone and 3α-androstanediol-glucuronide, respectively.
This is the first comprehensive characterization of genetic variation in the sex steroid pathway with circulating hormone levels. The strengths of the study include its large sample size, with findings confirmed in multiple independent populations. A potential limitation of our study is that each circulating sex steroid was measured only once; hormone measures at multiple time points would have resulted in more precise estimates of circulating levels.
The underlying mechanisms for these gene-phenotype associations are unknown; they may mark for directly operating genetic causes in some cases, as is likely with gene variants in SHBG (e.g. as marked by rs1799941) and circulating SHBG, or indirectly through physiologic pathways, as may be the case for the ESR1 SNP and circulating testosterone and for SRD5A2 SNP and circulating androstanediol-glucuronide.
In summary, genetic variation in genes in the sex steroid hormone pathway is associated with differences in circulating SHBG and sex steroid hormones.
The rationale and background of BPC3 have been described elsewhere (5,17). Briefly, the prostate cancer study includes seven case–control studies nested within seven cohorts: CPS-II, the ATBC Study, the EPIC, the HPFS, the MEC, the PHS and the PLCO Cancer Screening Trial (see supplement methods for the details about the participating cohorts). Controls were matched to cases by age and ethnicity, and in some cohorts, additional matching criteria were employed (i.e. EPIC matched on country of residence). Caucasians of European descent are the predominant ethnic group in all cohorts, except the MEC. In this study, we limited our analysis of genetic variation in the sex steroid hormone pathway (Supplementary Material, Table S1; gene list) to Caucasian men with hormone measurements available (ATBC, EPIC, HPFS, PHS and PLCO). This study was approved by the institutional review boards at all institutions.
A comprehensive characterization of genetic variation in steroid hormone genes, together with the sequencing, high-density genotyping panels and genotyping data generated in BPC3, is publicly available at both the USC/Norris Comprehensive Cancer website (http://www.uscnorris.com/MECGenetics) and the NCI Core Genotyping facility (http://cgf.nci.gov/cohort.cfm). As described previously (17), haplotype-tags were selected for the initial steroid hormone genes based on the correlation (rH2) between the observed haplotypes and the haplotypes predicted by selected SNPs (18) using a high-density genotyping panel, and genotyping was conducted using TaqMan assays in four cohort-specific laboratories. Genotyping of 169 haplotype tagging SNPs in 20 genes was conducted in five separate batches between June 2003 and February 2007. Information about the primers and probes for each TaqMan assay, and sample TaqMan scatter plots from each genotyping center, can be found on the University of Southern California website (http://www.uscnorris.com/MECGenetics). In order to substantially expand the density of genotyping and to add newly hypothesized genes, we selected tag SNPs using a pairwise r2 procedure (19), which was programmed to capture alleles with an r2 ≥ 0.8 and a minor allele frequency >2% from the high-density genotyping panel and >5% from the phase II HapMap CEU panel. The selection of tag SNPs was limited to those with an Illumina design score >0.4 and genotyping was carried out with the Golden Gate Illumina BeadArray platform in four laboratories (USC, NCI, UK and Harvard). Thirty HapMap CEU trios were genotyped in all labs to examine inter-lab reproducibility and the concordance was 99.5%. Within each study, blinded duplicate samples (~5%) were included with concordance estimates of 97.2–99.9% across the studies. Approximately 94% of the SNPs attempted yielded reliable genotypes. Greater than 95% of the polymorphisms yielded genotyping success rates greater than 90%.
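The idea behind pairwise r2 tagging can be illustrated with a simple greedy sketch. This is not the procedure cited above (19), whose details are not given here: it assumes r2 is approximated by the squared Pearson correlation of 0/1/2 genotype codes, uses hypothetical SNP names and genotypes, and ignores the minor allele frequency and design-score filters. Only the r2 ≥ 0.8 threshold is taken from the text.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two genotype vectors coded 0/1/2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        return 0.0  # a monomorphic SNP carries no correlation information
    return sxy * sxy / (sxx * syy)


def greedy_tag_snps(genotypes, threshold=0.8):
    """Repeatedly pick the SNP that captures the most still-untagged SNPs."""
    untagged = set(genotypes)
    tags = []
    while untagged:
        def n_captured(snp):
            return sum(r_squared(genotypes[snp], genotypes[other]) >= threshold
                       for other in untagged)
        best = max(sorted(untagged), key=n_captured)  # ties: alphabetical order
        captured = {other for other in untagged
                    if r_squared(genotypes[best], genotypes[other]) >= threshold}
        captured.add(best)
        tags.append(best)
        untagged -= captured
    return tags


# Hypothetical genotypes for eight individuals at three SNPs.
genotypes = {
    "snp_a": [0, 1, 2, 0, 1, 2, 0, 1],
    "snp_b": [0, 1, 2, 0, 1, 2, 0, 1],   # in perfect LD with snp_a
    "snp_c": [0, 0, 1, 1, 2, 2, 0, 0],   # weakly correlated with snp_a
}
tags = greedy_tag_snps(genotypes)
print(tags)
```

In this toy example one of the two perfectly correlated SNPs is enough to represent both, while the weakly correlated SNP must be genotyped in its own right.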
Data from the TaqMan and Golden Gate platforms were filtered separately. Any sample where greater than 25% of the SNPs attempted on a given platform failed was removed from the data set. Data were then filtered by study to remove poorly performing SNPs: all SNPs that failed on 25% or more samples were excluded from the data set, as were all SNPs that showed statistically significant (P < 10−5) deviations from HWE genotype frequencies among European-ancestry controls, and all SNPs with MAF < 1%. Any SNP that was missing or excluded in more than three studies or exhibited large differences in European-ancestry allele frequencies across cohorts (Fst > 0.02) was excluded from further analysis. On average, 90% of SNPs per gene were retained after QC.
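The per-SNP filters above can be sketched as follows, using the thresholds stated in the text (failure on 25% or more of samples, HWE P < 10−5, MAF < 1%). The choice of a one-degree-of-freedom chi-square goodness-of-fit test for HWE is an assumption, since the specific test is not named here, and the genotype counts are hypothetical. In the study the HWE check was applied among European-ancestry controls.

```python
import math


def minor_allele_frequency(n_aa, n_ab, n_bb):
    """Minor allele frequency from genotype counts."""
    n = n_aa + n_ab + n_bb
    freq_a = (2 * n_aa + n_ab) / (2 * n)
    return min(freq_a, 1 - freq_a)


def hwe_p_value(n_aa, n_ab, n_bb):
    """P-value of a 1-df chi-square goodness-of-fit test for HWE (assumed test)."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1 - p
    if p == 0 or q == 0:
        return 1.0  # monomorphic: no deviation can be tested
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of chi-square with 1 df equals erfc(sqrt(chi2 / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


def snp_passes_qc(n_aa, n_ab, n_bb, n_failed):
    """Apply the failure-rate, HWE and MAF filters to one SNP in one study."""
    n_called = n_aa + n_ab + n_bb
    failure_rate = n_failed / (n_called + n_failed)
    return (failure_rate < 0.25
            and hwe_p_value(n_aa, n_ab, n_bb) >= 1e-5
            and minor_allele_frequency(n_aa, n_ab, n_bb) >= 0.01)


# Hypothetical genotype counts (AA, AB, BB) and number of failed samples.
print(snp_passes_qc(360, 480, 160, 10))    # well-behaved SNP
print(snp_passes_qc(500, 0, 500, 0))       # no heterozygotes: HWE violation
print(snp_passes_qc(990, 10, 0, 0))        # MAF below 1%
print(snp_passes_qc(360, 480, 160, 400))   # too many failed samples
```

The cross-study steps (exclusion when a SNP is missing in more than three studies, and the Fst > 0.02 filter) are not covered by this sketch.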
We imputed SNPs that were polymorphic in the 37 gene regions in any of the HapMap reference panels using observed genotypes from the BPC3 subjects and phased haplotypes from HapMap samples (release #21) using the MACH2QTL program (Y. Li et al., submitted for publication, 20). Imputation was performed stratified by study and ethnicity, and the imputed data were filtered by study and ethnicity to remove poorly imputed SNPs. SNPs with an estimated correlation between the imputed and true genotypes of less than 30% were excluded from analysis (Y. Li et al., submitted for publication). Any imputed SNP that was excluded in more than three European-ancestry strata was excluded from analysis.
Levels of androstanediol-glucuronide (ng/ml), testosterone (ng/ml), estradiol (pg/ml) and sex hormone binding globulin (SHBG; nmol/l) were previously measured in blood samples of prostate cancer cases and control subjects from prospective cohort studies [SHBG (nmol/l, N = 4720), testosterone (ng/ml, N = 4678), 3α-androstanediol-glucuronide (ng/ml, N = 4767) and 17β-estradiol (pg/ml, N = 2014)] (7,21–24). The blood samples were collected at study baseline in these cohorts, before prostate cancers developed. Detailed information on sex steroid hormone assay methods and coefficients of variations are shown in the Supplementary Material, Table S2.
A total of 874 genotyped SNPs and an additional 3191 imputed SNPs from 37 genes in the sex steroid hormone pathway were available for the data analysis (Supplementary Material, Table S1). Analysis was limited to Caucasian men with sex steroid assays completed. The relationships between SNPs and sex steroids were assessed using linear regression, after log transformation of hormone values, adjusting for BMI, age, prostate cancer case–control status and cohort. The log transformation both improved normality of the distribution of the assay values and helped to stabilize the variances of the assays performed using different methodologies. We tested for heterogeneity among cohorts, based on Q statistics (25). To disentangle the SHBG and testosterone correlation, we calculated SHBG-adjusted testosterone levels, by creating residuals of testosterone regressed on SHBG, and examined associations of SHBG variants with the SHBG-adjusted testosterone level. Pairwise linkage disequilibrium measures (D′ and r2) were estimated using the program Haploview (http://www.broad.mit.edu/personal/jcbarret/haploview/). All statistical analyses were conducted using SAS version 9.1 software (SAS Institute, Inc., Cary, NC, USA) and PLINK (http://pngu.mgh.harvard.edu/purcell/plink/). R was used to generate the figures.
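A heterogeneity test based on the Q statistic can be sketched as below. The sketch assumes the standard inverse-variance form of Cochran's Q applied to per-cohort regression coefficients; the per-cohort estimates and standard errors are hypothetical. The P-value helper uses a closed form that holds only for even degrees of freedom, which covers the case of five cohorts (4 df).

```python
import math


def cochran_q(estimates, standard_errors):
    """Return (inverse-variance pooled estimate, Q statistic, degrees of freedom)."""
    weights = [1.0 / se ** 2 for se in standard_errors]
    pooled = sum(w * b for w, b in zip(weights, estimates)) / sum(weights)
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, estimates))
    return pooled, q, len(estimates) - 1


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    return math.exp(-half) * sum(half ** k / math.factorial(k)
                                 for k in range(df // 2))


# Hypothetical per-cohort log-scale SNP coefficients and standard errors.
betas = [0.11, 0.13, 0.12, 0.10, 0.14]
ses = [0.03, 0.04, 0.02, 0.05, 0.03]

pooled, q, df = cochran_q(betas, ses)
p_heterogeneity = chi2_sf_even_df(q, df)
print(f"pooled beta = {pooled:.3f}, Q = {q:.2f} on {df} df, "
      f"P for heterogeneity = {p_heterogeneity:.2f}")
```

A P for heterogeneity above 0.05, as in this toy example, indicates that the cohort-specific estimates are compatible with a single common effect.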
This work was supported by National Cancer Institute cooperative agreements UO1-CA98233, UO1-CA98710, UO1-CA98216 and UO1-CA98758 and by the Intramural Research Program of NIH/National Cancer Institute, Division of Cancer Epidemiology and Genetics.
Conflict of Interest statement. None declared.