A potential susceptibility locus for colorectal cancer on chromosome 9p24 (rs719725) was initially identified through a genome-wide association study, though replication attempts have been inconclusive.
We genotyped this locus and explored interactions with known risk factors as potential sources of heterogeneity, which may explain the previously inconsistent replication. We included Caucasians with colorectal adenoma or colorectal cancer and controls from four studies (total 3891 cases, 4490 controls): the Women’s Health Initiative (WHI); the Diet, Activity and Lifestyle Study (DALS); a Minnesota population-based case-control study (MinnCCS); and the Prostate, Lung, Colorectal and Ovarian Cancer Screening Trial (PLCO). We used logistic regression to evaluate the association and test for gene-environment interactions.
SNP rs719725 was statistically significantly associated with risk of colorectal cancer in WHI (OR per A allele 1.19; 95% CI 1.01–1.40; p-trend 0.04), marginally associated with adenoma risk in PLCO (OR per A allele 1.11; 95% CI 0.99–1.25; p-trend 0.07), and not associated in DALS and MinnCCS. Evaluating for gene-environment interactions yielded no consistent results across the studies. A meta-analysis of seventeen studies (including these four) gave an OR per A allele of 1.07 (95% CI 1.03–1.12; p-trend 0.001).
Our results suggest the A allele for SNP rs719725 at locus 9p24 is positively associated with a small increase in risk for colorectal tumors. Environmental risk factors for colorectal cancer do not appear to explain heterogeneity across studies.
If this finding is supported by further replication and functional studies, it may highlight new pathways underlying colorectal neoplasia.
Colorectal cancer is a major cause of morbidity and mortality in the United States. An estimated 142,520 new cases of colorectal cancer and 51,370 deaths from the disease were expected in the US in 2010 (1). Colorectal cancer is the third most commonly diagnosed cancer and the third leading cause of cancer-related deaths in the US when men and women are considered separately (2), and the second leading cause for both sexes combined (3). A substantial proportion of colorectal cancer is due to genetic risk factors, with estimates of heritable effects reaching as high as 35 percent (95% confidence interval (CI) 10–48%) in a study of twins from Finland, Denmark and Sweden (4). However, less than five percent of colorectal cancer is due to known highly penetrant variants inherited in an autosomal dominant manner (5). Thus, a large proportion of the remaining uncharacterized inherited susceptibility is expected to be due to numerous low-penetrance variants (6). High-throughput technology, which allows the simultaneous ascertainment of hundreds of thousands of single nucleotide polymorphisms (SNPs), has been especially useful in exploring such potential variants, particularly in the context of genome-wide association studies (GWAS).
Recent GWAS have succeeded in identifying 10 genetic regions that contain SNPs associated with colorectal cancer risk (7). One of these regions, 8q24, has been replicated in multiple studies and has also been implicated in various other cancers, such as prostate, breast, and bladder cancer (8). Along with 8q24, a second locus, 9p24, was identified in the Assessment of Risk for Colorectal Tumors in Canada (ARCTIC) genome-wide scan (8). Within this region, rs719725 was identified as being the most strongly associated with colorectal cancer (OR 1.14, p = 1.32 × 10⁻⁵), although with lower statistical significance and less cross-study consistency than for 8q24 (8). However, in replication studies by these researchers, only three of seven study populations showed a statistically significant association consistent with the GWAS finding, resulting in a lower strength of association (OR 1.08, p = 0.023) (8).
A subsequent study replicated the rs719725 association with colorectal cancer using a case-unaffected sibling control design from the Colon Cancer Family Registry (9). This study found a statistically significant association among population-based families (p = 0.011), but not among clinic-based families (p = 0.97), although the difference between these two was, itself, not statistically significant (p = 0.26) (9). A later study was unable to replicate rs719725 using a combined analysis of three study populations from Sheffield, U.K., Leeds, U.K., and Utah, U.S.A. (OR 1.04, 95% CI 0.91–1.19) (10). Overall results from these subsequent replications were inconsistent, demonstrating an association in some populations while not in others (8, 9, 10).
While GWAS have identified ten SNPs associated with colorectal tumor risk, the biological functionality of these markers is generally unknown. At the 9p24 locus, SNP rs719725 does not lie within a gene itself, but there are several genes nearby. These include: protein kinase NYD-SP25 isoform 3 (TPD52L3, 37 kb telomeric), which is a member of the tumor protein D52 family; interleukin 33 (IL33, 124 kb telomeric); ubiquitin-like PHD and RING finger domain-containing protein (UHRF2, 47 kb centromeric); and glycine dehydrogenase (GLDC, 167 kb centromeric) (9). Potentially, rs719725 could also be in linkage disequilibrium with a variant that modifies a long-range enhancer of any of these genes, as seen for the variants in 8q24 and MYC (11). rs719725 is upstream of UHRF2 and downstream of TPD52L3, IL33, and GLDC; it lies in a haplotype block that is ~114 kb in size, which is part of a larger block that is ~407 kb in size (Haploview 4.1, release 21, CEU population) (12). The small block contains TPD52L3; the large block includes TPD52L3, IL33, and UHRF2. As with other tag SNPs, associations between rs719725 and colorectal tumors may be due to linkage disequilibrium between the genetic marker and the true susceptibility allele, or alleles, at a neighboring locus. Beyond risk identification, a recent study found rs719725 to be statistically significantly associated with time to tumor recurrence in adjuvant colorectal cancer patients (13), suggesting that this SNP may also be important for prognosis and survival.
In order to further investigate whether the genetic variant on 9p24 is associated with colorectal tumors, we examined the impact of rs719725 on both colorectal adenoma and cancer, allowing the capture of a broader spectrum of colorectal tumor development. Because the large majority of colorectal malignancies develop from an adenomatous polyp (14), any genetic influence on these precursor lesions is likely to have an effect on cancer as well. We examined this association in four well-characterized study populations. These studies allowed for a detailed investigation of potential interactions with environmental risk factors known to be associated with colorectal cancer risk, including obesity, physical activity, non-steroidal anti-inflammatory drug use, smoking, and diet. Such factors may contribute to heterogeneity between studies, and may help to explain the inconsistent findings previously reported.
To investigate the full spectrum of colorectal cancer we included both colorectal adenoma and colorectal cancer from four study populations (total 3891 cases, 4490 matched controls): the Women’s Health Initiative (WHI: 656 colorectal cancer cases, 664 matched controls); the Diet, Activity and Lifestyle Study (DALS: 1461 colon cancer cases, 1813 matched controls); a Minnesota population-based case-control study (MinnCCS: 517 colorectal adenoma cases, 628 matched controls); and the Prostate, Lung, Colorectal and Ovarian Cancer Screening Trial (PLCO: 1257 colorectal adenoma cases, 1385 matched controls).
The WHI Observational Study (OS) is a prospective cohort study, which enrolled 93,676 postmenopausal women aged 50–79 years living near 40 clinical centers across the US from 1993 to 1998, with continuous follow-up (15). We included invasive incident colorectal cancer cases diagnosed through September 12, 2005. Controls were randomly selected and individually matched on age at screening, enrollment date, ethnicity, hysterectomy status, and prevalent colorectal cancer at baseline (16), using risk-set sampling (17).
DALS is a population-based case-control study of colon cancer (18). Cases and controls were recruited from the Kaiser Permanente Medical Care Program of Northern California, an eight-county area in Utah, and the metropolitan Twin Cities of Minnesota. Cases had been diagnosed with first primary colon cancer between October 1, 1991 and September 30, 1994. Cancers of the rectosigmoid junction or rectum were not included in this study. Controls were matched to cases by 5-year age groups and sex, and came from membership lists (Kaiser), driver license lists (Utah up to age 65, Minnesota), Health Care Financing Administration lists (Utah over age 65), or state identification lists (Minnesota).
The MinnCCS is a case-control study within the Minnesota Cancer Prevention Research Unit (19). Cases and controls, aged 30–74 years, were recruited from patients that were scheduled for colonoscopy from April 1991 to April 1994 within a large, multi-clinic gastroenterology practice in metropolitan Minneapolis. Cases were subjects found to have adenoma at the time of colonoscopy, and controls were subjects found to be without adenomatous or hyperplastic polyps at colonoscopy (20).
The PLCO is a randomized trial of ~155,000 persons aged 55–74, enrolled during 1993–2001 from 10 US centers, to evaluate the effectiveness of screening on cancer mortality (21). We conducted a nested case-control study among participants randomized to the screening arm, who underwent a 60-cm flexible sigmoidoscopy examination at study entry. Cases included participants with advanced colorectal adenoma of the distal colon or rectum (adenoma ≥ 1 cm in size, diagnosed as: villous/tubulovillous; high-grade dysplasia; or carcinoma in situ). Controls were participants who had a successful baseline screening exam that was negative for polyps in the distal colon and rectum. Controls were frequency-matched to cases on ethnicity, sex, and, for a subset, age (21).
Further details regarding the methods used for selection of subjects and data collection in these studies have been previously reported. Given the small proportion of non-Caucasians in these studies (2.9% to 16.2%) all four datasets were restricted to Caucasians only, which the numbers above reflect. All reported studies obtained informed consent and received IRB approval.
For the WHI and DALS studies, genotyping was performed by MALDI-TOF mass spectrometry on the Sequenom MassARRAY 7K platform using the iPLEX Gold (low-plex) reaction (16). For the PLCO and MinnCCS studies, genotyping was performed using the TaqMan assay system: PLCO used the Fluidigm Biomark system (Fluidigm, South San Francisco, CA, USA) and MinnCCS used an ABI 7900HT instrument (ABI, Foster City, CA, USA) (20). These methods have been described in further detail elsewhere (16, 20). Quality-control measures in these studies included call-rate cutoffs, blinded duplicates, and other standard methods. In all four studies, the minor allele frequency (MAF; C allele) was ~37% and controls were in Hardy-Weinberg equilibrium (HWE; p > 0.18).
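The Hardy-Weinberg check on controls can be illustrated with a short sketch. The text does not state which HWE test was used, so the code below assumes the common chi-square goodness-of-fit test with 1 degree of freedom. The function name `hwe_chi_square` and the genotype counts in the usage lines are hypothetical and are not study data.

```python
import math


def hwe_chi_square(n_aa, n_ac, n_cc):
    """Chi-square goodness-of-fit test (1 df) for Hardy-Weinberg equilibrium.

    Returns (C allele frequency, chi-square statistic, p-value).
    Assumption: this is one standard HWE test, not necessarily the one
    used in the studies described in the text.
    """
    n = n_aa + n_ac + n_cc
    p = (2 * n_aa + n_ac) / (2 * n)  # frequency of the A allele
    q = 1.0 - p                      # frequency of the C allele
    observed = (n_aa, n_ac, n_cc)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    # Skip cells with zero expected count to avoid division by zero.
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Chi-square survival function with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return q, chi2, p_value


# Hypothetical control genotype counts (AA, AC, CC), for illustration only.
c_freq, chi2, p_value = hwe_chi_square(400, 470, 130)
print(f"C allele frequency={c_freq:.3f}, chi2={chi2:.3f}, p={p_value:.3f}")
```

A p-value above the chosen threshold would give no evidence of departure from HWE among controls.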
We calculated odds ratios (ORs) and 95% confidence intervals (95% CIs) to estimate the association between SNP rs719725 and colorectal neoplasia. We conducted logistic regression analyses for each study separately, including study-specific covariates, such as age, sex, and study center, as appropriate. To be consistent with most previous studies, we used the minor allele C as the reference allele. We calculated risk based on the log-additive model, by including a single variable coded as 0, 1, or 2, reflecting the number of copies of the major A allele. We also evaluated risk for an unrestricted model, without any assumption of the underlying genetic model, which provides OR estimates comparing genotypes AC vs. CC and AA vs. CC. We combined the individual study-specific adjusted log-ORs using a linear mixed-effects model and then estimated the overall ORs and 95% CIs (22).
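The log-additive coding described above can be sketched in a few lines. This is a minimal illustration and not the authors' analysis code: the helper names are hypothetical, and the only value taken from the paper is the WHI per-allele OR of 1.19. The AA vs. CC value it prints is what the log-additive model implies, not an estimate reported by the study.

```python
import math


def a_allele_count(genotype):
    """Code a genotype such as 'AC' as the number of A alleles (0, 1, or 2)."""
    return genotype.upper().count("A")


def or_with_ci(beta, se, z=1.959964):
    """Convert a logistic-regression coefficient and its SE to an OR with a 95% Wald CI."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def implied_genotype_ors(per_allele_or):
    """Under the log-additive model, AC vs. CC equals the per-allele OR
    and AA vs. CC equals its square."""
    return {"AC_vs_CC": per_allele_or, "AA_vs_CC": per_allele_or ** 2}


# CC is the reference genotype (0 copies of A), matching the coding in the text.
print([a_allele_count(g) for g in ("CC", "AC", "AA")])
# Per-allele OR reported for WHI; the AA vs. CC value is model-implied only.
print(implied_genotype_ors(1.19))
```

The unrestricted model, by contrast, estimates the AC vs. CC and AA vs. CC odds ratios separately, so comparing its estimates with the implied values is one informal check on the additive assumption.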
To investigate whether the association between rs719725 and cancer differed by anatomical location, we also performed stratified analyses for colon vs. rectum. Cases with tumors located at both the colon and the rectum and those with unknown location were excluded from these stratified analyses. Polytomous regression was used to evaluate statistical differences among models. Since WHI only included females, we also evaluated these associations stratified by sex.
Risk factors potentially related to colorectal cancer were evaluated for interaction with rs719725 using likelihood ratio tests. The pertinent risk factors considered were: family history of colorectal cancer (having at least one first-degree relative, yes/no); cigarette smoking (never/former/current and pack-years); use of non-steroidal anti-inflammatory drugs (NSAIDs, yes/no); alcohol consumption (g/day); physical activity (hours of moderate/vigorous exercise per week); body-mass index (BMI, kg/m²); folate intake (total folate in mcg/day); calcium intake (mg/day); red meat intake (g/day); and use of postmenopausal hormones (never/former/current and duration). Dietary risk factors were adjusted for daily caloric intake by including this term in the regression models used for the likelihood ratio tests. Risk factors with a statistically significant test in at least two studies were considered to lead to potential interaction effects, and OR estimates were calculated stratified on that factor. To assess the overall effect of the interaction, we also performed random-effects meta-analyses of the betas and standard errors of the interaction terms for each risk factor.
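A likelihood ratio test for interaction compares the maximized log-likelihood of a model with the SNP-by-factor interaction term(s) against the same model without them. The sketch below shows only the final step, assuming the two log-likelihoods have already been obtained from fitted logistic models. The function name and the log-likelihood values are hypothetical, and the closed-form p-values cover only 1 or 2 degrees of freedom (for example, a binary factor or a three-level factor such as never/former/current).

```python
import math


def lrt_p_value(loglik_reduced, loglik_full, df):
    """Likelihood ratio test of nested models.

    stat = 2 * (loglik_full - loglik_reduced), compared with a chi-square
    distribution on `df` degrees of freedom (df = number of interaction terms).
    Closed-form tail probabilities are used, so only df = 1 or 2 is supported.
    """
    stat = max(2.0 * (loglik_full - loglik_reduced), 0.0)
    if df == 1:
        p_value = math.erfc(math.sqrt(stat / 2.0))
    elif df == 2:
        p_value = math.exp(-stat / 2.0)
    else:
        raise ValueError("closed form implemented only for df = 1 or 2")
    return stat, p_value


# Hypothetical log-likelihoods for models without and with an interaction term.
stat, p_value = lrt_p_value(loglik_reduced=-1250.0, loglik_full=-1248.5, df=1)
print(f"LRT statistic={stat:.2f}, p={p_value:.3f}")
```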
We performed a random-effects meta-analysis combining our four studies with thirteen results presented in three previously published studies on rs719725 and colorectal cancer (8, 9, 10). For this combined analysis, we included the log-additive OR estimates and 95% CIs for each study population; for a few published results we needed to transform from A allele referent to C allele referent (10), or from an unrestricted model to a log-additive model (9). Forest plots were used to display the results from individual studies, as well as the summary results. The statistical significance of between-study heterogeneity was evaluated using Cochran’s Q statistic (23). If the p-value was less than 0.10, the heterogeneity was considered statistically significant. We also quantified heterogeneity using the I² metric. I² takes values between 0 percent and 100 percent, with higher values indicating higher levels of heterogeneity (24). Potential for publication bias was assessed using Egger’s test and visual inspection of funnel plots (25).
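The pooling steps can be sketched as follows. The text does not name the random-effects estimator, so the DerSimonian-Laird method used here is an assumption, and all function names are hypothetical. The usage lines pool only the two per-allele estimates reported in the text (WHI and PLCO) to show the mechanics; they do not reproduce the 17-study result. The p-value for Cochran's Q is omitted because it needs a chi-square tail function for arbitrary degrees of freedom.

```python
import math

Z95 = 1.959964  # standard normal quantile for a 95% CI


def log_or_and_se(odds_ratio, ci_low, ci_high):
    """Recover the log-OR and its standard error from an OR and its 95% CI."""
    return math.log(odds_ratio), (math.log(ci_high) - math.log(ci_low)) / (2 * Z95)


def flip_reference(odds_ratio, ci_low, ci_high):
    """Switch the referent allele: invert the OR and swap the inverted CI bounds."""
    return 1 / odds_ratio, 1 / ci_high, 1 / ci_low


def random_effects(betas, ses):
    """DerSimonian-Laird random-effects pooling with Cochran's Q and I^2 (percent)."""
    w = [1 / s ** 2 for s in ses]
    fixed = sum(wi * b for wi, b in zip(w, betas)) / sum(w)
    q = sum(wi * (b - fixed) ** 2 for wi, b in zip(w, betas))
    df = len(betas) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0  # between-study variance
    w_re = [1 / (s ** 2 + tau2) for s in ses]
    beta = sum(wi * b for wi, b in zip(w_re, betas)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return {"beta": beta, "se": se, "or": math.exp(beta),
            "ci_low": math.exp(beta - Z95 * se),
            "ci_high": math.exp(beta + Z95 * se),
            "q": q, "tau2": tau2, "i2": i2}


# Per-A-allele ORs and 95% CIs reported in the text for WHI and PLCO.
reported = [(1.19, 1.01, 1.40), (1.11, 0.99, 1.25)]
betas, ses = zip(*(log_or_and_se(*r) for r in reported))
print(random_effects(betas, ses))
```

`flip_reference` corresponds to the referent-allele transformation mentioned above: an OR per C allele of 0.93 becomes an OR per A allele of about 1.08, with the confidence limits inverted and swapped.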
Analyses were performed using SAS version 9.1 (SAS Institute, Cary, NC), STATA version 11 (StataCorp, College Station, TX), and HaploView version 4.2 (12). Our statistical significance cutoff was a p-value of 0.05, and for marginal statistical significance was 0.10.
Demographic information for each study is shown in Tables 1a and 1b. Participants in the four studies had a mean age of 57.9 to 67.1 years. By definition the WHI study population was 100% female, whereas the proportion female was 45.5% in DALS, 50.7% in MinnCCS, and 35.8% in PLCO. As cases and controls were matched on age and sex, there were no substantial differences between cases and controls on these variables. Due to restriction, all participants were self-reported Caucasian. As expected, cases were more likely to have a positive family history of CRC, except in MinnCCS. The difference in family history in MinnCCS is plausibly explained by the fact that participants were selected among those who elected to get screened for CRC: for cases, indications for screening are likely to be symptom- or disease-related, whereas among controls, screening was more likely sought by the worried well with a positive family history. Except in WHI, cases tended to be more likely to be current or former smokers. The proportion of smokers was generally similar across these populations. Rectal cancers made up 18.2% of the WHI cases and 0% of the DALS cases; rectal adenoma made up 16.3% of the MinnCCS cases and 22.3% of the PLCO cases (Table 2).
In the WHI study, rs719725 was statistically significantly associated with risk of colorectal cancer, with an OR of 1.19 per A allele (95% CI 1.01–1.40; p-trend 0.04; Table 3). Similarly, the PLCO data provided a marginally statistically significant finding in the same direction for risk of advanced adenoma (OR 1.11 per A allele; 95% CI 0.99–1.25; p-trend 0.07). In both MinnCCS and DALS, rs719725 was not associated with risk of colorectal tumors. The OR estimates from the unrestricted models were generally consistent in their direction and trend with the results from the additive models.
To further evaluate the association between rs719725 and colorectal neoplasia risk, we performed logistic regression analyses stratified by tumor location (Table 3). These stratified analyses did not show any statistically significant differences between colon and rectal tumors in any study (p for difference > 0.07). Stratifying these analyses by sex did not produce any significantly different results (Table 4).
Only three interactions were found to be statistically significant, and all were in MinnCCS (Table 5). These interactions were between rs719725 and each of folate intake (p = 0.02), use of postmenopausal hormones (HRT; p = 0.001), and duration of HRT use (p = 0.01).
A meta-analysis of the four studies presented here and 13 previous studies (total n = 17, Figure 1) resulted in an overall OR of 1.07 (95% CI 1.03–1.12) per A allele (p = 0.001), showing a statistically significant association between rs719725 and colorectal neoplasia across the study populations. There was little evidence for between-study heterogeneity (I² = 24.4%, heterogeneity p = 0.17). Information (where available) on the number of cases and controls, minor allele frequency (MAF), HWE, and sex for the other studies included in the meta-analysis is shown in Table 6.
Because the interpretation of the results in MinnCCS may be impacted by the large fraction of controls with a positive family history, given that controls were selected among those that chose to be screened for CRC, we also obtained results excluding this study. Taking the MinnCCS study out of the meta-analysis, however, did not produce substantially different results (OR 1.08 per A allele; 95% CI 1.03 – 1.13; p = 0.001). Due to potential differences between adenoma and cancer we also obtained results among studies of cancer only. Taking the adenoma studies PLCO and MinnCCS out of the meta-analysis, however, also did not produce substantially different results (OR 1.08 per A allele; 95% CI 1.03 – 1.13; p = 0.002).
Among four well-characterized epidemiologic studies, we found a statistically significant association in one nested case-control study (WHI, colorectal cancer) and a marginally significant association in another (PLCO, advanced adenoma), whereas the two other case-control studies (DALS, colon cancer; MinnCCS, adenoma) did not show evidence for association. We were not able to explain the differences in our results by tumor site or by interactions with environmental risk factors for colorectal cancer. A meta-analysis of 17 studies (including these 4) showed a statistically significant association for this SNP with colorectal tumor. It is important to note that, although the primary distinction among the four studies presented here is the finding of an association in the cohort-based studies but not in the case-control studies, that is not what characterizes the rest of the data. Indeed, the first study to report the association was a case-control study.
Our results are similar to those of other publications that showed inconsistent findings for rs719725 and risk of colorectal cancer. Summarizing findings in a meta-analysis of all available results provided evidence for a positive association between the rs719725 A allele and colorectal tumor (OR 1.07, p = 0.001). However, the results do not reach genome-wide significance levels of 10⁻⁷ to 10⁻⁸ (26, 27). It is notable that only 5 of the 17 reported risk estimates showed no positive association, and only 2 of these showed an inverse association (Figure 1). We observed no statistical evidence for heterogeneity, though we may not have had the necessary power to detect it (24, 28). A funnel plot of the estimates of the association used in the meta-analysis did not demonstrate any evidence of publication bias (Egger’s test p = 0.68), though our power to detect such bias may be limited (Figure 2).
The weak OR estimates found in this study are not surprising, given the generally weak risk estimates commonly found in GWAS. Although it seems likely that this particular SNP carries only a slight increase in risk of colorectal tumor, its identification, if true, would nevertheless be important for identifying potentially new pathways relevant to preventive and treatment strategies. Furthermore, in concert with many other susceptibility loci, rs719725 could potentially be used to measure an overall risk profile for colorectal tumor. A recent review estimated that ~172 SNPs account for all of the genetic variance for colorectal cancer (7). In this way, rs719725 could potentially resemble a single small piece of the large colorectal cancer genetic risk profile puzzle; the problem of what is sufficient and what is necessary is likely to remain unresolved for a long time.
Among the four study populations, we did not find any consistent evidence for any gene-environment interaction. We observed only three statistically significant interactions, all in one study (MinnCCS), between rs719725 and folate intake, HRT use, and duration of HRT use, but these were not replicated in the other three study populations. These three findings do not remain statistically significant after adjustment for multiple comparisons. Further, the lack of a clear pattern across all four studies, along with observing these interactions in the smallest of the four studies, suggests that these findings are likely to be spurious. Our hypothesis that these environmental factors could explain some of the heterogeneity between studies is not supported by our data, though we may have been underpowered to explore such effects or perhaps did not evaluate the correct environmental factor. We included essentially all of the currently known CRC environmental factors. The biological function of this SNP is unknown, and therefore we did not have an a priori hypothesis to examine interactions with specific environmental factors. We chose the approach of evaluating all established environmental risk factors as potential effect modifiers, with the goal of determining whether these factors are potentially responsible for the inconsistent results reported in previous papers.
A limitation of this study is the inability to investigate gene-gene interactions, which may be important for this locus. It is possible that such interactions may play a role in explaining the different results across studies, and future research will be necessary to determine how risk is impacted by considering multiple genetic variants simultaneously. We examined only the previously reported SNP in this region and it is possible that other variants are more relevant. While we restricted study participants to only include Caucasians, population stratification is still a potential reason for the inconsistent findings reported across studies. The definition of an adenoma case was also more broadly interpreted in MinnCCS than in PLCO, and this difference in definition could have contributed to the inconsistent association found in these two studies.
Strengths of this study include the use of multiple studies with large sample sizes and the availability of detailed information on environmental risk factors for colorectal cancer, allowing us to evaluate whether environmental factors modify the association between this SNP and colorectal tumor.
Although we can neither prove nor disprove the existence of a risk locus on chromosome 9p24 that is tagged by SNP rs719725, our analysis, as well as the meta-analysis including previously published studies, provides additional support for an association between variation at 9p24 and colorectal neoplasia. Further replication in larger studies seems to be warranted. If the association continues to be replicated, this could be followed by studies to identify the disease-related polymorphism being tagged and its possible function. It may also be pertinent to evaluate this association among different races and ethnicities, and to further explore potential interactions with other genetic or environmental factors.
The project described was supported by Award Numbers R01 CA059045 and R01 CA120582 from the National Cancer Institute. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Cancer Institute or the National Institutes of Health. This research was also supported by the Intramural Research Program of the Division of Cancer Epidemiology and Genetics and by contracts from the Division of Cancer Prevention, National Cancer Institute, NIH, DHHS. The authors would like to thank Dr. Roberd Bostick and Ms. Lisa Fosdick for their contributions to the study design and data collection in the MinnCCS. The authors thank Drs Christine Berg and Philip Prorok, Division of Cancer Prevention, National Cancer Institute, the Screening Center investigators and staff of the Prostate, Lung, Colorectal, and Ovarian (PLCO) Cancer Screening Trial, Mr. Tom Riley and staff, Information Management Services, Inc., Ms. Barbara O’Brien and staff, Westat, Inc., Mr. Tim Sheehy and staff, DNA Extraction and Staging Laboratory, SAIC-Frederick, Inc., and Ms. Jackie King and staff, BioReliance, Inc. Most importantly, we acknowledge the study participants for their contributions to making this study possible.