In this study, we used LCLs from the well-genotyped International HapMap collection and identified 18 unique SNPs that associate with carboplatin sensitivity from >2 million SNPs. One of these was replicated in a set of independent LCL samples (Bonferroni-corrected p<5×10⁻²). The SNP of interest (rs1649942) shows an r² of 0.20 and 0.23 with carboplatin IC50 in the CEU discovery and validation samples, respectively, suggesting that this SNP explains about 20% of the phenotypic variation. We found that this SNP was associated with PFS and OS in a phase 1 analysis of 377 Australian ovarian cancer patients who received at least 4 cycles of carboplatin-based chemotherapy. However, in a larger, second phase of evaluation of patient samples, we did not replicate these findings. The potential mechanism of action of this SNP in LCLs may be through its association with the expression of 18 target genes (e.g., ALDH2 and KYNU, using a stringent Bonferroni cutoff). Ten of these target gene expression traits are also correlated with carboplatin sensitivity in LCLs.
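The r² values quoted above can be read as the squared correlation between allele count and the drug-sensitivity phenotype, which under a simple additive model equals the fraction of phenotypic variance attributable to the SNP. The following is a minimal sketch of that calculation, not the analysis pipeline used in this study; the function name and the toy genotype and IC50 values are hypothetical and chosen only for illustration.

```python
from statistics import mean

def variance_explained(allele_counts, phenotypes):
    """Squared Pearson correlation between allele counts (0/1/2) and a trait.

    Under an additive single-SNP linear model this equals the proportion
    of phenotypic variance explained by the SNP.
    """
    g_bar, y_bar = mean(allele_counts), mean(phenotypes)
    s_gy = sum((g - g_bar) * (y - y_bar) for g, y in zip(allele_counts, phenotypes))
    s_gg = sum((g - g_bar) ** 2 for g in allele_counts)
    s_yy = sum((y - y_bar) ** 2 for y in phenotypes)
    return s_gy ** 2 / (s_gg * s_yy)

# Hypothetical toy data: six cell lines, allele counts and log-scale IC50 values.
toy_genotypes = [0, 0, 1, 1, 2, 2]
toy_log_ic50 = [1.0, 2.0, 2.0, 3.0, 3.0, 4.0]
toy_r2 = variance_explained(toy_genotypes, toy_log_ic50)
print(f"r^2 = {toy_r2:.3f}")
```

An r² near 0.20, as observed for rs1649942, would correspond to a much noisier relationship than this toy example.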
There is a pressing need to identify germline variation that predicts response to standard therapy for advanced ovarian cancer (platinum plus taxane) since the 5-year survival rate is approximately 45%. In fact, ovarian cancer kills approximately 15,000 women in the United States every year, and more than 140,000 women worldwide (25). Thus, identifying those at risk for non-response to certain chemotherapy allows for the possibility of administering alternative chemotherapy and potentially improving treatment outcomes.
An alternative approach to “personalized medicine” is to identify a set of gene expression signatures instead of genetic variants. In fact, a 14-gene expression predictive model was developed to predict early relapse in women with advanced ovarian cancer treated with platinum-taxol (26). However, evaluating gene expression in tumors of patients is cumbersome, variable and expensive. A candidate gene approach has also been attempted to identify genetic markers that predict ovarian cancer treatment outcomes but failed to produce unequivocal results (27). GWAS provides an unbiased approach to evaluate all genetic variation in the genome that may contribute to disease risks (28) and/or drug response (31). Therefore, we employed GWAS in a cell-based model to identify germline variants with clinical applicability. The in vitro model system could be applied to other toxic drugs that would be difficult, if not impossible, to study in non-diseased patients. GWAS identified SNPs in this study that would not have been likely “candidate SNPs” based on the drug’s pharmacokinetics, pharmacodynamics or mechanism of action.
In LCLs, the rs1649942 SNP is within the neuregulin 3 (NRG3) gene, which has been shown to activate the tyrosine phosphorylation of its cognate receptor, ERBB4, and is thought to influence neuroblast proliferation, migration and differentiation by signalling through ERBB4. SNPs within the NRG3 gene have been implicated in heart failure mortality (34); schizophrenia (35); and ADHD (36). However, the NRG3 gene itself was not well represented on the exon array. In an effort to interrogate the genomic region more closely, we used whole-genome sequence data from the 1000 Genomes project and identified 4 additional SNPs in moderate LD (r² > 0.70) with our SNP in CEU; none showed a more significant association with carboplatin IC50. Furthermore, we explored the possibility that the SNP distantly regulates other genes in the genome to achieve its effect. Indeed, rs1649942 genotype is strongly associated with more than 10 transcriptional expression traits, suggesting that it may be a genomic master regulator (23). A simple base-pair change at this locus may produce a cascade of expression signal changes, resulting in phenotypic variation (in our case, patient survival post carboplatin treatment).
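For readers less familiar with the LD statistic used to select neighbouring SNPs, the sketch below shows the standard r² between two biallelic loci computed from phased haplotypes, r² = D² / (p_A(1 − p_A) p_B(1 − p_B)) with D = p_AB − p_A p_B. It is illustrative only: the function name and the toy haplotypes are hypothetical and are not drawn from the 1000 Genomes data.

```python
def ld_r_squared(haplotypes):
    """LD r^2 between two biallelic SNPs.

    `haplotypes` is a list of (a, b) pairs, one per phased chromosome,
    where a and b are 1 if the chromosome carries the reference allele
    at the first and second SNP respectively, and 0 otherwise.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n               # allele frequency at SNP A
    p_b = sum(b for _, b in haplotypes) / n               # allele frequency at SNP B
    p_ab = sum(1 for a, b in haplotypes if a and b) / n   # joint haplotype frequency
    d = p_ab - p_a * p_b                                  # disequilibrium coefficient D
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))

# Hypothetical phased chromosomes (8 haplotypes) for two nearby SNPs.
toy_haplotypes = [(1, 1), (1, 1), (1, 0), (0, 0), (0, 0), (0, 0), (0, 1), (1, 1)]
toy_ld = ld_r_squared(toy_haplotypes)
print(f"LD r^2 = {toy_ld:.2f}")
```

A pair of SNPs would pass the threshold described above when this value exceeds 0.70. The function assumes both SNPs are polymorphic in the sample; a monomorphic SNP would give a zero denominator.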
Interestingly, we found that both the SNP (rs1649942) and one of its target genes (ALDH2, a mitochondrial isoform of aldehyde dehydrogenase) were also associated with sensitivity to cisplatin, another commonly used platinating agent, in our LCL model (10). A recent report showed that higher expression of ALDH1, a cytosolic isoform of the aldehyde dehydrogenase family, is associated with a higher response to chemotherapy and longer disease-free survival and OS in ovarian cancers (37). In agreement, we found that higher ALDH2 expression was significantly correlated with sensitivity to platinum-induced cytotoxicity in both LCLs and ovarian cancer cell lines. ALDH1 was not identified in our model due to the lack of expression of this gene in HapMap CEU samples.
Despite its many advantages over other approaches, GWAS may suffer from a high rate of false discovery. Therefore, NCI and NHGRI have jointly published a set of guidelines for designing a replication study following GWAS (15). Our study sought to adhere closely to these guidelines and encompassed not only an independent set of in vitro replication samples, but also in vivo clinical samples for validation. In our phase 1 in vivo study using 377 AOCS patients, the risk allele of rs1649942 was associated with a modest increased risk of disease progression and death following carboplatin-based chemotherapy, with an even greater genetic contribution for both PFS and OS among a subset of patients with optimally-debulked tumors. The reason for the greater effect in this subset is not entirely clear, but this result mirrors our previous observation that an association between PFS and the ABCB1 2677G>T/A SNP was only seen in women with minimal residual disease (18). Since clinical outcomes obtained from optimally-debulked patients may represent the ideal scenario in which to isolate effects due primarily to chemotherapy from the confounders associated with residual disease, the effect of rs1649942 in these particular patients is of interest. It should be noted, however, that this result was based on small numbers of patients. There were no significant associations observed between rs1649942 genotype and factors related to prognosis in ovarian cancer, including patient age, stage, histology and residual disease, suggesting that the observed genetic effect on patient survival is likely to be related to its effect on chemotherapeutic response rather than to disease characteristics.
However, we did not replicate the phase 1 findings in our phase 2 analysis, which differed from the phase 1 analyses in several ways. In phase 2 we categorized residual disease as ‘nil’ vs ‘any’ as opposed to ≤ or > 1 cm so that we could include patients from sites which did not use this coding, and adjusted for grade and histology (in addition to stage, which we used for the adjusted analyses in phase 1); we also included patients (n=550) presumed, rather than known, to have had standard doses of paclitaxel (175 or 135 mg/m²) and carboplatin (area under the curve, 5 or 6) in order to increase our power. However, when we re-analyzed the phase 1 data using the same analytical method as for phase 2, we obtained similar significant associations with rs1649942. When we restricted the phase 2 analysis to patients with known doses as in phase 1 (n=776), we still found no association with this SNP. While the phase 1 estimates were based on small numbers and may represent a false discovery, it is possible that the failure to observe an association with the rs1649942 SNP in the phase 2 analysis reflects differences in clinical definitions across studies that cannot be adequately accounted for in the analysis, and low power to detect an association in the 776 patients whose treatment details were known. For example, the criteria used to define disease progression varied across studies and in some cohorts (TCGA) no consistent definition of progression was used. Time to progression was the clinical outcome measure used in this study, as the measurement of ‘response’ to primary chemotherapy in ovarian cancer is confounded by the fact that chemotherapy is combined with debulking surgery. A fall in CA125 cannot distinguish between the effects of chemotherapy and the effects of surgery, and imaging can only be used to assess response in patients with measurable disease remaining at the end of surgery (i.e., not in optimally debulked patients) (38). The clinical validation studies used self-reported ethnicity to identify non-Hispanic whites. Since the cell-based finding on rs1649942 is specific to Caucasians, the use of self-reported ancestry is likely to include patients with differing ethnic backgrounds and potentially mask the Caucasian-specific association. This is particularly true in the second phase validation study, since patients were recruited from various sites across the world. Using ancestry informative markers to define ethnicity could potentially influence the final findings (39).
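To make the point about statistical power concrete, the sketch below applies Schoenfeld's textbook approximation for the number of events (progressions or deaths) needed to detect a given hazard ratio between two groups in a proportional-hazards analysis. This is not the power calculation performed in this study: the function name, the dichotomization into risk-allele carriers versus non-carriers, and the example hazard ratios and carrier fraction are all assumptions made for illustration.

```python
from math import ceil, log
from statistics import NormalDist

def required_events(hazard_ratio, carrier_fraction, alpha=0.05, power=0.80):
    """Approximate number of events needed for a two-sided log-rank/Cox test.

    Schoenfeld approximation:
        d = (z_{1-alpha/2} + z_{power})^2 / (p * (1 - p) * ln(HR)^2)
    where p is the fraction of patients in the carrier group.
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    z_power = std_normal.inv_cdf(power)
    p = carrier_fraction
    return ceil((z_alpha + z_power) ** 2 / (p * (1 - p) * log(hazard_ratio) ** 2))

# Hypothetical scenarios: a moderate and a modest effect, half the cohort carriers.
events_moderate = required_events(hazard_ratio=1.5, carrier_fraction=0.5)
events_modest = required_events(hazard_ratio=1.2, carrier_fraction=0.5)
print(events_moderate, events_modest)
```

Under these assumptions, a modest hazard ratio requires several times as many events as a moderate one, which illustrates why a true but small effect could go undetected in a restricted subset of patients.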
Interestingly, we have recently identified a suggestive association between this SNP and therapy-induced decreases in platelets in 60 head and neck cancer patients who underwent carboplatin-based induction therapy (11). Given this and the results of our in vitro experiments, we cannot discount the possibility that this SNP may influence chemotherapy outcomes in ovarian cancer patients. Further analysis is warranted in larger, well-characterized clinical samples.
Given the obstacles to performing large, replicable pharmacogenetic studies aimed at discovering novel variants, and the clinical confounders of such studies, the cell-based model we developed to identify genetic variants that may predict PFS and OS in ovarian cancer patients has important implications in the field of oncology. We acknowledge the limitations of using a cell-based model for pharmacogenomic discovery, but the advantages compared to attempting to perform GWAS in a clinical trial are enormous, provided large cohorts of well-characterized patients are available for validation. Cell-based models are much less expensive, many of the environmental confounders can be controlled, and the effects of a single chemotherapeutic agent can be studied. Therefore, our cell-based approach provides a useful alternative tool aimed at identifying clinically relevant genotype-phenotype relationships through a genome-wide approach.