Between 20% and 40% of patients with localised renal cell carcinoma (RCC) may experience recurrence after curative surgery. Few miRNA predictors of clear cell RCC (ccRCC) recurrence have been identified.
Through a multi-phase study design, we analysed miRNAs in tissues obtained from 203 ccRCC patients. Paired t-tests were used for tumour–normal comparisons, and Cox regression models were used to compute hazard ratios (HRs) and corresponding 95% confidence intervals (CIs).
A 17-miRNA signature was identified that can concordantly classify >95% of tumour/adjacent normal samples. Significant enrichment was found, as 6 out of 17 miRNAs were associated with obesity (binomial probability=0.001). Decreased levels of miR-204-5p and miR-139-5p were each associated with an approximately three-fold increased risk of recurrence (P<0.01). A risk score was generated based on the expression of miR-204-5p and miR-139-5p, and the trend test was significant in both the discovery and validation sets (P for trend<0.05). A striking reduction in median survival time (MST) was observed for patients with a high-risk score (high vs low: discovery, 9.4 vs >97.7 months; validation, 20.8 vs >70.3 months). Expression of miR-204-5p was also associated with body mass index (β=5.64, P<0.001). Significant inverse correlations were observed and validated between miR-204-5p and 13 obesity-related genes (r<0, P<0.01).
We identified 17 miRNAs dysregulated in ccRCC tissues and showed that low expression of miR-204-5p and miR-139-5p was associated with a higher risk of recurrence. The link between miR-204-5p and ccRCC recurrence may be partially mediated by regulation of the expression of targeted obesity-related genes.
Kidney cancer remains one of the 10 most common cancers in both males and females in the United States (Siegel et al, 2016). Over 80% of kidney cancers are renal cell carcinoma (RCC), and the major histological type is clear cell RCC (ccRCC). Smoking, history of hypertension, and obesity are the three established risk factors for RCC (Chow et al, 2010). It has been estimated that ~20–40% of patients with localised disease may experience recurrence after curative therapy (Janzen et al, 2003), and the five-year survival rate is only 11.7% for patients with distant metastasis (Howlader et al, 2013). Previous studies have reported that clinical stage (Fergany et al, 2000; Leibovich et al, 2003; Russo et al, 2008), grade (Leibovich et al, 2003; Russo et al, 2008), tumour size (Leibovich et al, 2003), and microvascular invasion (VanPoppel et al, 1997) are associated with recurrence or progression. Interestingly, in contrast to its association with an increased risk of RCC development, being obese may reduce the risk of recurrence and increase overall survival (Yu et al, 1991; Parker et al, 2006). However, the complete set of predictors for RCC recurrence has not been well established, underscoring the need for new biomarkers to identify those at high risk.
MiRNAs are a class of noncoding RNAs of 18–25 nucleotides in length. MiRNAs bind to the 3′-untranslated region (UTR) of their target genes, typically resulting in gene silencing by triggering the degradation of the target mRNA. MiRNAs are promising biomarkers for cancer risk and prognosis because of their stability and functionality (Bartel, 2004). Multiple miRNA signatures for RCC tumorigenesis have been generated by quantitative real-time PCR, microarray, or next-generation sequencing (Jung et al, 2009; White et al, 2011; Osanto et al, 2012). Recently, The Cancer Genome Atlas (TCGA) has completed genome-wide profiling of miRNAs in samples of various types of cancer, including ccRCC. A signature based on levels of miR-10b, miR-21, miR-204, miR-30a, miR-143, and let-7a has been reported by TCGA, which may distinguish ccRCC patients with favourable survival (Cancer Genome Atlas Research N, 2013). High levels of miR-21 were correlated with worse survival in another study (Faragalla et al, 2012). Knowledge regarding the role of miRNAs in relation to RCC recurrence is still limited. One study reported that levels of miR-143, miR-26a, miR-145, miR-10b, miR-195, miR-126, and miR-127 were decreased in RCC patients with recurrence (Slaby et al, 2012). In another study, a 4-miRNA signature was generated consisting of miR-10b, miR-139-5p, miR-130b, and miR-199b-5p (Wu et al, 2012). However, the sample sizes of these previous studies have been relatively small.
In this study, we set out to accomplish two aims: first, to identify a miRNA signature for tumorigenesis through a multi-stage design to provide better insight into the miRNAs dysregulated in ccRCC; and second, to identify miRNAs associated with ccRCC recurrence and explore their associations with obesity.
A total of 135 ccRCC patients from MD Anderson were included in the present study. The details of the study population have been reported previously (Clague et al, 2009). There were no age, sex, ethnicity, or cancer-stage restrictions on recruitment. Patient demographic variables, tobacco and alcohol use history, weight and height to calculate body mass index (BMI), and medical history were obtained by in-person interview. For smoking history, a never smoker was defined as an individual who had never smoked or had smoked <100 cigarettes. Those subjects who had quit smoking >12 months before diagnosis/recruitment were considered former smokers. Clinical information was abstracted from patient medical records, including clinical stage, grade, comorbidities, tumour size, pathological stage, histology, treatment, and clinical outcomes. We also collected tumour and adjacent normal tissues from a subset of patients (Hildebrandt et al, 2012). We used 32 tumour–normal pairs and 32 unpaired tumour samples from 64 patients in the discovery phase. Tumour–normal pairs were collected from all 71 patients in the validation phase. An external independent data set consisting of 68 ccRCC tumour–normal pairs was downloaded from the TCGA portal (https://tcga-data.nci.nih.gov/tcga/).
The study flowchart is shown in Figure 1. For tumour–normal comparisons, three data sets consisting of data generated from 32, 68, and 71 pairs of ccRCC tumour and adjacent normal tissues were used. For testing the association with ccRCC recurrence, 64 (31 recurrence vs 33 non-recurrence) and 71 tumour samples (29 recurrence vs 42 non-recurrence) were used in the discovery and validation phases, respectively. In the validation phase, ccRCC patients without recurrence were frequency matched to patients with recurrence by age, sex, and clinical stage (I and II/III and IV). The study protocol was approved by the MD Anderson Cancer Center Institutional Review Board. All participating patients provided written informed consent.
Total RNA was extracted from 32 pairs of ccRCC tumour and adjacent normal tissues, and 32 unpaired ccRCC tumour tissues. Total RNA was isolated using the mirVana RNA Isolation Kit (Life Technologies, Gaithersburg, MD, USA). Labelled cDNA was synthesised, amplified, and purified from 300 ng total RNA using the TotalPrep RNA Amplification Kit (Life Technologies). Each sample was then hybridised to Human-6 v2 Expression BeadChips and read using the BeadStation 500 scanner (Illumina, San Diego, CA, USA) to generate gene expression data (Hildebrandt et al, 2012). MiRNAs were profiled using Illumina Sentrix Array Matrix 96-well MicroRNA Expression Profiling Assays. After quality control (excluding miRNAs with a call rate <80% and non-mature miRNAs), 300 mature human miRNAs remained for the final analyses.
The same protocols for total RNA isolation, reverse transcription, and PreAmp were used for the 71 pairs of ccRCC tumour and adjacent normal tissues in the validation.
Our study had two parts of validation. First, for tumour–normal comparisons, we selected candidate miRNAs that were significantly differentially expressed between the 32 paired ccRCC tumour–normal tissues in the discovery set (P<0.01) and were also significant in the same direction in the TCGA data set (>200 reads per million miRNA mapped, P<0.01). Second, for the association with recurrence, miRNAs that were significant in univariable Cox regression (P<0.01) were selected for analysis in the validation set.
Selected miRNAs were measured using the high-throughput BioMark HD Real-Time PCR system (Fluidigm, South San Francisco, CA, USA). Briefly, reverse transcription was carried out as described above using pooled miRNA primers with 150 ng of total input RNA. Pre-amplification was performed with pooled TaqMan assays. PCR products were cleaned up by enzymatic digestion with Exonuclease I (New England Biolabs, Ipswich, MA, USA). After pre-amplification, a 5 μl sample mixture was prepared for each sample. The IFC controller HX (Fluidigm) was used to distribute the sample mix and assay mix from the loading inlets into the 96.96 Dynamic array reaction chips. After loading, the chip was placed in the BioMark instrument for real-time PCR at 95 °C for 10 min, followed by 40 cycles at 95 °C for 15 s and 60 °C for 1 min. Data were analysed with Real-Time PCR Analysis Software in the BioMark instrument (Fluidigm).
Each PCR reaction was done in duplicate and the mean cycle threshold (Ct) was calculated. The small nuclear RNA U44 (RNU44) was used as the internal control for input normalisation. The mean Ct value of each sample was normalised to the averaged expression of U44 snRNA and then analysed with the $2^{-\Delta\Delta Ct}$ method. Data were set to missing and excluded from further analysis if any of the following criteria was met: (1) the duplicate Ct values varied by more than one cycle; (2) the sample had a Ct value >35; or (3) the miRNA had a detection rate <80%.
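The three exclusion criteria above can be illustrated with a short Python sketch. The function names are hypothetical, and two readings are assumptions rather than details stated in the text: "over one cycle variance" is taken to mean an absolute difference of more than one cycle between duplicates, and the Ct >35 rule is applied to either replicate.

```python
from typing import Optional, Sequence


def mean_ct(ct1: float, ct2: float,
            max_gap: float = 1.0, max_ct: float = 35.0) -> Optional[float]:
    """Return the mean of duplicate Ct values, or None if the pair fails QC."""
    # Criterion 1 (assumed reading): duplicates differ by more than one cycle.
    if abs(ct1 - ct2) > max_gap:
        return None
    # Criterion 2 (assumed reading): either replicate has Ct > 35.
    if ct1 > max_ct or ct2 > max_ct:
        return None
    return (ct1 + ct2) / 2.0


def passes_detection_rate(values: Sequence[Optional[float]],
                          min_rate: float = 0.80) -> bool:
    """Criterion 3: keep a miRNA only if it is detected in >= 80% of samples."""
    detected = sum(v is not None for v in values)
    return detected / len(values) >= min_rate
```

A miRNA whose per-sample values contain too many `None` entries would then be dropped before the association analyses.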
We compiled a list of obesity-related miRNAs (N=75) from an online database (Kunej et al, 2013). Sixty-nine of the 75 miRNAs were covered by our miRNA microarray. We compiled a list of obesity-related genes from four sources. (1) The bioinformatics tool Text-mined Hypertension, Obesity, and Diabetes Candidate Gene Database (Dai et al, 2013). We restricted genes to those reported by three or more studies and further reviewed these genes in detail. Two hundred and sixteen candidate genes were retained for further analysis. (2) An online database of obesity-related genes: Integratomics TIME (Kunej et al, 2013). (3) Obesity-relevant pathways selected from the BioCarta, KEGG, and Reactome pathway databases. The 15 pathways included in our list were: adipocytokine signalling pathway (KEGG, 67 genes), type II diabetes mellitus (KEGG, 47 genes), insulin signalling pathway (KEGG, 137 genes), insulin signalling pathway (BioCarta, 22 genes), IGF-1 signalling pathway (BioCarta, 21 genes), leptin pathway (BioCarta, 11 genes), PPAR signalling pathway (KEGG, 69 genes), metabolism of lipids and lipoproteins (Reactome, 478 genes), peroxisomal lipid metabolism (Reactome, 21 genes), P53-hypoxia pathway (BioCarta, 23 genes), mTOR signalling pathway (KEGG, 52 genes), mTOR pathway (BioCarta, 23 genes), energy-dependent regulation of mTOR by LKB1-AMPK (Reactome, 18 genes), oxidative phosphorylation (KEGG, 135 genes), and mitochondrial fatty acid beta oxidation (Reactome, 14 genes). (4) Genes close to GWAS-confirmed loci for BMI or obesity. We downloaded a list of 43 studies from the Catalog of Published Genome-Wide Association Studies (http://www.genome.gov/gwastudies/). The keywords used for searching were: BMI, obesity, obesity (early-onset extreme), obesity (extreme), BMI (interaction), adiposity, fat body mass, and weight. If a SNP was located in an intergenic region, we included the closest upstream and downstream genes. Loci with genome-wide significant SNPs ($P<5\times10^{-8}$) were eligible to be studied. One hundred genes were included after duplicates were removed. In total, we compiled a list of 2051 obesity-related genes.
We used a web-based analytical tool, ToppMiR (Wu et al, 2014) (https://toppmir.cchmc.org/), to search for putative targets of obesity-related miRNAs. It searches for evidence of putative miRNA target genes across multiple prediction tools, including mirSVR, miRTarbase, MsigDB, TargetScan, miRecords, PicTar, and PITA.
MiRNAs that had been detected in <80% of samples were excluded. Continuous host characteristics were analysed using Student's t-test, whereas categorical variables were analysed using Pearson's $\chi^2$-test. miRNA array data were quantile normalised (Bolstad et al, 2003) and log2 transformed in the discovery set; reads per million miRNA-mapped values were obtained from the TCGA miRNA-seq data, and Ct values generated on the Fluidigm platform were used in the validation set. For tumour–normal comparisons, a paired Student's t-test was performed to compare levels of normalised miRNAs between ccRCC tumour and adjacent normal pairs. Fold change was calculated as the normalised miRNA level in the tumour sample divided by the normalised miRNA level in the paired adjacent normal tissue for both the TCGA data and our miRNA microarray data. In the validation, fold change was calculated using the $2^{-\Delta\Delta Ct}$ method, where $2^{-\Delta\Delta Ct} = 2^{-[(Ct_{\text{miRNA, tumour}} - Ct_{\text{RNU44, tumour}}) - (Ct_{\text{miRNA, normal}} - Ct_{\text{RNU44, normal}})]}$. For discovery phase analysis of recurrence, the normalised levels of miRNAs were directly compared between the patients with and without recurrence. In the validation, relative quantification (RQ) was calculated in tumour tissues from the Ct value of each miRNA as $RQ = 2^{-[(Ct_{\text{miRNA}} - \overline{Ct}_{\text{miRNA}}) - (Ct_{\text{RNU44}} - \overline{Ct}_{\text{RNU44}})]}$, where the overbar denotes the mean Ct. Levels (RQ) of miRNAs were dichotomised according to the median level in patients without recurrence. The associations between the 300 miRNAs and obesity were tested in the adjacent normal tissues in the discovery phase, using univariable and multivariable linear regression models. The adjustment included age, sex, smoking status, and history of hypertension. Pearson's correlation was used to compare fold changes of the miRNAs in the tumorigenesis signature between data sets. Kaplan–Meier survival curves and Cox regression models were used to test dichotomised miRNAs in association with recurrence-free survival.
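The two Ct-based quantities described above can be sketched in Python as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, the means in the RQ formula are assumed to be taken over the samples passed in, and values equal to the median are assigned to the "high" group, which the text does not specify.

```python
from statistics import mean, median
from typing import List, Sequence


def fold_change_ddct(ct_mirna_tumour: float, ct_ref_tumour: float,
                     ct_mirna_normal: float, ct_ref_normal: float) -> float:
    """Tumour vs normal fold change by the 2^-ddCt method (reference: RNU44)."""
    ddct = (ct_mirna_tumour - ct_ref_tumour) - (ct_mirna_normal - ct_ref_normal)
    return 2.0 ** (-ddct)


def relative_quantities(ct_mirna: Sequence[float],
                        ct_ref: Sequence[float]) -> List[float]:
    """RQ per tumour sample: Ct centred on the mean Ct of miRNA and reference."""
    m_mean, r_mean = mean(ct_mirna), mean(ct_ref)
    return [2.0 ** -((m - m_mean) - (r - r_mean))
            for m, r in zip(ct_mirna, ct_ref)]


def dichotomise(rq: Sequence[float], recurred: Sequence[bool]) -> List[str]:
    """Split RQ at the median among patients WITHOUT recurrence."""
    cutoff = median(v for v, rec in zip(rq, recurred) if not rec)
    # Assumption: ties at the cutoff are labelled "high".
    return ["low" if v < cutoff else "high" for v in rq]
```

Because a higher Ct means less template, a miRNA whose normalised Ct is two cycles higher in tumour than in normal tissue yields a fold change of 0.25.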
To test whether the identified miRNAs were enriched for obesity-related miRNAs, we calculated the binomial probability of observing the exact number or more of obesity-related miRNAs in the present study, under the assumption that 23 miRNAs are obesity related (P<0.05 in the multivariable linear regression model and listed in the Integratomics TIME database; Supplementary Table S5) among the 300 miRNAs tested (probability of event is 23/300=0.077). Pearson's correlations were calculated for the predicted miRNA–mRNA pairs using our ccRCC miRNA and mRNA microarray data from the discovery phase (N=64 samples) and the TCGA ccRCC level 3 tumour miRNA and mRNA-seq data (N=236 samples; Pearson's correlation P<0.01 in both). To maintain consistency, the TCGA mRNA data were also log2 transformed. All tests were performed using STATA 13.0 (College Station, TX, USA) and R 3.01. The heatmaps were generated using GenePattern (v3.1, Broad Institute, Cambridge, MA, USA). A hierarchical clustering algorithm was used, with Pearson's correlation as the column distance measure and pairwise average linkage as the clustering method.
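The enrichment calculation described above is an upper-tail binomial probability. A minimal Python sketch, with a hypothetical function name, is given below; it only assumes that the per-miRNA probability is the stated 23/300 and that draws are treated as independent.

```python
from math import comb


def binomial_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p): k or more obesity-related miRNAs
    among n identified miRNAs, when each is obesity related with probability p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))


# Background rate stated in the text: 23 obesity-related miRNAs among 300 tested.
p_event = 23 / 300

# Six obesity-related miRNAs among the 17 in the tumorigenesis signature.
p_signature = binomial_tail(6, 17, p_event)
```

With these inputs the tail probability comes out at roughly 0.001, in line with the value reported for the 17-miRNA signature.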
The study flowchart is shown in Figure 1. Host characteristics of the 135 ccRCC patients recruited at MD Anderson are shown in Supplementary Table S1. The proportions of males (80.6% vs 39.4%), current smokers (22.6% vs 12.1%), and late-stage patients (III and IV, 71.0% vs 21.2%) were higher in the 31 patients with recurrence than in the 33 patients without recurrence in the discovery set. We frequency matched 29 patients with recurrence to 42 patients without by age, sex, and clinical stage in the validation set.
One hundred and seventy miRNAs were dysregulated in the 32 ccRCC tumour samples compared with their adjacent normal samples (P<0.01; Supplementary Table S2). The fold change ranged from 0.14 to 10.4, with miR-514a-3p being the most downregulated and miR-122a-5p being the most upregulated.
Significant miRNAs identified in the discovery phase were tested using data from 68 TCGA ccRCC tumour–normal pairs. Thirty three of 170 miRNAs were significantly dysregulated with the same up/downregulated trend in tumours (P<0.01; Supplementary Table S2). The fold change ranged from 0.03 to 13.6 with miR-141-3p being most downregulated and miR-155-5p being most upregulated. The scatter plot showed the correlations between fold changes of expressions of the 33 candidate miRNAs in the discovery and TCGA validation data sets (Supplementary Figure S1A). The correlation coefficient was 0.96 (P<0.001).
We further tested these 33 miRNAs in a third independent data set that consisted of 71 ccRCC tumour–normal pairs. The final signature consisted of 13 downregulated and 4 upregulated miRNAs across the three independent data sets (Figure 2; Supplementary Table S2). The correlation between fold changes of expression of these 17 validated miRNAs in the discovery and second validation data sets was highly significant (r=0.90, P<0.001; Supplementary Figure S1B). Clustering analysis using the 17-miRNA signature concordantly classified >95% of samples (Figure 2; Supplementary Figures S2 and S3). Interestingly, six of these miRNAs were identified as obesity related, which was significant in the enrichment test (6 out of 17, binomial probability=0.001; Supplementary Table S5).
Univariable Cox regression identified three upregulated and five downregulated miRNAs in patients with recurrence (dichotomised, P<0.01; Supplementary Table S3). The most significantly upregulated and downregulated miRNAs in patients with recurrence were miR-365a-3p (hazard ratio (HR)=0.32, 95% CI=0.15–0.68, P=0.003) and miR-204-5p (HR=3.15, 95% CI=1.47–6.76, P=0.003), respectively. The miRNA expression levels were also modelled as continuous predictors, which showed persistent associations.
Two of the eight miRNAs were significantly associated with ccRCC recurrence in the validation set (dichotomised, P<0.01; Table 1). In the univariable model, low levels of miR-204-5p were associated with a significant three-fold increase in the risk of ccRCC recurrence (HR=3.01, 95% CI=1.34–6.80, P=0.008). Similar results were found for miR-139-5p (HR=2.79, 95% CI=1.29–6.03, P=0.009). The associations remained significant and their strengths changed only slightly after adjustment for covariates (Supplementary Table S4). Consistent associations were observed when modelling miRNA levels in continuous form (Table 1; Supplementary Table S4). A striking reduction in recurrence-free median survival time, from >107.2 months to 46.2.0 months (log-rank P=0.002) and from 62.9 months to 25.0 months (log-rank P=0.006), was observed for low miR-204-5p levels in the discovery and validation sets, respectively. Similarly, a decrease from >113.5 months to 39.5 months (log-rank P=0.002) and from 70.3 months to 25.0 months (log-rank P=0.006) was observed for low miR-139-5p levels in the discovery and validation sets, respectively (Supplementary Figure S4). A risk score derived from miR-139-5p and miR-204-5p was able to stratify our study population into high-risk, intermediate-risk, and low-risk groups (Figure 3). The increasing risk of recurrence with a higher risk score was consistently observed in the discovery and validation sets in the multivariable Cox regressions (P for trend<0.05; Table 2).
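The text does not state how the risk score was constructed. One simple construction that yields three groups from two dichotomised markers, assumed here purely for illustration and not confirmed by the source, counts how many of the two miRNAs fall in the low-expression (unfavourable) category. The function names are hypothetical.

```python
def risk_score(low_mir204: bool, low_mir139: bool) -> int:
    """Assumed score: number of unfavourable (low-expression) miRNAs, 0 to 2."""
    return int(low_mir204) + int(low_mir139)


def risk_group(score: int) -> str:
    """Map the assumed 0-2 score onto the three groups named in the text."""
    return {0: "low", 1: "intermediate", 2: "high"}[score]
```

Under this assumption, a patient with low levels of both miR-204-5p and miR-139-5p would fall in the high-risk group, and a trend test would treat the score as an ordered variable.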
Obesity-related miR-204-5p (β=5.64, P<0.001; Supplementary Table S5) was associated with both ccRCC tumorigenesis (Figure 2) and recurrence. The test of enrichment was not significant (one out of two, binomial probability=0.148). Of 2051 obesity-related genes (Supplementary Table S6), 406 were predicted to be regulated by miR-204-5p. We further tested the correlation between miR-204-5p expression and gene expression levels for each of the 395 genes that were available in the same samples (Supplementary Table S7). Eighteen pairs exhibited significant inverse Pearson correlations (P<0.01), with receptor tyrosine kinase-like orphan receptor 2 (ROR2) being the most significant (r=−0.53, $P=1.46\times10^{-6}$). We conducted the same correlation analysis for these genes in the TCGA data, and 13 miRNA–gene pairs remained highly statistically significant (Table 3). The P-value reached $10^{-22}$ for insulin-like growth factor 2 (IGF2) mRNA-binding protein 2 (IGF2BP2), which was the most significant in the TCGA data set.
In the present study, several miRNAs identified were obesity related. Most interestingly, obesity-related miR-204-5p was associated with both ccRCC tumorigenesis and recurrence. Furthermore, miR-204-5p was consistently inversely correlated with 13 obesity-related genes in two independent data sets.
Many miRNAs in our signature for tumorigenesis overlap with findings from previous studies. For example, overexpressed miR-34a, miR-155, and miR-210, and under-expressed miR-10a/b, miR-30a, miR-141, miR-200a/b/c, miR-204, miR-500a, and miR-532 were reported by Jung et al (2009) and Juan et al (2010). Our findings support a role for these miRNAs in the development of the disease. Moreover, the highly significant correlation between fold changes of expression of the validated dysregulated miRNAs across different data sets demonstrates the robustness of our findings. By incorporating a three-stage design that strengthens our findings, we further refined the miRNA profiles of ccRCC tumorigenesis.
Both miR-204-5p and miR-139-5p were identified as the most influential miRNAs for ccRCC pathogenesis in a network analysis (Butz et al, 2014). Interestingly, low expression of miR-204 was the key feature (together with high levels of miR-21) of discriminatory miRNA group 2 identified by the original TCGA analysis, in which the patients had the worst prognosis compared with other groups (Cancer Genome Atlas Research N, 2013). However, few studies have focused on the role of these miRNAs in the development of ccRCC recurrence. Our observation of an association between decreased levels of miR-204-5p and shorter RCC recurrence-free survival is novel. In a previous study, lower levels of miR-204-5p were observed in RCC patients who progressed to metastatic disease compared with those without progression. However, it is not clear what covariates were adjusted for in their analyses (Gowrishankar et al, 2014). It has been suggested that miR-204-5p may function as a tumour suppressor. Higher expression of miR-204-5p has been observed in breast and gastric cancer tissues obtained from patients free from disease metastasis (Li et al, 2014). In vitro studies have shown that overexpression of miR-204-5p could markedly suppress cell migration and invasion in different cell lines (Chung et al, 2012; Qiu et al, 2013; Ying et al, 2013). Previous studies have also reported a tumour-suppressive function of miR-139-5p in cancer recurrence or metastasis. In one study, reduced levels of miR-139-5p were found in tissues obtained from ccRCC patients with recurrence after nephrectomy (Slaby et al, 2012). However, miR-139-5p was not selected for validation. In other studies, it was consistently downregulated in ccRCC metastatic samples and in oesophageal squamous cell carcinoma tissues obtained from patients with lymph node metastasis (Wu et al, 2012; Liu et al, 2013). miR-139-5p was also found to be associated with ccRCC survival in some studies but not all (Osanto et al, 2012; Wu et al, 2012). Our findings further support the role of miR-139-5p as a tumour suppressor in cancers, including ccRCC, although additional studies are required to validate our findings.
Interestingly, several miRNAs identified in this study were obesity related. One caveat is that our study was not ideal to identify or confirm whether the associated miRNAs are obesity related. The obesity-related miRNAs were defined based on association tests in our own samples with relatively small sample size and large number of comparisons. Therefore, we can only make a suggestive inference. Studies using samples collected from healthy subjects and with larger sample size would be more appropriate to achieve this goal. However, we further used a database to consolidate our observations.
Obesity has been related to later recurrence and favourable survival in RCC patients (Yu et al, 1991; Kamat et al, 2004; Parker et al, 2006). Studies showed that, compared with those with normal BMI, obese RCC patients had a >50% reduced risk of recurrence and longer survival (Yu et al, 1991; Parker et al, 2006). However, the exact mechanisms involved in these processes have been elusive. Therefore, our exploratory analysis of miR-204-5p and obesity deserves further investigation. Overexpressed miR-204-5p was shown to promote adipocyte differentiation and increase lipid droplet accumulation in mesenchymal stem cell lines (Huang et al, 2010; Alexander et al, 2011). In this study, the expression of miR-204-5p was also positively associated with BMI. Thus, miR-204-5p may contribute to ccRCC recurrence through its link with obesity. In addition, we observed several putative target genes regulated by miR-204-5p, including IGF2BP2. IGF2BP2 binds to the 5′-UTR of the IGF2 mRNA transcripts and subsequently represses the translational process (Nielsen et al, 1999). Increasing evidence supports its association with obesity and cancer risk (Sandhu et al, 2003; Livingstone, 2013). Several GWAS have also linked a common variant located in IGF2BP2 (rs4402960) to risk of type 2 diabetes (Saxena et al, 2007; Scott et al, 2007; Zeggini et al, 2007).
Another gene of interest is ADAM12, which is also involved in the IGF receptor signalling pathway (Kveiborg et al, 2008). Overexpression of ADAM12 has been reported in various cancers (Wewer et al, 2005). Its protease and adhesion activities, stimulation of cell proliferation, and increased resistance to apoptosis may contribute to the progression of tumours (Kveiborg et al, 2005, 2008). Therefore, we hypothesised that miR-204-5p may serve as an intermediate between obesity and recurrence, potentially through IGF signalling (Supplementary Figure S5).
Our study has several strengths. The sample size of the present study is relatively large in comparison to other studies. Importantly, independent data sets were used to validate our findings. The miRNAs remained significant in the multivariable model, which indicates their independent prognostic value. In addition, to increase the likelihood that the predicted miRNA–mRNA relationships are plausible, we used a prediction tool that integrates multiple prediction algorithms and evaluated the correlations in two independent data sets. We also recognise several limitations of our study. Although the findings were validated in independent internal/external data sets, the possibility of false positives still exists. In addition, the correlation tests were exploratory, and laboratory-based experiments are required to validate the putative miRNA–mRNA relationships. Another limitation is that our data set is not ideal for investigating the relationship between miRNAs and obesity. Finally, the curated obesity-related gene set includes genes with various biological functions, so the genes are not exclusively 'obesity related'. However, no well-defined obesity-related gene set could be found in any commonly used database, including BioCarta, KEGG, Reactome, and GO.
Our findings may have clinical implications for predicting which ccRCC patients are at higher risk of recurrence and may provide new insights into the mechanisms involved in the link between obesity and ccRCC recurrence. However, more efforts are warranted to establish the exact biological mechanisms for the interplay of obesity, miRNAs and their targeted genes, and ccRCC recurrence.
This work was supported in part by the National Institutes of Health (grant R01 CA170298), and the Center for Translational and Public Health Genomics, Duncan Family Institute for Cancer Prevention, The University of Texas MD Anderson Cancer Center.
The authors declare no conflict of interest.
Supplementary Information accompanies this paper on British Journal of Cancer website (http://www.nature.com/bjc)
This work is published under the standard license to publish agreement. After 12 months the work will become freely available and the license terms will switch to a Creative Commons Attribution-NonCommercial-Share Alike 4.0 Unported License.