Related Articles

1.  Gene Expression Classification of Colon Cancer into Molecular Subtypes: Characterization, Validation, and Prognostic Value 
PLoS Medicine  2013;10(5):e1001453.
Colon cancer (CC) pathological staging fails to accurately predict recurrence, and to date, no gene expression signature has proven reliable for prognosis stratification in clinical practice, perhaps because CC is a heterogeneous disease. The aim of this study was to establish a comprehensive molecular classification of CC based on mRNA expression profile analyses.
Methods and Findings
Fresh-frozen primary tumor samples from a large multicenter cohort of 750 patients with stage I to IV CC who underwent surgery between 1987 and 2007 in seven centers were characterized for common DNA alterations, including BRAF, KRAS, and TP53 mutations, CpG island methylator phenotype, mismatch repair status, and chromosomal instability status, and were screened with whole genome and transcriptome arrays. 566 samples fulfilled RNA quality requirements. Unsupervised consensus hierarchical clustering applied to gene expression data from a discovery subset of 443 CC samples identified six molecular subtypes. These subtypes were associated with distinct clinicopathological characteristics, molecular alterations, specific enrichments of supervised gene expression signatures (stem cell phenotype–like, normal-like, serrated CC phenotype–like), and deregulated signaling pathways. Based on their main biological characteristics, we distinguished a deficient mismatch repair subtype, a KRAS mutant subtype, a cancer stem cell subtype, and three chromosomal instability subtypes, including one associated with down-regulated immune pathways, one with up-regulation of the Wnt pathway, and one displaying a normal-like gene expression profile. The classification was validated in the remaining 123 samples plus an independent set of 1,058 CC samples, including eight public datasets. Furthermore, prognosis was analyzed in the subset of stage II–III CC samples. The subtypes C4 and C6, but not the subtypes C1, C2, C3, and C5, were independently associated with shorter relapse-free survival, even after adjusting for age, sex, stage, and the emerging prognostic classifier Oncotype DX Colon Cancer Assay recurrence score (hazard ratio 1.5, 95% CI 1.1–2.1, p = 0.0097). However, a limitation of this study is that information on tumor grade and number of nodes examined was not available.
We describe the first, to our knowledge, robust transcriptome-based classification of CC that improves the current disease stratification based on clinicopathological variables and common DNA markers. The biological relevance of these subtypes is illustrated by significant differences in prognosis. This analysis provides possibilities for improving prognostic models and therapeutic strategies. In conclusion, we report a new classification of CC into six molecular subtypes that arise through distinct biological pathways.
Editors' Summary
Cancer of the large bowel (colorectal cancer) is the third most common cancer in men and the second most common cancer in women worldwide. Despite recent advances in the screening, diagnosis, and treatment of colorectal cancer, an estimated 608,000 people die every year from this form of cancer, 8% of all cancer deaths. The prognosis and treatment options for colorectal cancer depend on five pathological stages (0–IV), each of which has a different treatment option and five-year survival rate, so it is important that the stage is correctly identified. Unfortunately, pathological staging fails to accurately predict recurrence (relapse) in patients undergoing surgery for localized colorectal cancer, which is a concern, as 10%–20% of patients with stage II and 30%–40% of those with stage III colorectal cancer develop recurrence.
Why Was This Study Done?
Previous studies have investigated whether there are any possible gene expression profiles (identified through microarray techniques) that can help predict prognosis of colorectal cancer, but so far, there have been no firm conclusions that can aid clinical practice. In this study, the researchers used genetic information from a French multicenter study to identify a standard, reproducible molecular classification based on gene expression analysis of colorectal cancer. The authors also assessed whether there were any associations between the identified molecular subtypes and clinical and pathological factors, common DNA alterations, and prognosis.
What Did the Researchers Do and Find?
The researchers used genetic information from a cohort of 750 patients with stage I to IV colorectal cancer who underwent surgery between 1987 and 2007 in seven centers in France. The researchers identified relevant clinical and pathological staging information for each patient from the medical records and calculated recurrence-free survival (the time from surgery to the first recurrence) for patients with stage II or III disease. In the genetic analysis, 566 tumor samples were suitable—443 were used in a discovery set, to create the classification, and the remainder were used in a validation set, to test the classification. The researchers also used information from eight public datasets to validate their findings.
Using these methods, the researchers classified the colon cancer samples into six molecular subtypes (based on gene expression data) and, on further analysis and validation, were able to distinguish the main biological characteristics and deregulated pathways associated with each subtype. Importantly, the researchers found that these six subtypes were associated with distinct clinical and pathological characteristics, molecular alterations, specific gene expression signatures, and deregulated signaling pathways. In the prognostic analysis based on recurrence-free survival, the researchers found that patients whose tumors were classified in one of two clusters (C4 and C6) had poorer recurrence-free survival than the other patients.
What Do These Findings Mean?
These findings suggest that it is possible to classify colorectal cancer into six robust molecular subtypes that might help identify new prognostic subgroups and could provide a basis for developing robust prognostic genetic signatures for stage II and III colorectal cancer and for identifying specific markers for the different subtypes that might be targets for future drug development. However, as this study was retrospective and did not include some known predictors of colorectal cancer prognosis, such as tumor grade and number of nodes examined, the significance and robustness of the prognostic classification requires further confirmation with large prospective patient cohorts.
Additional Information
The American Cancer Society provides information about colorectal cancer and also about how colorectal cancer is staged
The US National Cancer Institute also provides information on colon and rectal cancer and colon cancer stages
PMCID: PMC3660251  PMID: 23700391
2.  Integrated metabolome and transcriptome analysis of the NCI60 dataset 
BMC Bioinformatics  2011;12(Suppl 1):S36.
Metabolite profiles can be used for identifying molecular signatures and mechanisms underlying diseases since they reflect the outcome of complex upstream genomic, transcriptomic, proteomic and environmental events. The scarcity of publicly accessible large scale metabolome datasets related to human disease has been a major obstacle for assessing the potential of metabolites as biomarkers as well as understanding the molecular events underlying disease-related metabolic changes. The availability of metabolite and gene expression profiles for the NCI-60 cell lines offers the possibility of identifying significant metabolome and transcriptome features and discovering unique molecular processes related to different cancer types.
We utilized a combination of analytical methods in the R statistical package to evaluate metabolic features associated with cancer cell lines from different tissue origins, identify metabolite-gene correlations and detect outlier cell lines based on metabolome and transcriptome data. Statistical analysis results are integrated with metabolic pathway annotations as well as COSMIC and Tumorscape databases to explore associated molecular mechanisms.
Our analysis reveals that although the NCI-60 metabolome dataset is quite noisy compared with microarray-based transcriptome data, it does contain tissue origin specific signatures. We also identified biologically meaningful gene-metabolite associations. Most remarkably, several abnormal gene-metabolite relationships identified by our approach can be directly linked to known gene mutations and copy number variations in the corresponding cell lines.
Our results suggest that integrative metabolome and transcriptome analysis is a powerful method for understanding the molecular machinery underlying various pathophysiological processes. We expect the availability of large scale metabolome data in the coming years will significantly promote the discovery of novel biomarkers, which will in turn improve the understanding of the molecular mechanisms underlying diseases.
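The metabolite-gene correlation step mentioned above can be illustrated with a small sketch. The abstract states that the authors worked in R and does not say which correlation measure they used, so the Python code below is only an assumption-laden illustration: it uses Pearson correlation, and the names `pearson`, `correlate_all`, `met_A`, `gene_X` and all numeric values are hypothetical toy data, not NCI-60 measurements.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


def correlate_all(metabolites, genes):
    """Correlate every metabolite profile with every gene profile.

    Both arguments map a name to a list of values measured over the
    same cell lines, in the same order.
    """
    return {
        (m_name, g_name): pearson(m_values, g_values)
        for m_name, m_values in metabolites.items()
        for g_name, g_values in genes.items()
    }


# Hypothetical toy profiles over four cell lines (not real data).
metabolites = {"met_A": [1.0, 2.0, 3.0, 4.0], "met_B": [4.0, 1.0, 3.0, 2.0]}
genes = {"gene_X": [2.0, 4.0, 6.0, 8.0], "gene_Y": [8.0, 6.0, 4.0, 2.0]}

pairs = correlate_all(metabolites, genes)
for (m_name, g_name), r in sorted(pairs.items()):
    print(f"{m_name} vs {g_name}: r = {r:+.2f}")
```

On real data one would also correct for multiple testing across all metabolite-gene pairs; that step is omitted here.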
PMCID: PMC3044292  PMID: 21342567
3.  Insight in Genome-Wide Association of Metabolite Quantitative Traits by Exome Sequence Analyses 
PLoS Genetics  2015;11(1):e1004835.
Metabolite quantitative traits carry great promise for epidemiological studies, and their genetic background has been addressed using Genome-Wide Association Studies (GWAS). Thus far, the role of less common variants has not been exhaustively studied. Here, we set out a GWAS for metabolite quantitative traits in serum, followed by exome sequence analysis to zoom in on putative causal variants in the associated genes. 1H Nuclear Magnetic Resonance (1H-NMR) spectroscopy experiments yielded successful quantification of 42 unique metabolites in 2,482 individuals from The Erasmus Rucphen Family (ERF) study. Heritability of metabolites was estimated by SOLAR. GWAS was performed by linear mixed models, using HapMap imputations. Based on physical vicinity and pathway analyses, candidate genes were screened for coding region variation using exome sequence data. Heritability estimates for metabolites ranged between 10% and 52%. GWAS replicated three known loci in the metabolome wide significance: CPS1 with glycine (P-value = 1.27×10^−32), PRODH with proline (P-value = 1.11×10^−19), SLC16A9 with carnitine level (P-value = 4.81×10^−14) and uncovered a novel association between DMGDH and dimethyl-glycine (P-value = 1.65×10^−19) level. In addition, we found three novel, suggestively significant loci: TNP1 with pyruvate (P-value = 1.26×10^−8), KCNJ16 with 3-hydroxybutyrate (P-value = 1.65×10^−8) and 2p12 locus with valine (P-value = 3.49×10^−8). Exome sequence analysis identified potentially causal coding and regulatory variants located in the genes CPS1, KCNJ2 and PRODH, and revealed allelic heterogeneity for CPS1 and PRODH. Combined GWAS and exome analyses of metabolites detected by high-resolution 1H-NMR is a robust approach to uncover metabolite quantitative trait loci (mQTL), and the likely causative variants in these loci. It is anticipated that insight in the genetics of intermediate phenotypes will provide additional insight into the genetics of complex traits.
Author Summary
Human metabolic individuality is under strict control of genetic and environmental factors. In our study, we aimed to find the genetic determinants of circulating molecules in sera of a large set of individuals representing the general population. First, we performed a hypothesis-free genome wide screen in this population to identify genetic regions of interest. Our study confirmed four known gene metabolite connections, but also pointed to four novel ones. Genome-wide screens enriched for common intergenic variants may miss causal genetic variations directly changing the protein sequence. To investigate this further, we zoomed into regions of interest and tested whether the association signals obtained in the first stage were direct, or whether they represent causal variations, which were not captured in the initial panel. These subsequent tests showed that protein coding and regulatory variations are involved in metabolite levels. For two genomic regions we also found that genes harbour more than one causal variant influencing metabolite levels independent of each other. We also observed a strong connection between markers of cardio-metabolic health and metabolites. Taken together, our novel loci are of interest for further research to investigate the causal relation to, for instance, type 2 diabetes and cardiovascular disease.
PMCID: PMC4287344  PMID: 25569235
4.  Exome Sequencing and Genetic Testing for MODY 
PLoS ONE  2012;7(5):e38050.
Genetic testing for monogenic diabetes is important for patient care. Given the extensive genetic and clinical heterogeneity of diabetes, exome sequencing might provide additional diagnostic potential when standard Sanger sequencing-based diagnostics is inconclusive.
The aim of the study was to examine the performance of exome sequencing for a molecular diagnosis of MODY in patients who have undergone conventional diagnostic sequencing of candidate genes with negative results.
Research Design and Methods
We performed exome enrichment followed by high-throughput sequencing in nine patients with suspected MODY. They were Sanger sequencing-negative for mutations in the HNF1A, HNF4A, GCK, HNF1B and INS genes. We excluded common, non-coding and synonymous gene variants, and performed in-depth analysis on filtered sequence variants in a pre-defined set of 111 genes implicated in glucose metabolism.
On average, we obtained 45× median coverage of the entire targeted exome and found 199 rare coding variants per individual. We identified 0–4 rare non-synonymous and nonsense variants per individual in our a priori list of 111 candidate genes. Three of the variants were considered pathogenic (in ABCC8, HNF4A and PPARG, respectively), thus exome sequencing led to a genetic diagnosis in at least three of the nine patients. Approximately 91% of known heterozygous SNPs in the target exomes were detected, but we also found low coverage in some key diabetes genes using our current exome sequencing approach. Novel variants in the genes ARAP1, GLIS3, MADD, NOTCH2 and WFS1 need further investigation to reveal their possible role in diabetes.
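The filtering strategy described in this abstract (drop common, non-coding and synonymous variants, then restrict to a predefined candidate gene list) can be sketched roughly as follows. This is not the authors' pipeline: the `Variant` record, the consequence labels, the 1% frequency cutoff and the example variants are all assumptions made for illustration, since the abstract does not give a cutoff or a data format. Only three of the 111 candidate genes are listed here, namely those named in the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    gene: str
    consequence: str        # hypothetical labels: "missense", "nonsense", "synonymous", "intronic"
    population_freq: float  # allele frequency in some reference population


# Assumption: the abstract gives no frequency threshold; 1% is illustrative only.
RARE_MAX_FREQ = 0.01
# "Non-synonymous and nonsense" variants, using the hypothetical labels above.
KEPT_CONSEQUENCES = {"missense", "nonsense"}


def filter_variants(variants, candidate_genes, max_freq=RARE_MAX_FREQ):
    """Keep rare, protein-altering variants that fall in the candidate genes."""
    return [
        v
        for v in variants
        if v.population_freq < max_freq
        and v.consequence in KEPT_CONSEQUENCES
        and v.gene in candidate_genes
    ]


# Three of the 111 candidate genes (the ones named in the abstract).
candidate_genes = {"ABCC8", "HNF4A", "PPARG"}

# Invented example calls, not data from the study.
variants = [
    Variant("ABCC8", "missense", 0.0),       # kept
    Variant("HNF4A", "synonymous", 0.0),     # dropped: synonymous
    Variant("PPARG", "missense", 0.20),      # dropped: common
    Variant("GENE_Z", "missense", 0.001),    # dropped: hypothetical gene outside the list
    Variant("HNF4A", "nonsense", 0.0005),    # kept
]

kept = filter_variants(variants, candidate_genes)
for v in kept:
    print(v.gene, v.consequence, v.population_freq)
```

Keeping the cutoff and the consequence set as named constants makes it easy to see how sensitive the shortlist is to each choice.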
Our results demonstrate that exome sequencing can improve molecular diagnostics of MODY when used as a complement to Sanger sequencing. However, improvements will be needed, especially concerning coverage, before the full potential of exome sequencing can be realized.
PMCID: PMC3360646  PMID: 22662265
5.  An expression module of WIPF1-coexpressed genes identifies patients with favorable prognosis in three tumor types 
Wiskott–Aldrich syndrome (WAS) predisposes patients to leukemia and lymphoma. WAS is caused by mutations in the protein WASP which impair its interaction with the WIPF1 protein. Here, we aim to identify a module of WIPF1-coexpressed genes and to assess its use as a prognostic signature for colorectal cancer, glioma, and breast cancer patients. Two public colorectal cancer microarray data sets were used for discovery and validation of the WIPF1 co-expression module. Based on expression of the WIPF1 signature, we classified more than 400 additional tumors with microarray data from our own experiments or from publicly available data sets. This allowed us to separate patient populations for colorectal cancers, breast cancers, and gliomas, for which clinical characteristics such as survival times and times to relapse were analyzed. Groups of colorectal cancer, breast cancer, and glioma patients with low expression of the WIPF1 co-expression module generally had a favorable prognosis. In addition, the majority of WIPF1 signature genes are individually correlated with disease outcome in different studies. Literature gene network analysis revealed that known direct transcriptional targets of c-myc, ESR1 and p53 are enriched among WIPF1 co-expressed genes. The mean expression profile of WIPF1 signature genes is correlated with the profile of a proliferation signature. The WIPF1 signature is the first microarray-based prognostic expression signature primarily developed for colorectal cancer that is instrumental in other tumor types: low expression of the WIPF1 module is associated with better prognosis.
Electronic supplementary material
The online version of this article (doi:10.1007/s00109-009-0467-y) contains supplementary material, which is available to authorized users.
PMCID: PMC2688022  PMID: 19399471
Colorectal cancer; WIPF1; Prognosis; Expression signature; Microarray
6.  Genome-Wide Association Study of Metabolic Traits Reveals Novel Gene-Metabolite-Disease Links 
PLoS Genetics  2014;10(2):e1004132.
Metabolic traits are molecular phenotypes that can drive clinical phenotypes and may predict disease progression. Here, we report results from a metabolome- and genome-wide association study on 1H-NMR urine metabolic profiles. The study was conducted within an untargeted approach, employing a novel method for compound identification. From our discovery cohort of 835 Caucasian individuals who participated in the CoLaus study, we identified 139 suggestively significant (P<5×10^−8) and independent associations between single nucleotide polymorphisms (SNPs) and metabolome features. Fifty-six of these associations replicated in the TasteSensomics cohort, comprising 601 individuals from São Paulo of vastly diverse ethnic background. They correspond to eleven gene-metabolite associations, six of which had been previously identified in the urine metabolome and three in the serum metabolome. Our key novel findings are the associations of two SNPs with NMR spectral signatures pointing to fucose (rs492602, P = 6.9×10^−44) and lysine (rs8101881, P = 1.2×10^−33), respectively. Fine-mapping of the first locus pinpointed the FUT2 gene, which encodes a fucosyltransferase enzyme and has previously been associated with Crohn's disease. This implicates fucose as a potential prognostic disease marker, for which there is already published evidence from a mouse model. The second SNP lies within the SLC7A9 gene, rare mutations of which have been linked to severe kidney damage. The replication of previous associations and our new discoveries demonstrate the potential of untargeted metabolomics GWAS to robustly identify molecular disease markers.
Author Summary
The concentrations of small molecules known as metabolites are subject to tight regulation in all organisms. Collectively, the metabolite concentrations make up the metabolome, which differs amongst individuals as a function of their environment and genetic makeup. In our study, we have further developed an untargeted approach to identify genetic factors affecting human metabolism. In this approach, we first identify all genetic variants that correlate with any of the measured metabolome features in a large set of individuals. For these variants, we then compute a profile of significance for association with all features, generating a signature that facilitates the expert or computational identification of the metabolite whose concentration is most likely affected by the genetic variant at hand. Our study replicated many of the previously reported genetically driven variations in human metabolism and revealed two new striking examples of genetic variations with a sizeable effect on the urine metabolome. Interestingly, in these two gene-metabolite pairs both the gene and the affected metabolite are related to human diseases – Crohn's disease in the first case, and kidney disease in the second. This highlights the connection between genetic predispositions, affected metabolites, and human health.
PMCID: PMC3930510  PMID: 24586186
7.  A novel untargeted metabolomics correlation-based network analysis incorporating human metabolic reconstructions 
BMC Systems Biology  2013;7:107.
Metabolomics has become increasingly popular in the study of disease phenotypes and molecular pathophysiology. One branch of metabolomics that encompasses the high-throughput screening of cellular metabolism is metabolic profiling. In the present study, different tumour cells from colorectal carcinoma and breast adenocarcinoma were exposed to hypoxic and normoxic conditions, and their metabolic profiles were compared to reveal the potential metabolic effects of hypoxia on the biochemistry of the tumour cells; these effects may contribute to their survival in oxygen compromised environments. In an attempt to analyse the complex interactions between metabolites beyond routine univariate and multivariate data analysis methods, correlation analysis has been integrated with a human metabolic reconstruction to reveal connections between pathways that are associated with normoxic or hypoxic oxygen environments.
Correlation analysis has revealed statistically significant connections between metabolites, where differences in correlations between cells exposed to different oxygen levels have been highlighted as markers of hypoxic metabolism in cancer. Network mapping onto reconstructed human metabolic models is a novel addition to correlation analysis. Correlated metabolites have been mapped onto the Edinburgh human metabolic network (EHMN) with the aim of interlinking metabolites found to be regulated in a similar fashion in response to oxygen. This revealed novel pathways within the metabolic network that may be key to tumour cell survival at low oxygen. Results show that the metabolic responses to lowering oxygen availability can be conserved or specific to a particular cell line. Network-based correlation analysis identified conserved metabolites including malate, pyruvate, 2-oxoglutarate, glutamate and fructose-6-phosphate. In this way, the method has revealed metabolites not previously linked to hypoxia, or less well recognised in that context. Lactate fermentation is one of the key themes discussed in the field of hypoxia; however, malate, pyruvate, 2-oxoglutarate, glutamate and fructose-6-phosphate, which are connected by a single pathway, may provide a more significant marker of hypoxia in cancer.
Metabolic networks generated for each cell line were compared to identify conserved metabolite pathway responses to low oxygen environments. Furthermore, we believe this methodology will have general application within metabolomics.
PMCID: PMC3874763  PMID: 24153255
Metabolomics; Correlation analysis; Network analysis; Cancer; Hypoxia
8.  MLH1-Silenced and Non-Silenced Subgroups of Hypermutated Colorectal Carcinomas Have Distinct Mutational Landscapes 
The Journal of pathology  2013;229(1):99-110.
Approximately 15% of colorectal carcinomas (CRC) exhibit a hypermutated genotype accompanied by high levels of microsatellite instability (MSI-H) and defects in DNA mismatch repair. These tumors, unlike the majority of colorectal carcinomas, are often diploid, exhibit frequent epigenetic silencing of the MLH1 DNA mismatch repair gene, and have a better clinical prognosis. As an adjunct study to The Cancer Genome Atlas consortium that recently analyzed 224 colorectal cancers by whole exome sequencing, we compared the 35 CRC (15.6%) with a hypermutated genotype to those with a non-hypermutated genotype. We found that 22 (63%) of hypermutated CRC exhibited transcriptional silencing of the MLH1 gene, a high frequency of BRAF V600E gene mutations and infrequent APC and KRAS mutations, a mutational pattern significantly different from their non-hypermutated counterparts. However, the remaining 13 (37%) hypermutated CRC lacked MLH1 silencing, contained tumors with the highest mutation rates (“ultramutated” CRC), and exhibited higher incidences of APC and KRAS mutations, but infrequent BRAF mutations. These patterns were confirmed in an independent validation set of 250 exome-sequenced CRC. Analysis of mRNA and microRNA expression signatures revealed that hypermutated CRC with MLH1 silencing had greatly reduced levels of WNT signaling and increased BRAF signaling relative to non-hypermutated CRC. Our findings suggest that hypermutated CRC include one subgroup with fundamentally different pathways to malignancy than the majority of CRC. Examination of MLH1 expression status and frequencies of APC, KRAS, and BRAF mutation in CRC may provide a useful diagnostic tool that could supplement the standard microsatellite instability assays and influence therapeutic decisions.
PMCID: PMC3926301  PMID: 22899370
colorectal cancer; microsatellite instability; MLH1; APC; KRAS; BRAF; WNT signaling; mutation rate
9.  Biomedical Impact of Splicing Mutations Revealed through Exome Sequencing 
Molecular Medicine  2011;18(1):314-319.
Splicing is a cellular mechanism, which dictates eukaryotic gene expression by removing the noncoding introns and ligating the coding exons in the form of a messenger RNA molecule. Alternative splicing (AS) adds a major level of complexity to this mechanism and thus to the regulation of gene expression. This widespread cellular phenomenon generates multiple messenger RNA isoforms from a single gene, by utilizing alternative splice sites and promoting different exon–intron inclusions and exclusions. AS greatly increases the coding potential of eukaryotic genomes and hence contributes to the diversity of eukaryotic proteomes. Mutations that lead to disruptions of either constitutive splicing or AS cause several diseases, among which are myotonic dystrophy and cystic fibrosis. Aberrant splicing is also well established in cancer states. Identification of rare novel mutations associated with splice-site recognition, and splicing regulation in general, could provide further insight into genetic mechanisms of rare diseases. Here, disease relevance of aberrant splicing is reviewed, and the new methodological approach of starting from disease phenotype, employing exome sequencing and identifying rare mutations affecting splicing regulation is described. Exome sequencing has emerged as a reliable method for finding sequence variations associated with various disease states. To date, genetic studies using exome sequencing to find disease-causing mutations have focused on the discovery of nonsynonymous single nucleotide polymorphisms that alter amino acids or introduce early stop codons, or on the use of exome sequencing as a means to genotype known single nucleotide polymorphisms. The involvement of splicing mutations in inherited diseases has received little attention and thus likely occurs more frequently than currently estimated. 
Studies of exome sequencing followed by molecular and bioinformatic analyses have great potential to reveal the high impact of splicing mutations underlying human disease.
PMCID: PMC3324954  PMID: 22160217
10.  Whole-genome reconstruction and mutational signatures in gastric cancer 
Genome Biology  2012;13(12):R115.
Gastric cancer is the second highest cause of global cancer mortality. To explore the complete repertoire of somatic alterations in gastric cancer, we combined massively parallel short read and DNA paired-end tag sequencing to present the first whole-genome analysis of two gastric adenocarcinomas, one with chromosomal instability and the other with microsatellite instability.
Integrative analysis and de novo assemblies revealed the architecture of a wild-type KRAS amplification, a common driver event in gastric cancer. We discovered three distinct mutational signatures in gastric cancer - against a genome-wide backdrop of oxidative and microsatellite instability-related mutational signatures, we identified the first exome-specific mutational signature. Further characterization of the impact of these signatures by combining sequencing data from 40 complete gastric cancer exomes and targeted screening of an additional 94 independent gastric tumors uncovered ACVR2A, RPL22 and LMAN1 as recurrently mutated genes in microsatellite instability-positive gastric cancer and PAPPA as a recurrently mutated gene in TP53 wild-type gastric cancer.
These results highlight how whole-genome cancer sequencing can uncover information relevant to tissue-specific carcinogenesis that would otherwise be missed from exome-sequencing data.
PMCID: PMC4056366  PMID: 23237666
11.  Whole-Exome Sequencing Revealing Somatic NLRP3 Mosaicism in a Patient With Chronic Infantile Neurologic, Cutaneous, Articular Syndrome 
To identify the genetic cause of chronic infantile neurologic, cutaneous, articular syndrome (CINCA syndrome) using whole-exome sequencing in a child who had typical clinical features but who was NLRP3 mutation negative based on conventional Sanger sequencing.
We performed whole-exome sequencing on DNA from peripheral blood, using Illumina TruSeq Exome capture and the HiSeq sequencing platform. Exome data were analyzed in the Galaxy Web-based suite. Whole-exome sequencing findings were confirmed by massively parallel sequencing.
Analysis of variants in known autoinflammatory genes led to the identification of the pathogenic p.F556L NLRP3 missense mutation in 17.7% of Illumina reads (25 of 141). No new candidate genes were identified. Massively parallel sequencing of DNA from peripheral blood (performed in duplicate) unequivocally confirmed the presence of this mutation in 14.5% of alleles. Reexamination of the original Sanger chromatograms revealed a small peak at nucleotide position c.1698 corresponding to the mutated allele. This had initially been regarded as background noise, but in retrospect is completely consistent with somatic mosaicism for the p.F556L NLRP3 mutation in this child with CINCA syndrome.
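The 17.7% figure above is the fraction of reads carrying the mutant allele (25 of 141). A minimal sketch of that arithmetic follows; the function name `variant_allele_fraction` is hypothetical and does not refer to any tool used in the study. For context, a germline heterozygous variant would be expected at roughly 50% of reads, so a fraction well below that is what makes somatic mosaicism plausible here.

```python
def variant_allele_fraction(alt_reads: int, total_reads: int) -> float:
    """Fraction of reads at a site that carry the variant allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= alt_reads <= total_reads:
        raise ValueError("alt_reads must be between 0 and total_reads")
    return alt_reads / total_reads


# Read counts reported in the abstract: 25 variant reads out of 141.
vaf = variant_allele_fraction(25, 141)
print(f"{vaf:.1%}")  # prints 17.7%
```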
This is the first description of somatic NLRP3 mosaicism detected using whole-exome sequencing in a “mutation-negative” patient with CINCA syndrome. Our findings suggest that whole-exome sequencing could be an important diagnostic tool for detecting somatic mosaicism, as well as for the discovery of novel causative gene mutations, in patients with clinical features of cryopyrin-associated periodic syndromes who are NLRP3 mutation negative by conventional sequencing. This approach could also be applicable to patients with other autosomal-dominant autoinflammatory diseases characterized by gain-of-function mutations who are mutation negative by conventional Sanger sequencing.
PMCID: PMC3995009  PMID: 24431285
12.  Metabolomic Profiling Reveals Potential Markers and Bioprocesses Altered in Bladder Cancer Progression 
Cancer research  2011;71(24):7376-7386.
While alterations in xenobiotic metabolism are considered causal in the development of bladder cancer (BCa), the precise mechanisms involved are poorly understood. In this study, we used high-throughput mass spectrometry to measure over 2,000 compounds in 58 clinical specimens, identifying 35 metabolites which exhibited significant changes in BCa. This metabolic signature distinguished both normal and benign bladder from BCa. Exploratory analyses of this metabolomic signature in urine showed promise in distinguishing BCa from controls, and also non-muscle from muscle-invasive BCa. Subsequent enrichment-based bioprocess mapping revealed alterations in phase I/II metabolism and suggested a possible role for DNA methylation in perturbing xenobiotic metabolism in BCa. In particular, we validated tumor-associated hypermethylation in the CYP1A1 and CYP1B1 promoters of BCa tissues by bisulfite sequence analysis and methylation-specific PCR, and also by in vitro treatment of T-24 BCa cell line with the DNA demethylating agent 5-aza-2′-deoxycytidine. Further, we showed that expression of CYP1A1 and CYP1B1 was reduced significantly in an independent cohort of BCa specimens compared to matched benign adjacent tissues. In summary, our findings identified candidate diagnostic and prognostic markers and highlighted mechanisms associated with the silencing of xenobiotic metabolism. The metabolomic signature we describe offers potential as a urinary biomarker for early detection and staging of BCa, highlighting the utility of evaluating metabolomic profiles of cancer to gain insights into bioprocesses perturbed during tumor development and progression.
PMCID: PMC3249241  PMID: 21990318
13.  Genetic Architecture of Vitamin B12 and Folate Levels Uncovered Applying Deeply Sequenced Large Datasets 
PLoS Genetics  2013;9(6):e1003530.
Genome-wide association studies have mainly relied on common HapMap sequence variations. Recently, sequencing approaches have allowed analysis of low frequency and rare variants in conjunction with common variants, thereby improving the search for functional variants and thus the understanding of the underlying biology of human traits and diseases. Here, we used a large Icelandic whole genome sequence dataset combined with Danish exome sequence data to gain insight into the genetic architecture of serum levels of vitamin B12 (B12) and folate. Up to 22.9 million sequence variants were analyzed in combined samples of 45,576 and 37,341 individuals with serum B12 and folate measurements, respectively. We found six novel loci associating with serum B12 (CD320, TCN2, ABCD4, MMAA, MMACHC) or folate levels (FOLR3) and confirmed seven loci for these traits (TCN1, FUT6, FUT2, CUBN, CLYBL, MUT, MTHFR). Conditional analyses established that four loci contain additional independent signals. Interestingly, 13 of the 18 identified variants were coding and 11 of the 13 target genes have known functions related to B12 and folate pathways. Contrary to epidemiological studies we did not find consistent association of the variants with cardiovascular diseases, cancers or Alzheimer's disease although some variants demonstrated pleiotropic effects. Although to some degree impeded by low statistical power for some of these conditions, these data suggest that sequence variants that contribute to the population diversity in serum B12 or folate levels do not modify the risk of developing these conditions. Yet, the study demonstrates the value of combining whole genome and exome sequencing approaches to ascertain the genetic and molecular architectures underlying quantitative trait associations.
Author Summary
Genome-wide association studies have in recent years revealed a wealth of common variants associated with common diseases and phenotypes. We took advantage of the advances in sequencing technologies to study the association of low frequency and rare variants in conjunction with common variants with serum levels of vitamin B12 (B12) and folate in Icelanders and Danes. We found 18 independent signals in 13 loci associated with serum B12 or folate levels. Interestingly, 13 of the 18 identified variants are coding and 11 of the 13 target genes have known functions related to B12 and folate pathways. These data indicate that the target genes at all of the loci have been identified. Epidemiological studies have shown a relationship between serum B12 and folate levels and the risk of cardiovascular diseases, cancers, and Alzheimer's disease. We investigated association between the identified variants and these diseases but did not find consistent association.
PMCID: PMC3674994  PMID: 23754956
14.  Genome-wide mutational landscape of mucinous carcinomatosis peritonei of appendiceal origin 
Genome Medicine  2014;6(5):43.
Mucinous neoplasms of the appendix (MNA) are rare tumors which may progress from benign to malignant disease with an aggressive biological behavior. MNA is often diagnosed after metastasis to the peritoneal surfaces resulting in mucinous carcinomatosis peritonei (MCP). Genetic alterations in MNA are poorly characterized due to its low incidence, the hypo-cellularity of MCPs, and a lack of relevant pre-clinical models. As such, application of targeted therapies to this disease is limited to those developed for colorectal cancer and not based on molecular rationale.
We sequenced the whole exomes of 10 MCPs of appendiceal origin to identify genome-wide somatic mutations and copy number aberrations and validated significant findings in 19 additional cases.
Our study demonstrates that MNA has a different molecular makeup than colorectal cancer. Most tumors have co-existing oncogenic mutations in KRAS (26/29) and GNAS (20/29) and are characterized by downstream PKA activation. High-grade tumors are GNAS wild-type (5/6), suggesting they do not progress from low-grade tumors. MNAs do share some genetic alterations with colorectal cancer including gain of 1q (5/10), Wnt, and TGFβ pathway alterations. In contrast, mutations in TP53 (1/10) and APC (0/10), common in colorectal cancer, are rare in MNA. Concurrent activation of the KRAS and GNAS mediated signaling pathways appears to be shared with pancreatic intraductal papillary mucinous neoplasm.
MNA genome-wide mutational analysis reveals genetic alterations distinct from colorectal cancer, in support of its unique pathophysiology and suggests new targeted therapeutic opportunities.
PMCID: PMC4062050  PMID: 24944587
15.  Serum metabolomic profile as a means to distinguish stage of colorectal cancer 
Genome Medicine  2012;4(5):42.
Presently, colorectal cancer (CRC) is staged preoperatively by radiographic tests, and postoperatively by pathological evaluation of available surgical specimens. However, present staging methods do not accurately identify occult metastases. This has a direct effect on clinical management. Early identification of metastases isolated to the liver may enable surgical resection, whereas more disseminated disease may be best treated with palliative chemotherapy.
Sera from 103 patients with colorectal adenocarcinoma treated at the same tertiary cancer center were analyzed by proton nuclear magnetic resonance (1H NMR) spectroscopy and gas chromatography-mass spectroscopy (GC-MS). Metabolic profiling was done using both supervised pattern recognition and orthogonal partial least squares-discriminant analysis (O-PLS-DA) of the most significant metabolites, which enables comparison of the whole sample spectrum between groups. The metabolomic profiles generated from each platform were compared between the following groups: locoregional CRC (N = 42); liver-only metastases (N = 45); and extrahepatic metastases (N = 25).
The serum metabolomic profile associated with locoregional CRC was distinct from that associated with liver-only metastases, based on 1H NMR spectroscopy (P = 5.10 × 10^-7) and GC-MS (P = 1.79 × 10^-7). Similarly, the serum metabolomic profile differed significantly between patients with liver-only metastases and with extrahepatic metastases. The change in metabolomic profile was most markedly demonstrated on GC-MS (P = 4.75 × 10^-5).
In CRC, the serum metabolomic profile changes markedly with metastasis, and site of disease also appears to affect the pattern of circulating metabolites. This novel observation may have clinical utility in enhancing staging accuracy and selecting patients for surgical or medical management. Additional studies are required to determine the sensitivity of this approach to detect subtle or occult metastatic disease.
PMCID: PMC3506908  PMID: 22583555
16.  Integrated genome and transcriptome sequencing identifies a novel form of hybrid and aggressive prostate cancer
The Journal of pathology  2012;227(1):53-61.
Next-generation sequencing is making sequence-based molecular pathology and personalized oncology viable. We selected an individual initially diagnosed with conventional but aggressive prostate adenocarcinoma and sequenced the genome and transcriptome from primary and metastatic tissues collected prior to hormone therapy. The histology-pathology and copy number profiles were remarkably homogeneous, yet it was possible to propose the quadrant of the prostate tumour that likely seeded the metastatic diaspora. Despite a homogeneous cell type, our transcriptome analysis revealed signatures of both luminal and neuroendocrine cell types. Remarkably, the repertoire of expressed but apparently private gene fusions, including C15orf21:MYC, recapitulated this biology. We hypothesize that the amplification and over-expression of the stem cell gene MSI2 may have contributed to the stable hybrid cellular identity. This hybrid luminal-neuroendocrine tumour appears to represent a novel and highly aggressive case of prostate cancer with unique biological features and, conceivably, a propensity for rapid progression to castrate-resistance. Overall, this work highlights the importance of integrated analyses of genome, exome and transcriptome sequences for basic tumour biology, sequence-based molecular pathology and personalized oncology.
PMCID: PMC3768138  PMID: 22294438
RNA sequencing; DNA sequencing; prostate cancer; fusion genes; neuroendocrine; personalized medicine; cancer genetics
17.  Targeted Re-Sequencing Identified rs3106189 at the 5′ UTR of TAPBP and rs1052918 at the 3′ UTR of TCF3 to Be Associated with the Overall Survival of Colorectal Cancer Patients 
PLoS ONE  2013;8(8):e70307.
Recent studies have demonstrated the power of deep re-sequencing of the whole genome or exome in understanding cancer genomes. However, targeted capture of selected genomic whole gene-body regions, rather than the whole exome, has several advantages: 1) the genes can be selected based on biology or a hypothesis; 2) mutations in promoter and intronic regions, which have important regulatory roles, can be investigated; and 3) it is less expensive than whole genome or whole exome sequencing. Therefore, we designed custom high-density oligonucleotide microarrays (NimbleGen Inc.) to capture approximately 1.7 Mb of target regions comprising the genomic regions of 28 genes related to colorectal cancer, including genes belonging to the WNT signaling pathway, as well as important transcription factors or colon-specific genes that are overexpressed in colorectal cancer (CRC). The 1.7 Mb targeted regions were sequenced with coverage ranging from 32× to 45× for the 28 genes. We identified a total of 2342 sequence variations in the CRC and corresponding adjacent normal tissues. Among them, 738 were novel sequence variations based on comparisons with the SNP database (dbSNP135). We validated 56 of 66 SNPs in a separate cohort of 30 CRC tissues using the Sequenom MassARRAY iPLEX Platform, suggesting a validation rate of at least 85% (56/66). We found 15 missense mutations among the exonic variations, 21 synonymous SNPs that were predicted to change the exonic splicing motifs, 31 UTR SNPs that were predicted to occur at the transcription factor binding sites, 20 intronic SNPs located near the splicing sites, 43 SNPs in conserved transcription factor binding sites and 32 in CpG islands. Finally, we determined that rs3106189, localized to the 5′ UTR of antigen presenting tapasin binding protein (TAPBP), and rs1052918, localized to the 3′ UTR of transcription factor 3 (TCF3), were associated with overall survival of CRC patients.
PMCID: PMC3734069  PMID: 23940558
18.  APRIL is a novel clinical chemo-resistance biomarker in colorectal adenocarcinoma identified by gene expression profiling 
BMC Cancer  2009;9:434.
5-Fluorouracil (5FU) and oral analogues, such as capecitabine, remain among the most useful agents for the treatment of colorectal adenocarcinoma. Low toxicity and convenience of administration facilitate use; however, clinical resistance is a major limitation. Investigation has failed to fully explain the molecular mechanisms of resistance and no clinically useful predictive biomarkers for 5FU resistance have been identified. We investigated the molecular mechanisms of clinical 5FU resistance in colorectal adenocarcinoma patients in a prospective biomarker discovery project utilising gene expression profiling. The aim was to identify novel 5FU resistance mechanisms and qualify these as candidate biomarkers and therapeutic targets.
Putative treatment specific gene expression changes were identified in a transcriptomics study of rectal adenocarcinomas, biopsied and profiled before and after pre-operative short-course radiotherapy or 5FU based chemo-radiotherapy, using microarrays. Tumour from untreated controls at diagnosis and resection identified treatment-independent gene expression changes. Candidate 5FU chemo-resistant genes were identified by comparison of gene expression data sets from these clinical specimens with gene expression signatures from our previous studies of colorectal cancer cell lines, where parental and daughter lines resistant to 5FU were compared. A colorectal adenocarcinoma tissue microarray (n = 234, resected tumours) was used as an independent set to qualify candidates thus identified.
APRIL/TNFSF13 mRNA was significantly upregulated following 5FU based concurrent chemo-radiotherapy and in 5FU resistant colorectal adenocarcinoma cell lines but not in radiotherapy alone treated colorectal adenocarcinomas. Consistent with APRIL's known function as an autocrine or paracrine secreted molecule, stromal but not tumour cell protein expression by immunohistochemistry was correlated with poor prognosis (p = 0.019) in the independent set. Stratified analysis revealed that protein expression of APRIL in the tumour stroma is associated with survival in adjuvant 5FU treated patients only (n = 103, p < 0.001), and is independently predictive of lack of clinical benefit from adjuvant 5FU [HR 6.25 (95%CI 1.48-26.32), p = 0.013].
A combined investigative model, analysing the transcriptional response in clinical tumour specimens and cancer cell lines, has identified APRIL, a novel chemo-resistance biomarker with independent predictive impact in 5FU-treated CRC patients, that may represent a target for novel therapeutics.
PMCID: PMC2801520  PMID: 20003335
19.  Eleven Candidate Susceptibility Genes for Common Familial Colorectal Cancer 
PLoS Genetics  2013;9(10):e1003876.
Hereditary factors are presumed to play a role in one third of colorectal cancer (CRC) cases. However, in the majority of familial CRC cases the genetic basis of predisposition remains unexplained. This is particularly true for families with few affected individuals. To identify susceptibility genes for this common phenotype, we examined familial cases derived from a consecutive series of 1514 Finnish CRC patients. Ninety-six familial CRC patients with no previous diagnosis of a hereditary CRC syndrome were included in the analysis. Eighty-six patients had one affected first-degree relative, and ten patients had two or more. Exome sequencing was utilized to search for genes harboring putative loss-of-function variants, because such alterations are likely candidates for disease-causing mutations. Eleven genes with rare truncating variants in two or three familial CRC cases were identified: UACA, SFXN4, TWSG1, PSPH, NUDT7, ZNF490, PRSS37, CCDC18, PRADC1, MRPL3, and AKR1C4. Loss of heterozygosity was examined in all respective cancer samples, and was detected in seven occasions involving four of the candidate genes. In all seven occasions the wild-type allele was lost (P = 0.0078) providing additional evidence that these eleven genes are likely to include true culprits. The study provides a set of candidate predisposition genes which may explain a subset of common familial CRC. Additional genetic validation in other populations is required to provide firm evidence for causality, as well as to characterize the natural history of the respective phenotypes.
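The reported P = 0.0078 for losing the wild-type allele in all seven loss-of-heterozygosity events can be reproduced with a short calculation. The sketch below assumes a one-sided binomial test in which either allele is equally likely to be lost under the null hypothesis; the abstract does not name the test, so this is an illustration rather than the authors' stated method, and the helper name `binom_upper_tail` is hypothetical.

```python
from math import comb


def binom_upper_tail(k: int, n: int, p: float = 0.5) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Assumption: under the null, the wild-type allele is lost with probability 0.5.
# Observed: wild-type allele lost in 7 of 7 events.
p_value = binom_upper_tail(7, 7)
print(round(p_value, 4))  # 0.0078
```

Under this assumption the tail probability is simply 0.5 raised to the seventh power, which matches the value quoted in the abstract.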
Author Summary
Many individuals with a family history of colorectal cancer have no detectable germline mutation in the known cancer predisposing genes. We aimed to identify novel susceptibility genes for this common phenotype by performing exome sequencing on 96 independent cases with familial colorectal cancer. Eighty-six patients had one affected first-degree relative, and ten patients had two or more. None of the patients had a previous diagnosis of a hereditary syndrome. We focused our search on genes with rare variants, predicted to truncate the protein product, since these are likely candidates for disease predisposition. Using this approach we identified truncating germline variants in eleven genes, present in two or three independent familial colorectal cancer cases. We analyzed the respective tumor DNAs and found loss of the wild-type allele in seven out of seven occasions, involving four genes. No tumor showed loss of the mutant allele which provides us with additional evidence for disease causality. Further studies are required to provide firm evidence for pathogenicity. Genetic knowledge on confirmed predisposing genes can ultimately be translated into tools for cancer prevention and early diagnosis in individuals carrying predisposition alleles.
PMCID: PMC3798264  PMID: 24146633
20.  Gene expression and effects of orally active derivatives of fluoropyrimidine on gastric and colorectal cancer 
The effects of chemotherapy on gastrointestinal cancer are influenced by the chemotherapeutic sensitivity of the cancer cells. Determining the expression of genes related to chemotherapeutic sensitivity has been used as a molecular method. The aim of the study was to clarify the relationships between the expression of genes related to chemotherapeutic sensitivity and the effects of orally active derivatives of fluoropyrimidine on gastric and colorectal cancer. Forty-five patients who underwent adjuvant chemotherapy containing orally active derivatives of fluoropyrimidine after undergoing curative surgery for gastric or colorectal cancer were enrolled. Twenty-four patients had colorectal cancer and 21 patients had gastric cancer. Total RNA was extracted from formalin-fixed, paraffin-embedded specimens of the resected tumors, and the expression of 11 genes was measured using the RT-PCR method. We then analyzed the relationships between the gene expression and the postoperative relapse rate as well as the relationships between clinicopathological factors and postoperative relapse rate. The median observation period of the subjects was 41 months. Twelve out of the 21 gastric cancer patients (57%) and 11 out of the 24 colorectal cancer patients (46%) relapsed. Although the results of a univariate analysis revealed that expression of none of the evaluated genes was related to relapse in the gastric cancer patients, excision repair cross-complementing gene 1 (ERCC1) overexpression was related to the relapse rate in colorectal cancer patients (p=0.023). When 1.295 was set as the cut-off value for ERCC1 overexpression using the receiver operating characteristic (ROC) curve, 67% of patients with ERCC1 overexpression and 25% of patients without ERCC1 overexpression relapsed. The relapse-free survival rate was lower in the group with ERCC1 overexpression than in the group without ERCC1 overexpression (p=0.046). 
ERCC1 overexpression appears to be a useful predictor of relapse in colorectal cancer patients receiving adjuvant therapy with regimens including orally active derivatives of fluoropyrimidine.
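The abstract states that the 1.295 cut-off was derived from a ROC curve but not which criterion was used. One common choice is Youden's J statistic (sensitivity + specificity - 1), sketched below as an assumption. The expression values and relapse labels are synthetic and purely illustrative; they are not data from the study, and `youden_cutoff` is a hypothetical helper.

```python
def youden_cutoff(values, relapsed):
    """Return (cutoff, J) maximizing sensitivity + specificity - 1.

    A sample is called 'overexpressing' when its value is >= cutoff.
    Ties are resolved in favour of the lowest cutoff.
    """
    n_pos = sum(relapsed)
    n_neg = len(relapsed) - n_pos
    best_cutoff, best_j = None, -1.0
    for c in sorted(set(values)):
        tp = sum(1 for v, r in zip(values, relapsed) if r and v >= c)
        fp = sum(1 for v, r in zip(values, relapsed) if not r and v >= c)
        j = tp / n_pos - fp / n_neg  # sensitivity - (1 - specificity)
        if j > best_j:
            best_cutoff, best_j = c, j
    return best_cutoff, best_j


# Synthetic example only (NOT study data): relative expression and relapse status.
expr = [0.5, 0.8, 1.0, 1.2, 1.4, 1.6, 2.0, 2.3]
relapse = [0, 0, 0, 1, 0, 1, 1, 1]
cutoff, j = youden_cutoff(expr, relapse)
print(cutoff, j)  # 1.2 0.75
```

Other criteria, such as the point closest to the top-left corner of the ROC plot, can give a different cut-off on the same data.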
PMCID: PMC3445948  PMID: 22993546
excision repair cross-complementing gene 1 overexpression; relapse; colorectal cancer; fluoropyrimidine
21.  FLAGS, frequently mutated genes in public exomes 
BMC Medical Genomics  2014;7(1):64.
Dramatic improvements in DNA-sequencing technologies and computational analyses have led to wide use of whole exome sequencing (WES) to identify the genetic basis of Mendelian disorders. More than 180 novel rare-disease-causing genes with Mendelian inheritance patterns have been discovered through sequencing the exomes of just a few unrelated individuals or family members. As rare/novel genetic variants continue to be uncovered, there is a major challenge in distinguishing true pathogenic variants from rare benign mutations.
We used publicly available exome cohorts, together with the dbSNP database, to derive a list of genes (n = 100) that most frequently exhibit rare (<1%) non-synonymous/splice-site variants in general populations. We termed these genes FLAGS for FrequentLy mutAted GeneS and analyzed their properties.
Analysis of FLAGS revealed that these genes have significantly longer protein coding sequences, a greater number of paralogs and display less evolutionarily selective pressure than expected. FLAGS are more frequently reported in PubMed clinical literature and more frequently associated with diseased phenotypes compared to the set of human protein-coding genes. We demonstrated an overlap between FLAGS and the rare-disease causing genes recently discovered through WES studies (n = 10) and the need for replication studies and rigorous statistical and biological analyses when associating FLAGS to rare disease. Finally, we showed how FLAGS are applied in a disease-causing variant prioritization approach on exome data from a family affected by an unknown rare genetic disorder.
We showed that some genes are frequently affected by rare, likely functional variants in the general population, and are frequently observed in WES studies analyzing diverse rare phenotypes. We found that the rate at which genes accumulate rare mutations is beneficial information for prioritizing candidates. We provided a ranking system based on the mutation accumulation rates for prioritizing exome-captured human genes, and propose that clinical reports associating any disease/phenotype to FLAGS be evaluated with extra caution.
Electronic supplementary material
The online version of this article (doi:10.1186/s12920-014-0064-y) contains supplementary material, which is available to authorized users.
PMCID: PMC4267152  PMID: 25466818
22.  Metabolomics of Apc Min/+ mice genetically susceptible to intestinal cancer 
BMC Systems Biology  2014;8:72.
To determine how diets high in saturated fat could increase polyp formation in the mouse model of intestinal neoplasia, Apc Min/+ , we conducted large-scale metabolome analysis and association study of colon and small intestine polyp formation from plasma and liver samples of Apc Min/+ vs. wild-type littermates, kept on low vs. high-fat diet. Label-free mass spectrometry was used to quantify untargeted plasma and acyl-CoA liver compounds, respectively. Differences in contrasts of interest were analyzed statistically by unsupervised and supervised modeling approaches, namely Principal Component Analysis and Linear Model of analysis of variance. Correlation between plasma metabolite concentrations and polyp numbers was analyzed with a zero-inflated Generalized Linear Model.
Plasma metabolome in parallel to promotion of tumor development comprises a clearly distinct profile in Apc Min/+ mice vs. wild type littermates, which is further altered by high-fat diet. Further, functional metabolomics pathway and network analyses in Apc Min/+ mice on high-fat diet revealed associations between polyp formation and plasma metabolic compounds including those involved in amino-acids metabolism as well as nicotinamide and hippuric acid metabolic pathways. Finally, we also show changes in liver acyl-CoA profiles, which may result from a combination of Apc Min/+ -mediated tumor progression and high fat diet. The biological significance of these findings is discussed in the context of intestinal cancer progression.
These studies show that high-throughput metabolomics combined with appropriate statistical modeling and large scale functional approaches can be used to monitor and infer changes and interactions in the metabolome and genome of the host under controlled experimental conditions. Further these studies demonstrate the impact of diet on metabolic pathways and its relation to intestinal cancer progression. Based on our results, metabolic signatures and metabolic pathways of polyposis and intestinal carcinoma have been identified, which may serve as useful targets for the development of therapeutic interventions.
PMCID: PMC4099115  PMID: 24954394
Metabolomics; Fat diet; Tumor development; Association and correlation analysis; High-throughput mass spectrometry
23.  Intra-tumor Genetic Heterogeneity and Mortality in Head and Neck Cancer: Analysis of Data from The Cancer Genome Atlas 
PLoS Medicine  2015;12(2):e1001786.
Although the involvement of intra-tumor genetic heterogeneity in tumor progression, treatment resistance, and metastasis is established, genetic heterogeneity is seldom examined in clinical trials or practice. Many studies of heterogeneity have had prespecified markers for tumor subpopulations, limiting their generalizability, or have involved massive efforts such as separate analysis of hundreds of individual cells, limiting their clinical use. We recently developed a general measure of intra-tumor genetic heterogeneity based on whole-exome sequencing (WES) of bulk tumor DNA, called mutant-allele tumor heterogeneity (MATH). Here, we examine data collected as part of a large, multi-institutional study to validate this measure and determine whether intra-tumor heterogeneity is itself related to mortality.
Methods and Findings
Clinical and WES data were obtained from The Cancer Genome Atlas in October 2013 for 305 patients with head and neck squamous cell carcinoma (HNSCC), from 14 institutions. Initial pathologic diagnoses were between 1992 and 2011 (median, 2008). Median time to death for 131 deceased patients was 14 mo; median follow-up of living patients was 22 mo. Tumor MATH values were calculated from WES results. Despite the multiple head and neck tumor subsites and the variety of treatments, we found in this retrospective analysis a substantial relation of high MATH values to decreased overall survival (Cox proportional hazards analysis: hazard ratio for high/low heterogeneity, 2.2; 95% CI 1.4 to 3.3). This relation of intra-tumor heterogeneity to survival was not due to intra-tumor heterogeneity’s associations with other clinical or molecular characteristics, including age, human papillomavirus status, tumor grade and TP53 mutation, and N classification. MATH improved prognostication over that provided by traditional clinical and molecular characteristics, maintained a significant relation to survival in multivariate analyses, and distinguished outcomes among patients having oral-cavity or laryngeal cancers even when standard disease staging was taken into account. Prospective studies, however, will be required before MATH can be used prognostically in clinical trials or practice. Such studies will need to examine homogeneously treated HNSCC at specific head and neck subsites, and determine the influence of cancer therapy on MATH values. Analysis of MATH and outcome in human-papillomavirus-positive oropharyngeal squamous cell carcinoma is particularly needed.
To our knowledge this study is the first to combine data from hundreds of patients, treated at multiple institutions, to document a relation between intra-tumor heterogeneity and overall survival in any type of cancer. We suggest applying the simply calculated MATH metric of heterogeneity to prospective studies of HNSCC and other tumor types.
In this study, Rocco and colleagues examine data collected as part of a large, multi-institutional study, to validate a measure of tumor heterogeneity called MATH and determine whether intra-tumor heterogeneity is itself related to mortality.
Editors’ Summary
Normally, the cells in human tissues and organs only reproduce (a process called cell division) when new cells are needed for growth or to repair damaged tissues. But sometimes a cell somewhere in the body acquires a genetic change (mutation) that disrupts the control of cell division and allows the cell to grow continuously. As the mutated cell grows and divides, it accumulates additional mutations that allow it to grow even faster and eventually form a lump, or tumor (cancer). Other mutations subsequently allow the tumor to spread around the body (metastasize) and destroy healthy tissues. Tumors can arise anywhere in the body—there are more than 200 different types of cancer—and about one in three people will develop some form of cancer during their lifetime. Many cancers can now be successfully treated, however, and people often survive for years after a diagnosis of cancer before, eventually, dying from another disease.
Why Was This Study Done?
The gradual acquisition of mutations by tumor cells leads to the formation of subpopulations of cells, each carrying a different set of mutations. This “intra-tumor heterogeneity” can produce tumor subclones that grow particularly quickly, that metastasize aggressively, or that are resistant to cancer treatments. Consequently, researchers have hypothesized that high intra-tumor heterogeneity leads to worse clinical outcomes and have suggested that a simple measure of this heterogeneity would be a useful addition to the cancer staging system currently used by clinicians for predicting the likely outcome (prognosis) of patients with cancer. Here, the researchers investigate whether a measure of intra-tumor heterogeneity called “mutant-allele tumor heterogeneity” (MATH) is related to mortality (death) among patients with head and neck squamous cell carcinoma (HNSCC)—cancers that begin in the cells that line the moist surfaces inside the head and neck, such as cancers of the mouth and the larynx (voice box). MATH is based on whole-exome sequencing (WES) of tumor and matched normal DNA. WES uses powerful DNA-sequencing systems to determine the variations of all the coding regions (exons) of the known genes in the human genome (genetic blueprint).
What Did the Researchers Do and Find?
The researchers obtained clinical and WES data for 305 patients who were treated in 14 institutions, primarily in the US, after diagnosis of HNSCC from The Cancer Genome Atlas, a catalog established by the US National Institutes of Health to map the key genomic changes in major types and subtypes of cancer. They calculated tumor MATH values for the patients from their WES results and retrospectively analyzed whether there was an association between the MATH values and patient survival. Despite the patients having tumors at various subsites and being given different treatments, every 10% increase in MATH value corresponded to an 8.8% increased risk (hazard) of death. Using a previously defined MATH-value cutoff to distinguish high- from low-heterogeneity tumors, compared to patients with low-heterogeneity tumors, patients with high-heterogeneity tumors were more than twice as likely to die (a hazard ratio of 2.2). Other statistical analyses indicated that MATH provided improved prognostic information compared to that provided by established clinical and molecular characteristics and human papillomavirus (HPV) status (HPV-positive HNSCC at some subsites has a better prognosis than HPV-negative HNSCC). In particular, MATH provided prognostic information beyond that provided by standard disease staging among patients with mouth or laryngeal cancers.
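The abstract calls MATH "simply calculated" but does not give the formula. As described in the authors' earlier work, to the best of our understanding, MATH is 100 times the ratio of the median absolute deviation (MAD) to the median of the mutant-allele fractions at a tumor's somatic mutation loci, with the MAD scaled by 1.4826 so that it approximates a standard deviation for normally distributed data. The sketch below is written under that assumption; the function name `math_score` and the input values are hypothetical.

```python
from statistics import median


def math_score(mutant_allele_fractions):
    """Mutant-allele tumor heterogeneity: 100 * scaled MAD / median.

    Assumes the definition summarized in the lead-in; the 1.4826 factor
    makes the MAD comparable to a standard deviation for normal data.
    """
    mafs = list(mutant_allele_fractions)
    if not mafs:
        raise ValueError("at least one mutant-allele fraction is required")
    center = median(mafs)
    mad = 1.4826 * median([abs(x - center) for x in mafs])
    return 100.0 * mad / center


# Hypothetical mutant-allele fractions, for illustration only.
heterogeneous = [0.10, 0.20, 0.30, 0.40, 0.50]
homogeneous = [0.30, 0.30, 0.30]
print(round(math_score(heterogeneous), 2))  # 49.42
print(math_score(homogeneous))              # 0.0
```

A tumor whose mutations all sit at the same allele fraction scores zero, while a wider spread of fractions relative to the median gives a higher value.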
What Do These Findings Mean?
By using data from more than 300 patients treated at multiple institutions, these findings validate the use of MATH as a measure of intra-tumor heterogeneity in HNSCC. Moreover, they provide one of the first large-scale demonstrations that intra-tumor heterogeneity is clinically important in the prognosis of any type of cancer. Before the MATH metric can be used in clinical trials or in clinical practice as a prognostic tool, its ability to predict outcomes needs to be tested in prospective studies that examine the relation between MATH and the outcomes of patients with identically treated HNSCC at specific head and neck subsites, that evaluate the use of MATH for prognostication in other tumor types, and that determine the influence of cancer treatments on MATH values. Nevertheless, these findings suggest that MATH should be considered as a biomarker for survival in HNSCC and other tumor types, and raise the possibility that clinicians could use MATH values to decide on the best treatment for individual patients and to choose patients for inclusion in clinical trials.
Additional Information
Please access these websites via the online version of this summary at
The US National Cancer Institute (NCI) provides information about cancer and how it develops and about head and neck cancer (in English and Spanish)
Cancer Research UK, a not-for-profit organization, provides general information about cancer and how it develops, and detailed information about head and neck cancer; the Merseyside Regional Head and Neck Cancer Centre provides patient stories about HNSCC
Wikipedia provides information about tumor heterogeneity, and about whole-exome sequencing (note that Wikipedia is a free online encyclopedia that anyone can edit; available in several languages)
Information about The Cancer Genome Atlas is available
A PLOS Blog entry by Jessica Wapner explains more about MATH
PMCID: PMC4323109  PMID: 25668320
24.  RNA expression of the molecular signature genes for metastasis in colorectal cancer 
Oncology reports  2011;25(5):1321-1327.
Colorectal cancer is an endemic disease in the western world. A search for molecular signatures in primary tumors that predict metastatic potential has been proposed; in particular, a recent study associated a 17-gene molecular signature with poor survival in breast cancer, prostate cancer, medulloblastoma, and lymphoma. Using a quantitative real-time PCR assay (qPCR), our study observed tumor-normal differential RNA expression in 15 of these 17 genes in a cohort of 52 Stage III colorectal cancer patients (all p<0.05), which signifies the importance of these 17 signature genes in colorectal cancer. Although no significant correlation was found between tumor RNA levels of these 17 genes and several clinical features (age, gender, and location, all p>0.05), two distinct groups among these genes were observed with Spearman correlation scores >0.6 (p<0.01), suggesting co-expression or interaction among these genes. Of the 37 patients who had complete follow-up data available, 12 patients had recurrence and 25 had no recurrence. There was no significant difference in tumor RNA level between the recurrence and non-recurrence groups for the 17 genes (all p>0.05), but the recurrence group had more patients with mucinous tumors (9/12 vs. 7/25, p<0.05) and more lymph node involvement (median 7.2 vs. 2.5, p<0.05) than the non-recurrence group. Moreover, survival analysis revealed a significant difference in overall survival time between patients with low and high tumor RNA levels for 1 of the 17 genes (PTTG1, p=0.024). Our qPCR validation study confirms the importance of most of the 17 molecular signature genes, which show differential RNA expression, and suggests the survival relevance of PTTG1 in colorectal cancers.
PMCID: PMC3954856  PMID: 21380492
gene expression; metastasis; colorectal cancer; quantitative PCR
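The mucinous-tumor comparison in this abstract (9/12 in the recurrence group vs. 7/25 in the non-recurrence group) can be checked with a small sketch. The abstract does not name the test the authors used, so the two-sided Fisher exact test below is an assumption chosen for illustration, and the function name `fisher_exact_two_sided` is hypothetical.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, col1, n = a + b, a + c, a + b + c + d

    def prob(k):
        # Probability that the top-left cell equals k, given fixed margins.
        return comb(col1, k) * comb(n - col1, row1 - k) / comb(n, row1)

    k_min = max(0, row1 - (n - col1))
    k_max = min(row1, col1)
    p_observed = prob(a)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(k) for k in range(k_min, k_max + 1)
               if prob(k) <= p_observed * (1 + 1e-9))

# Rows: recurrence, no recurrence. Columns: mucinous, non-mucinous.
p_value = fisher_exact_two_sided(9, 3, 7, 18)
print(f"two-sided Fisher exact p = {p_value:.4f}")
```

Under this assumed test the p-value falls below 0.05, in line with the significance level reported in the abstract.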
25.  Efficiency of whole genome amplification of single circulating tumor cells enriched by CellSearch and sorted by FACS 
Genome Medicine  2013;5(11):106.
Tumor cells in the blood of patients with metastatic carcinomas are associated with poor survival. Knowledge of the cells’ genetic make-up can help to guide targeted therapy. We evaluated the efficiency and quality of isolation and amplification of DNA from single circulating tumor cells (CTC).
The efficiency of the procedure was determined by spiking blood with SKBR-3 cells, enrichment with the CellSearch system, followed by single cell sorting by fluorescence-activated cell sorting (FACS) and whole genome amplification. A selection of single cell DNA from fixed and unfixed SKBR-3 cells was exome sequenced and the DNA quality analyzed. Single CTC from patients with lung cancer were used to demonstrate the potential of single CTC molecular characterization.
The overall efficiency of the procedure from spiked cell to amplified DNA was approximately 20%. Losses attributed to the CellSearch system were around 20%, transfer to FACS around 25%, sorting around 5% and DNA amplification around 25%. Exome sequencing revealed that the quality of the DNA was affected by the fixation of the cells, amplification, and the low starting quantity of DNA. A single fixed cell had an average coverage at 20× depth of 30% when sequencing to an average of 40× depth, whereas a single unfixed cell had 45% coverage. GenomiPhi-amplified genomic DNA had a coverage of 72% versus a coverage of 87% of genomic DNA. Twenty-one percent of the CTC from patients with lung cancer identified by the CellSearch system could be isolated individually and amplified.
CTC enriched by the CellSearch system were sorted by FACS, and DNA was retrieved and amplified with an overall efficiency of 20%. Analysis of the sequencing data showed that this DNA could be used for variant calling, but not for quantitative measurements such as copy number detection. Close to 55% of the exome of single SKBR-3 cells was successfully sequenced to 20× depth, making it possible to call 72% of the variants. In CellSave-fixed cells, the overall coverage was reduced to 30% at 20× depth, making it possible to call 56% of the variants.
PMCID: PMC3978840  PMID: 24286536
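The per-step losses quoted in this abstract (around 20%, 25%, 5%, and 25%) can be combined in more than one way, and the abstract does not say whether each loss is a fraction of the starting cells or of the cells entering that step. The sketch below works through both readings as assumptions; the dictionary keys and function names are hypothetical labels, not terms from the paper.

```python
from math import prod

# Approximate per-step losses as quoted in the abstract.
step_losses = {
    "CellSearch enrichment": 0.20,
    "transfer to FACS": 0.25,
    "sorting": 0.05,
    "DNA amplification": 0.25,
}

def yield_if_losses_additive(losses):
    """Assumption A: each loss is a fraction of the starting (spiked) cells."""
    return 1.0 - sum(losses.values())

def yield_if_losses_sequential(losses):
    """Assumption B: each loss is a fraction of the cells entering that step."""
    return prod(1.0 - loss for loss in losses.values())

additive_yield = yield_if_losses_additive(step_losses)
sequential_yield = yield_if_losses_sequential(step_losses)
print(f"additive reading:   {additive_yield:.1%}")
print(f"sequential reading: {sequential_yield:.1%}")
```

Of the two readings, the additive one comes closer to the reported overall efficiency of approximately 20%, though the rounded ("around") figures leave the exact accounting open.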
