Gliomas, the most common primary brain tumors in adults, constitute clinically, histologically, and molecularly a most heterogeneous type of cancer. Owing to this, accurate clinical prognosis for short-term vs. long-term survival for patients with grade II or III glioma is currently nonexistent. A rigorous, multi-method bioinformatic approach was used to identify the top most differentially expressed genes as captured by mRNA sequencing of tumor tissue. Mathematical modeling was employed to develop the model, and three different and independent methods of validation were used to assess its performance. I present here a mathematical model that can identify with a high accuracy (sensitivity=92.9%, specificity=96.0%) those patients with glioma (grade II or III) who will experience short-term survival (≤ 1 year), as well as those with long-term survival (≥ 3 years), at the time of diagnosis and prior to surgery and adjuvant chemotherapy. The 5 gene input variables to the model are: FAM120AOS, PDLIM4, OCIAD2, PCDH15, and MXI1. MXI1, a transcriptional repressor, represents the top biomarker of survival and the most promising target for the development of a pharmacological treatment.
Glioma; cancer genomics; survival; computational biology; mathematical modeling; systems biology; RNA-Sequencing
Nuclear magnetic resonance (NMR) spectroscopy is a rapidly emerging technology that can be used to assess tissue metabolic profile in the living animal. At the present time, no approach has been developed 1) to systematically identify profiles of key chemical alterations that can be used as biomarkers to diagnose diseases and to monitor disease progression; and 2) to assess mathematically the diagnostic power of potential biomarkers. To address this issue, we have evaluated mathematical approaches that employ receiver operating characteristic (ROC) curve analysis, linear discriminant analysis, and logistic regression analysis to systematically identify key biomarkers from NMR spectra that have excellent diagnostic power and can be used accurately for disease diagnosis and monitoring. To validate our mathematical approaches, we studied the striatal concentrations of 17 metabolites of 13 R6/2 transgenic mice with Huntington's disease, as well as those of 17 wild-type (WT) mice, which were obtained via in vivo proton NMR spectroscopy (9.4 Tesla). We developed diagnostic biomarker models and clinical change assessment models based on our three aforementioned mathematical approaches, and we tested all of them, first, with the 30 original mice and, then, with 31 unknown mice. Their prediction results were compared with genotyping, the gold standard. All models correctly diagnosed all of the 30 original mice (17 WT and 13 R6/2) and all of the 31 unknown mice (20 WT and 11 R6/2), with a positive likelihood ratio approximating infinity [1/0 (→ ∞)], and with a negative likelihood ratio equal to zero [0/1 = 0].
receiver operating characteristic (ROC) curve analysis; linear discriminant analysis; logistic regression analysis; proton magnetic resonance spectroscopy; metabolomics; Huntington's disease
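The likelihood ratios quoted in the abstract above follow directly from sensitivity and specificity: LR+ = sensitivity / (1 − specificity) and LR− = (1 − sensitivity) / specificity. The following is a minimal sketch of that arithmetic, not the authors' code; the function name `likelihood_ratios` is hypothetical, and the counts used are the 31 unknown mice reported above (11 R6/2 and 20 WT, all correctly classified).

```python
import math

def likelihood_ratios(tp, fn, tn, fp):
    """Return (sensitivity, specificity, LR+, LR-) from confusion-matrix counts.

    Hypothetical helper for illustration. LR+ is reported as infinity when
    there are no false positives (the 1/0 case mentioned in the abstract).
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    false_positive_rate = 1 - specificity
    if false_positive_rate == 0:
        lr_pos = math.inf  # 1/0 -> infinity for a perfect test
    else:
        lr_pos = sensitivity / false_positive_rate
    if specificity == 0:
        lr_neg = math.inf
    else:
        lr_neg = (1 - sensitivity) / specificity
    return sensitivity, specificity, lr_pos, lr_neg

# 31 unknown mice from the abstract: 11 R6/2 (positives) and 20 WT (negatives),
# all diagnosed correctly, so there are no false negatives or false positives.
sens, spec, lr_pos, lr_neg = likelihood_ratios(tp=11, fn=0, tn=20, fp=0)
print(sens, spec, lr_pos, lr_neg)
```

For a perfect classifier the sketch reproduces the two limiting values stated in the abstract; for imperfect counts it returns the usual finite ratios.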
Nuclear magnetic resonance (NMR) spectroscopy has emerged as a technology that can provide metabolite information within organ systems in vivo. In this study, we introduced a new method of employing a clustering algorithm to develop a diagnostic model that can differentially diagnose a single unknown subject in a disease with well-defined group boundaries. We used three tests to assess the suitability and the accuracy required for diagnostic purposes of the four clustering algorithms we investigated (K-means, Fuzzy, Hierarchical, and Medoid Partitioning). To accomplish this goal, we studied the striatal metabolomic profile of R6/2 Huntington disease (HD) transgenic mice and that of wild type (WT) mice using high field in vivo proton NMR spectroscopy (9.4 Tesla). We tested all four clustering algorithms 1) with the original R6/2 HD mice and WT mice, 2) with unknown mice, whose status had been determined via genotyping, and 3) with the ability to separate the original R6/2 mice into the two age subgroups (8 and 12 wks old). Only our diagnostic models that employed ROC-supervised Fuzzy, unsupervised Fuzzy, and ROC-supervised K-means clustering passed all three stringent tests with 100% accuracy, indicating that they may be used for diagnostic purposes.
Diagnostic Methods; Clustering Analyses; K-Means Clustering; Fuzzy Clustering; Medoid Partitioning Clustering; Hierarchical Clustering; Receiver Operating Characteristic (ROC) Curve Analysis; Nuclear Magnetic Resonance Spectroscopy; Metabolomics; Huntington Disease
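The idea of using a clustering algorithm to diagnose a single unknown subject can be sketched as follows: cluster the known subjects, then assign the unknown subject to the nearest cluster centroid. This is a minimal illustration of plain (unsupervised) K-means only, under the assumption that nearest-centroid assignment is an acceptable reading of the approach; the abstract does not give the authors' procedure, and the two-dimensional points below are made-up values, not study data.

```python
import math

def kmeans(points, centroids, n_iter=100):
    """Plain Lloyd's algorithm. `centroids` is the caller-supplied starting guess."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for j in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return centroids, labels

def assign(subject, centroids):
    """Index of the centroid closest to a single new subject."""
    return min(range(len(centroids)), key=lambda j: math.dist(subject, centroids[j]))

# Hypothetical two-feature profiles for six known subjects (illustrative only).
known = [(1.0, 1.1), (0.9, 1.0), (1.2, 0.8), (3.0, 3.2), (3.1, 2.9), (2.8, 3.0)]
centroids, labels = kmeans(known, centroids=[known[0], known[3]])
unknown_label = assign((2.9, 3.1), centroids)
print(labels, unknown_label)
```

Passing the starting centroids explicitly keeps the sketch deterministic; a practical implementation would typically use several random restarts.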
Principal component analysis (PCA) is a data analysis method that can deal with large volumes of data. Owing to the complexity and volume of the data generated by today's advanced technologies in genomics, proteomics, and metabolomics, PCA has become predominant in the medical sciences. Despite its popularity, PCA leaves much to be desired in terms of accuracy and may not be suitable for certain medical applications, such as diagnostics, where accuracy is paramount. In this study, we introduced a new PCA method, one that is carefully supervised by receiver operating characteristic (ROC) curve analysis. In order to assess its performance with respect to its ability to render an accurate differential diagnosis, and to compare its performance with that of standard PCA, we studied the striatal metabolomic profile of R6/2 Huntington disease (HD) transgenic mice, as well as that of wild type (WT) mice, using high field in vivo proton nuclear magnetic resonance (NMR) spectroscopy (9.4 Tesla). We tested both the standard PCA and our ROC-supervised PCA (using in each case both the covariance and the correlation matrix), 1) with the original R6/2 HD mice and WT mice, 2) with unknown mice, whose status had been determined via genotyping, and 3) with the ability to separate the original R6/2 mice into the two age subgroups (8 and 12 wks old). Only our ROC-supervised PCA (both with the covariance and the correlation matrix) passed all tests with a total accuracy of 100%, thus providing evidence that it may be used for diagnostic purposes.
Diagnostic methods; principal component analysis; receiver operating characteristic (ROC) curve analysis; metabolomics; nuclear magnetic resonance spectroscopy; Huntington disease
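The abstract above does not say how ROC curve analysis supervises the PCA. One plausible reading, shown here purely as an assumption, is that each variable is first screened by its area under the ROC curve (AUC) and only strongly discriminating variables are passed on to PCA. The sketch below covers that screening step only; the function names, the 0.9 cutoff, and the metabolite labels `m1` and `m2` are all hypothetical, and the values are invented for illustration.

```python
def auc(pos, neg):
    """AUC as the probability that a random `pos` value exceeds a random
    `neg` value, counting ties as one half (Mann-Whitney formulation)."""
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))

def select_features(group_a, group_b, threshold=0.9):
    """Keep features that separate the groups in either direction.

    `group_a` and `group_b` map a feature name to that group's values.
    The threshold is an illustrative assumption, not a value from the source.
    """
    kept = []
    for name in group_a:
        a = auc(group_a[name], group_b[name])
        if max(a, 1 - a) >= threshold:
            kept.append(name)
    return kept

# Invented concentrations for two hypothetical metabolites in two groups.
group_a = {"m1": [5.0, 6.0, 7.0], "m2": [1.0, 4.0, 2.0]}
group_b = {"m1": [1.0, 2.0, 3.0], "m2": [3.0, 1.0, 5.0]}
selected = select_features(group_a, group_b)
print(selected)
```

Only the variables returned by `select_features` would then enter a standard PCA on either the covariance or the correlation matrix.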
Ovarian cancer is a clinically and molecularly heterogeneous disease. The driving forces behind this variability are unknown. Here we report wide variation in expression of the DNA cytosine deaminase APOBEC3B, with elevated expression in a majority of ovarian cancer cell lines (3 standard deviations above the mean of normal ovarian surface epithelial cells) and high grade primary ovarian cancers. APOBEC3B is active in the nucleus of several ovarian cancer cell lines and elicits a biochemical preference for deamination of cytosines in 5′TC dinucleotides. Importantly, examination of whole-genome sequence from 16 ovarian cancers reveals that APOBEC3B expression correlates with total mutation load as well as elevated levels of transversion mutations. In particular, high APOBEC3B expression correlates with C-to-A and C-to-G transversion mutations within 5′TC dinucleotide motifs in early-stage high grade serous ovarian cancer genomes, suggesting that APOBEC3B-catalyzed genomic uracil lesions are further processed by downstream DNA ‘repair’ enzymes including error-prone translesion polymerases. These data identify a potential role for APOBEC3B in serous ovarian cancer genomic instability.
APOBEC3B; DNA cytosine deamination; genomic uracil; ovarian cancer; transversion mutations
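The mutation classes discussed in the abstract above (C-to-A and C-to-G transversions at cytosines preceded by a 5′ T) can be tallied with a few lines of code. This is a minimal sketch, not the authors' pipeline; it assumes each mutation has already been expressed on the strand where the reference base is a pyrimidine, and the tuple layout `(upstream_base, ref, alt)` and the example records are hypothetical.

```python
PURINES = {"A", "G"}

def is_transversion(ref, alt):
    """A transversion swaps a purine for a pyrimidine or vice versa."""
    return (ref in PURINES) != (alt in PURINES)

def count_tc_transversions(mutations):
    """Count C>A and C>G changes at cytosines in a 5'-TC context.

    `mutations` is an iterable of (upstream_base, ref, alt) tuples, assumed
    to be normalized to the pyrimidine strand before calling.
    """
    counts = {"C>A": 0, "C>G": 0}
    for upstream, ref, alt in mutations:
        if upstream == "T" and ref == "C" and is_transversion(ref, alt):
            counts[f"C>{alt}"] += 1
    return counts

# Invented example records (not data from the study).
example = [
    ("T", "C", "A"),  # TC context, transversion
    ("T", "C", "G"),  # TC context, transversion
    ("T", "C", "T"),  # TC context, but a C-to-T transition
    ("A", "C", "G"),  # transversion outside the TC context
    ("T", "C", "G"),  # TC context, transversion
]
tc_counts = count_tc_transversions(example)
print(tc_counts)
```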
Multiple mutations are required for cancer development, and genome sequencing has revealed that several cancers, including breast, have somatic mutation spectra dominated by C-to-T transitions [1–9]. Most of these mutations occur at hydrolytically disfavored [10] non-methylated cytosines throughout the genome, and are sometimes clustered [8]. Here, we show that the DNA cytosine deaminase APOBEC3B (A3B) is a likely source of these mutations. A3B mRNA is up-regulated in the majority of primary breast tumors and breast cancer cell lines. Tumors that express high levels of A3B have twice as many mutations as those that express low levels and are more likely to have mutations in TP53. Endogenous A3B protein is predominantly nuclear and the only detectable source of DNA C-to-U editing activity in breast cancer cell line extracts. Knockdown experiments show that endogenous A3B correlates with elevated levels of genomic uracil, increased mutation frequencies, and C-to-T transitions. Furthermore, induced A3B over-expression causes cell cycle deviations, cell death, DNA fragmentation, γ-H2AX accumulation, and C-to-T mutations. Our data suggest a model in which A3B-catalyzed deamination provides a chronic source of DNA damage in breast cancers that could select TP53 inactivation and explain how some tumors evolve rapidly and manifest heterogeneity.
Memory and learning declines are consequences of normal aging. Since those functions are associated with the hippocampus, I analyzed the global gene expression data from post-mortem hippocampal tissue of 25 old (age ≥ 60 yrs) and 15 young (age ≤ 45 yrs) cognitively intact human subjects. By employing a rigorous, multi-method bioinformatic approach, I identified 36 genes that were the most significant in terms of differential expression; and by employing mathematical modeling, I demonstrated that 7 of the 36 genes were able to discriminate between the old and young subjects with high accuracy. Remarkably, 90% of the known genes from those 36 most significant genes are associated with either inflammation or immune system activation. This suggests that chronic inflammation and immune system over-activity may underlie the aging process of the human brain, and that potential anti-inflammatory treatments targeting those genes may slow down this process and alleviate its symptoms.
Among women in the USA, breast cancer has the highest annual incidence rate of any cancer and is the second leading cause of cancer mortality. There are currently no biomarkers available that can predict which breast cancer patients will respond to chemotherapy with both sensitivity and specificity > 80%, as mandated by the latest FDA requirements. In this study, we have developed a prognostic biomarker model (complex mathematical function) that, based on global gene expression analysis of tumor tissue collected during biopsy and prior to the commencement of chemotherapy, can identify with high accuracy those patients with breast cancer (clinical stages I–III) who will respond to the paclitaxel-fluorouracil-doxorubicin-cyclophosphamide chemotherapy and will experience pathological complete response (Responders), as well as those breast cancer patients (clinical stages I–III) who will not do so (Non-Responders). Most importantly, both the application and the accuracy of our breast cancer prognostic biomarker model are independent of the status of the hormone receptors ER, PR, and HER2, as well as of the ethnicity and age of the subjects. We developed our prognostic biomarker model with 50 subjects [10 responders (R) and 40 non-responders (NR)], and we validated it with 43 unknown (new and different) subjects [10 responders (R) and 33 non-responders (NR)]. All 93 subjects were recruited at five different clinical centers around the world. The overall sensitivity and specificity of our prognostic biomarker model were 90.0% and 91.8%, respectively. The nine most significant genes identified, which comprise the input variables to the mathematical function, are involved in regulation of transcription; cell proliferation, invasion, and migration; oncogenesis; suppression of immune response; and drug resistance and cancer recurrence.
breast cancer; biomarkers; prognostic biomarker models; treatment response; global gene expression analysis; systems biology
Early detection (localized stage) of colon cancer is associated with a five-year survival rate of 91%. Only 39% of colon cancers, however, are diagnosed at that early stage. Early and accurate diagnosis, therefore, constitutes a critical need and a decisive factor in the clinical treatment of colon cancer and its success. In this study, using supervised linear discriminant analysis, we have developed three diagnostic biomarker models that—based on global micro-RNA expression analysis of colonic tissue collected during surgery—can discriminate with a perfect accuracy between subjects with colon cancer (stages II–IV) and normal healthy subjects. We developed our three diagnostic biomarker models with 57 subjects [40 with colon cancer (stages II–IV) and 17 normal], and we validated them with 39 unknown (new and different) subjects [28 with colon cancer (stages II–IV) and 11 normal]. For all three diagnostic models, both the overall sensitivity and specificity were 100%. The nine most significant micro-RNAs identified, which comprise the input variables to the three linear discriminant functions, are associated with genes that regulate oncogenesis, and they play a paramount role in the development of colon cancer, as evidenced in the tumor tissue itself. This could have a significant impact in the fight against this disease, in that it may lead to the development of an early serum or blood diagnostic test based on the detection of those nine key micro-RNAs.
colon cancer; ROC-supervised linear discriminant analysis; biomarkers; diagnostic biomarker models; global micro-RNA expression analysis; systems biology
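A linear discriminant function of the kind mentioned in the colon cancer abstract above assigns a subject to a class according to which side of a linear boundary its expression profile falls on. The following is a minimal two-feature Fisher discriminant with equal priors, offered as a sketch of the general technique rather than the authors' nine-variable models; all function names are hypothetical and the expression values are invented for illustration.

```python
def mean_vec(rows):
    """Component-wise mean of a list of 2-feature rows."""
    n = len(rows)
    return [sum(r[i] for r in rows) / n for i in range(2)]

def pooled_cov(a, b):
    """Pooled within-class 2x2 covariance matrix of classes a and b."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for rows in (a, b):
        m = mean_vec(rows)
        for r in rows:
            d = [r[0] - m[0], r[1] - m[1]]
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
    dof = len(a) + len(b) - 2
    return [[s[i][j] / dof for j in range(2)] for i in range(2)]

def fit_lda(a, b):
    """Return weights w and cutoff c; score = w . x, class a if score > c."""
    ma, mb = mean_vec(a), mean_vec(b)
    s = pooled_cov(a, b)
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    inv = [[s[1][1] / det, -s[0][1] / det], [-s[1][0] / det, s[0][0] / det]]
    diff = [ma[0] - mb[0], ma[1] - mb[1]]
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    # Equal priors assumed: the cutoff is the score of the midpoint of the means.
    mid = [(ma[0] + mb[0]) / 2, (ma[1] + mb[1]) / 2]
    c = w[0] * mid[0] + w[1] * mid[1]
    return w, c

def predict(x, w, c):
    return "cancer" if w[0] * x[0] + w[1] * x[1] > c else "normal"

# Invented expression levels of two hypothetical micro-RNAs (not study data).
cancer = [(5.0, 1.0), (6.0, 1.5), (5.5, 0.5), (6.5, 1.2)]
normal = [(1.0, 4.0), (1.5, 5.0), (2.0, 4.5), (0.5, 5.5)]
w, c = fit_lda(cancer, normal)
print(predict((6.0, 1.0), w, c), predict((1.0, 5.0), w, c))
```

The same construction extends to nine input variables by replacing the explicit 2x2 inverse with a general linear solve.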
Following initial standard chemotherapy (platinum/taxol), more than 75% of those patients with advanced stage epithelial ovarian cancer (EOC) experience a recurrence. There are currently no accurate prognostic tests that, at the time of the diagnosis/surgery, can identify those patients with advanced stage EOC who will respond to chemotherapy. Using a novel mathematical theory, we have developed three prognostic biomarker models (complex mathematical functions) that—based on a global gene expression analysis of tumor tissue collected during surgery and prior to the commencement of chemotherapy—can identify with a high accuracy those patients with advanced stage EOC who will respond to the standard chemotherapy [long-term survivors (>7 yrs)] and those who will not do so [short-term survivors (<3 yrs)]. Our three prognostic biomarker models were developed with 34 subjects and validated with 20 unknown (new and different) subjects. Both the overall biomarker model sensitivity and specificity ranged from 95.83% to 100.00%. The 12 most significant genes identified, which are also the input variables to the three mathematical functions, constitute three distinct gene networks with the following functions: 1) production of cytoskeletal components, 2) cell proliferation, and 3) cell energy production. The first gene network is directly associated with the mechanism of action of anti-tubulin chemotherapeutic agents, such as taxanes and epothilones. This could have a significant impact in the discovery of new, more effective pharmacological treatments that may significantly extend the survival of patients with advanced stage EOC.
ovarian cancer; biomarkers; mathematical models; prognostic biomarker models; treatment response; survival; global gene expression analysis
Current treatment for Duchenne Muscular Dystrophy (DMD) is chronic administration of the glucocorticoid prednisolone. Prednisolone improves muscle strength in boys with DMD, but the mechanism is unknown. The purpose of this study was to determine how prednisolone improves muscle strength by examining muscle contractility in dystrophic mice over time and in conjunction with eccentric injury. Mdx mice began receiving prednisolone (n=23) or placebo (n=16) at 5 wks of age. Eight wks of prednisolone increased specific force of the extensor digitorum longus (EDL) muscle by 26%, but other parameters of contractility were not affected. Prednisolone also improved the histological appearance of muscle by decreasing the number of centrally-nucleated fibers. Prednisolone treatment did not affect force loss during eccentric contractions or recovery of force following injury. These data are of clinical relevance, because the increase in muscle strength in boys with DMD taking prednisolone does not appear to occur via the same mechanism in dystrophic mice.
Duchenne Muscular Dystrophy; skeletal muscle function; glucocorticoids