Results 1-5 (5)
1.  Using cross-validation to evaluate predictive accuracy of survival risk classifiers based on high-dimensional data 
Briefings in Bioinformatics  2011;12(3):203-214.
Developments in whole genome biotechnology have stimulated statistical focus on prediction methods. We review here methodology for classifying patients into survival risk groups and for using cross-validation to evaluate such classifications. Measures of discrimination for survival risk models include separation of survival curves, time-dependent ROC curves and Harrell’s concordance index. For high-dimensional data applications, however, computing these measures as re-substitution statistics on the same data used for model development results in highly biased estimates. Most developments in methodology for survival risk modeling with high-dimensional data have utilized separate test data sets for model evaluation. Cross-validation has sometimes been used for optimization of tuning parameters. In many applications, however, the data available are too limited for effective division into training and test sets and consequently authors have often either reported re-substitution statistics or analyzed their data using binary classification methods in order to utilize familiar cross-validation. In this article we have tried to indicate how to utilize cross-validation for the evaluation of survival risk models; specifically how to compute cross-validated estimates of survival distributions for predicted risk groups and how to compute cross-validated time-dependent ROC curves. We have also discussed evaluation of the statistical significance of a survival risk model and evaluation of whether high-dimensional genomic data adds predictive accuracy to a model based on standard covariates alone.
PMCID: PMC3105299  PMID: 21324971
predictive medicine; survival risk classification; cross-validation; gene expression
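
As a rough illustration of the evaluation idea summarized in this abstract, the following sketch computes Harrell's concordance index from cross-validated risk scores rather than re-substitution scores. It is a minimal sketch and not the authors' implementation: the `fit` callable, the interleaved fold assignment, and the handling of tied survival times (skipped) are all assumptions made for illustration.

```python
from itertools import combinations


def kfold_indices(n, k):
    """Split indices 0..n-1 into k interleaved folds (illustrative choice)."""
    return [list(range(f, n, k)) for f in range(k)]


def cross_validated_scores(rows, times, events, fit, k=5):
    """Score each subject with a model fitted WITHOUT that subject's fold.

    `fit` is a hypothetical user-supplied callable:
        fit(train_rows, train_times, train_events) -> scoring function
    Any feature selection or tuning must happen inside `fit`, so that it is
    repeated from scratch within every training fold.
    """
    n = len(rows)
    scores = [None] * n
    for test in kfold_indices(n, k):
        held_out = set(test)
        train = [i for i in range(n) if i not in held_out]
        model = fit(
            [rows[i] for i in train],
            [times[i] for i in train],
            [events[i] for i in train],
        )
        for i in test:
            scores[i] = model(rows[i])
    return scores


def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored survival data.

    A pair is usable only if the subject with the shorter follow-up time
    had an observed event. Higher risk score should mean shorter survival.
    """
    concordant = 0.0
    usable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification: pairs with tied times are skipped
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the pair is not comparable
        usable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied scores count as half-concordant
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable
```

Passing the output of `cross_validated_scores` to `harrell_c_index` gives a concordance estimate in which no subject is scored by a model that saw that subject's data, which is the property the abstract contrasts with re-substitution statistics.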
2.  GeneMed: An Informatics Hub for the Coordination of Next-Generation Sequencing Studies that Support Precision Oncology Clinical Trials 
Cancer Informatics  2015;14(Suppl 2):45-55.
We have developed an informatics system, GeneMed, for the National Cancer Institute (NCI) molecular profiling-based assignment of cancer therapy (MPACT) clinical trial (NCT01827384) being conducted in the National Institutes of Health (NIH) Clinical Center. This trial is one of the first to use a randomized design to examine whether assigning treatment based on genomic tumor screening can improve the rate and duration of response in patients with advanced solid tumors. An analytically validated next-generation sequencing (NGS) assay is applied to DNA from patients’ tumors to identify mutations in a panel of genes that are thought likely to affect the utility of targeted therapies available for use in the clinical trial. The patients are randomized either to a treatment selected to target a somatic mutation in the tumor or to a control treatment. The GeneMed system streamlines the workflow of the clinical trial and serves as a communications hub among the sequencing lab, the treatment selection team, and clinical personnel. It automates the annotation of the genomic variants identified by sequencing, predicts the functional impact of mutations, identifies the actionable mutations, and facilitates quality control by the molecular characterization lab in the review of variants. The GeneMed system collects baseline information about the patients from the clinic team to determine eligibility for the panel of drugs available. The system performs randomized treatment assignments under the oversight of a supervising treatment selection team and generates a patient report containing detected genomic alterations. NCI is planning to expand the MPACT trial to multiple cancer centers soon. In summary, the GeneMed system has been proven to be an efficient and successful informatics hub for coordinating the reliable application of NGS to precision medicine studies.
PMCID: PMC4368061  PMID: 25861217
GeneMed; MPACT; next-generation sequencing; precision medicine; informatics system; clinical trial
3.  Analysis of Gene Expression Data Using BRB-Array Tools 
Cancer Informatics  2007;3:11-17.
BRB-ArrayTools is an integrated software system for the comprehensive analysis of DNA microarray experiments. It was developed by professional biostatisticians experienced in the design and analysis of DNA microarray studies and incorporates methods developed by leading statistical laboratories. The software is designed for use by biomedical scientists who wish to have access to state-of-the-art statistical methods for the analysis of gene expression data and to receive training in the statistical analysis of high dimensional data. The software provides the most extensive set of tools available for predictive classifier development and complete cross-validation. It offers extensive links to genomic websites for gene annotation and analysis tools for pathway analysis. An archive of over 100 datasets of published microarray data with associated clinical data is provided and BRB-ArrayTools automatically imports data from the Gene Expression Omnibus public archive at the National Center for Biotechnology Information.
PMCID: PMC2675854  PMID: 19455231
Bioinformatics; microarrays; gene expression; biostatistics
4.  Histological staining methods preparatory to laser capture microdissection significantly affect the integrity of the cellular RNA 
BMC Genomics  2006;7:97.
Gene expression profiling by microarray analysis of cells enriched by laser capture microdissection (LCM) faces several technical challenges. Frozen sections yield higher quality RNA than paraffin-embedded sections, but even with frozen sections, the staining methods used for histological identification of cells of interest could still damage the mRNA in the cells. To study the contribution of staining methods to degradation of results from gene expression profiling of LCM samples, we subjected pellets of the mouse plasma cell tumor cell line TEPC 1165 to direct RNA extraction and to parallel frozen sectioning for LCM and subsequent RNA extraction. We used microarray hybridization analysis to compare gene expression profiles of RNA from cell pellets with gene expression profiles of RNA from frozen sections that had been stained with hematoxylin and eosin (H&E), with Nissl stain (NS), for immunofluorescence (IF), or with the plasma cell-revealing methyl green pyronin (MGP) stain. All RNAs were amplified with two rounds of T7-based in vitro transcription and analyzed by two-color expression analysis on 10-K cDNA microarrays.
The MGP-stained samples showed the least mRNA loss, followed by H&E and immunofluorescence. Nissl staining was significantly more detrimental to gene expression profiles, presumably owing to an aqueous step in which RNA may have been damaged by endogenous or exogenous RNases.
RNA damage can occur during the staining steps preparatory to laser capture microdissection, with the consequence that certain genes lose representation in microarray hybridization analysis. Including an RNase inhibitor in aqueous staining solutions appears to be important in protecting gene transcripts from loss.
PMCID: PMC1513394  PMID: 16643667
5.  An adaptive method for cDNA microarray normalization 
BMC Bioinformatics  2005;6:28.
Normalization is a critical step in analysis of gene expression profiles. For dual-labeled arrays, global normalization assumes that the majority of the genes on the array are non-differentially expressed between the two channels and that the number of over-expressed genes approximately equals the number of under-expressed genes. These assumptions can be inappropriate for custom arrays or arrays in which the reference RNA is very different from the experimental samples.
We propose a mixture model based normalization method that adaptively identifies non-differentially expressed genes and thereby substantially improves normalization for dual-labeled arrays in settings where the assumptions of global normalization are problematic. The new method is evaluated using both simulated and real data.
The new normalization method is effective for general microarray platforms when samples with very different expression profiles are co-hybridized and for custom arrays where the majority of genes are likely to be differentially expressed.
PMCID: PMC552315  PMID: 15707486
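
For context, the global normalization that this abstract takes as its baseline can be sketched as median-centering of log2 ratios. This is only an illustration of the baseline and its assumptions, not the mixture-model method the authors propose; the function name and the choice of the median as the centering statistic are assumptions.

```python
import math
from statistics import median


def global_median_normalize(red, green):
    """Median-center the log2(red/green) ratios of a dual-labeled array.

    Assumes most genes are not differentially expressed and that over- and
    under-expressed genes roughly balance, so the median log ratio reflects
    only channel bias. Intensities must be positive.
    """
    log_ratios = [math.log2(r / g) for r, g in zip(red, green)]
    shift = median(log_ratios)
    return [m - shift for m in log_ratios]
```

When most genes on the array really are differentially expressed in one direction, the median itself carries biological signal, and subtracting it removes part of the effect of interest. That is the failure mode the abstract describes for custom arrays and for very different reference RNA.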