1.  Intra-tumor Genetic Heterogeneity and Mortality in Head and Neck Cancer: Analysis of Data from The Cancer Genome Atlas 
PLoS Medicine  2015;12(2):e1001786.
Although the involvement of intra-tumor genetic heterogeneity in tumor progression, treatment resistance, and metastasis is established, genetic heterogeneity is seldom examined in clinical trials or practice. Many studies of heterogeneity have had prespecified markers for tumor subpopulations, limiting their generalizability, or have involved massive efforts such as separate analysis of hundreds of individual cells, limiting their clinical use. We recently developed a general measure of intra-tumor genetic heterogeneity based on whole-exome sequencing (WES) of bulk tumor DNA, called mutant-allele tumor heterogeneity (MATH). Here, we examine data collected as part of a large, multi-institutional study to validate this measure and determine whether intra-tumor heterogeneity is itself related to mortality.
Methods and Findings
Clinical and WES data were obtained from The Cancer Genome Atlas in October 2013 for 305 patients with head and neck squamous cell carcinoma (HNSCC), from 14 institutions. Initial pathologic diagnoses were between 1992 and 2011 (median, 2008). Median time to death for 131 deceased patients was 14 mo; median follow-up of living patients was 22 mo. Tumor MATH values were calculated from WES results. Despite the multiple head and neck tumor subsites and the variety of treatments, we found in this retrospective analysis a substantial relation of high MATH values to decreased overall survival (Cox proportional hazards analysis: hazard ratio for high/low heterogeneity, 2.2; 95% CI 1.4 to 3.3). This relation of intra-tumor heterogeneity to survival was not due to intra-tumor heterogeneity’s associations with other clinical or molecular characteristics, including age, human papillomavirus status, tumor grade and TP53 mutation, and N classification. MATH improved prognostication over that provided by traditional clinical and molecular characteristics, maintained a significant relation to survival in multivariate analyses, and distinguished outcomes among patients having oral-cavity or laryngeal cancers even when standard disease staging was taken into account. Prospective studies, however, will be required before MATH can be used prognostically in clinical trials or practice. Such studies will need to examine homogeneously treated HNSCC at specific head and neck subsites, and determine the influence of cancer therapy on MATH values. Analysis of MATH and outcome in human-papillomavirus-positive oropharyngeal squamous cell carcinoma is particularly needed.
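The abstract does not spell out how a MATH value is obtained from WES results. The following is a minimal sketch, assuming the definition given in the authors' earlier description of the measure: 100 times the ratio of the median absolute deviation (MAD, scaled by 1.4826) to the median of the mutant-allele fractions at tumor-specific mutated loci. The function name and the example fractions are hypothetical and are not taken from the study.

```python
from statistics import median

def math_score(mutant_allele_fractions):
    """Assumed MATH definition: 100 * MAD / median of mutant-allele fractions.

    MAD is the median absolute deviation from the median, scaled by 1.4826
    so that it estimates the standard deviation for normally distributed data.
    """
    fractions = list(mutant_allele_fractions)
    if not fractions:
        raise ValueError("at least one mutant-allele fraction is required")
    center = median(fractions)
    mad = 1.4826 * median(abs(f - center) for f in fractions)
    return 100.0 * mad / center

# Hypothetical tumors: identical fractions give 0, spread-out fractions give more.
homogeneous = [0.30, 0.30, 0.30, 0.30, 0.30]
heterogeneous = [0.10, 0.20, 0.30, 0.40, 0.50]
print(math_score(homogeneous), round(math_score(heterogeneous), 2))
```

A wider spread of allele fractions around the median raises the score, which is the sense in which a single bulk-tumor value can stand in for the presence of multiple subclones.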
To our knowledge this study is the first to combine data from hundreds of patients, treated at multiple institutions, to document a relation between intra-tumor heterogeneity and overall survival in any type of cancer. We suggest applying the simply calculated MATH metric of heterogeneity to prospective studies of HNSCC and other tumor types.
In this study, Rocco and colleagues examine data collected as part of a large, multi-institutional study, to validate a measure of tumor heterogeneity called MATH and determine whether intra-tumor heterogeneity is itself related to mortality.
Editors’ Summary
Normally, the cells in human tissues and organs only reproduce (a process called cell division) when new cells are needed for growth or to repair damaged tissues. But sometimes a cell somewhere in the body acquires a genetic change (mutation) that disrupts the control of cell division and allows the cell to grow continuously. As the mutated cell grows and divides, it accumulates additional mutations that allow it to grow even faster and eventually form a lump, or tumor (cancer). Other mutations subsequently allow the tumor to spread around the body (metastasize) and destroy healthy tissues. Tumors can arise anywhere in the body (there are more than 200 different types of cancer), and about one in three people will develop some form of cancer during their lifetime. Many cancers can now be successfully treated, however, and people often survive for years after a diagnosis of cancer before, eventually, dying from another disease.
Why Was This Study Done?
The gradual acquisition of mutations by tumor cells leads to the formation of subpopulations of cells, each carrying a different set of mutations. This “intra-tumor heterogeneity” can produce tumor subclones that grow particularly quickly, that metastasize aggressively, or that are resistant to cancer treatments. Consequently, researchers have hypothesized that high intra-tumor heterogeneity leads to worse clinical outcomes and have suggested that a simple measure of this heterogeneity would be a useful addition to the cancer staging system currently used by clinicians for predicting the likely outcome (prognosis) of patients with cancer. Here, the researchers investigate whether a measure of intra-tumor heterogeneity called “mutant-allele tumor heterogeneity” (MATH) is related to mortality (death) among patients with head and neck squamous cell carcinoma (HNSCC)—cancers that begin in the cells that line the moist surfaces inside the head and neck, such as cancers of the mouth and the larynx (voice box). MATH is based on whole-exome sequencing (WES) of tumor and matched normal DNA. WES uses powerful DNA-sequencing systems to determine the variations of all the coding regions (exons) of the known genes in the human genome (genetic blueprint).
What Did the Researchers Do and Find?
The researchers obtained clinical and WES data for 305 patients who were treated in 14 institutions, primarily in the US, after diagnosis of HNSCC from The Cancer Genome Atlas, a catalog established by the US National Institutes of Health to map the key genomic changes in major types and subtypes of cancer. They calculated tumor MATH values for the patients from their WES results and retrospectively analyzed whether there was an association between the MATH values and patient survival. Despite the patients having tumors at various subsites and being given different treatments, every 10% increase in MATH value corresponded to an 8.8% increased risk (hazard) of death. Using a previously defined MATH-value cutoff to distinguish high- from low-heterogeneity tumors, compared to patients with low-heterogeneity tumors, patients with high-heterogeneity tumors were more than twice as likely to die (a hazard ratio of 2.2). Other statistical analyses indicated that MATH provided improved prognostic information compared to that provided by established clinical and molecular characteristics and human papillomavirus (HPV) status (HPV-positive HNSCC at some subsites has a better prognosis than HPV-negative HNSCC). In particular, MATH provided prognostic information beyond that provided by standard disease staging among patients with mouth or laryngeal cancers.
What Do These Findings Mean?
By using data from more than 300 patients treated at multiple institutions, these findings validate the use of MATH as a measure of intra-tumor heterogeneity in HNSCC. Moreover, they provide one of the first large-scale demonstrations that intra-tumor heterogeneity is clinically important in the prognosis of any type of cancer. Before the MATH metric can be used in clinical trials or in clinical practice as a prognostic tool, its ability to predict outcomes needs to be tested in prospective studies that examine the relation between MATH and the outcomes of patients with identically treated HNSCC at specific head and neck subsites, that evaluate the use of MATH for prognostication in other tumor types, and that determine the influence of cancer treatments on MATH values. Nevertheless, these findings suggest that MATH should be considered as a biomarker for survival in HNSCC and other tumor types, and raise the possibility that clinicians could use MATH values to decide on the best treatment for individual patients and to choose patients for inclusion in clinical trials.
PMCID: PMC4323109  PMID: 25668320
2.  Survival-Related Profile, Pathways, and Transcription Factors in Ovarian Cancer 
PLoS Medicine  2009;6(2):e1000024.
Ovarian cancer has a poor prognosis due to advanced stage at presentation and either intrinsic or acquired resistance to classic cytotoxic drugs such as platinum and taxoids. Recent large clinical trials with different combinations and sequences of classic cytotoxic drugs indicate that further significant improvement in prognosis with this type of drug is not to be expected. Currently, a large number of drugs targeting dysregulated molecular pathways in cancer cells have been developed and are being introduced in the clinic. A major challenge is to identify those patients who will benefit from drugs targeting these specific dysregulated pathways. The aims of our study were (1) to develop a gene expression profile associated with overall survival in advanced stage serous ovarian cancer, (2) to assess the association of pathways and transcription factors with overall survival, and (3) to validate our identified profile and pathways/transcription factors in an independent set of ovarian cancers.
Methods and Findings
According to a randomized design, profiling of 157 advanced stage serous ovarian cancers was performed in duplicate using ∼35,000 70-mer oligonucleotide microarrays. A continuous predictor of overall survival was built taking into account well-known issues in microarray analysis, such as multiple testing and overfitting. A functional class scoring analysis was utilized to assess pathways/transcription factors for their association with overall survival. The prognostic value of genes that constitute our overall survival profile was validated on a fully independent, publicly available dataset of 118 well-defined primary serous ovarian cancers. Furthermore, functional class scoring analysis was also performed on this independent dataset to assess the similarities with results from our own dataset. An 86-gene overall survival profile discriminated between patients with unfavorable and favorable prognosis (median survival, 19 versus 41 mo, respectively; permutation p-value of log-rank statistic = 0.015) and maintained its independent prognostic value in multivariate analysis. Genes that composed the overall survival profile were also able to discriminate between the two risk groups in the independent dataset. In our dataset 17/167 pathways and 13/111 transcription factors were associated with overall survival, of which 16 and 12, respectively, were confirmed in the independent dataset.
Our study provides new clues to genes, pathways, and transcription factors that contribute to the clinical outcome of serous ovarian cancer and might be exploited in designing new treatment strategies.
Ate van der Zee and colleagues analyze the gene expression profiles of ovarian cancer samples from 157 patients, and identify an 86-gene expression profile that seems to predict overall survival.
Editors' Summary
Ovarian cancer kills more than 100,000 women every year and is one of the most frequent causes of cancer death in women in Western countries. Most ovarian cancers develop when an epithelial cell in one of the ovaries (two small organs in the pelvis that produce eggs) acquires genetic changes that allow it to grow uncontrollably and to spread around the body (metastasize). In its early stages, ovarian cancer is confined to the ovaries and can often be treated successfully by surgery alone. Unfortunately, early ovarian cancer rarely has symptoms so a third of women with ovarian cancer have advanced disease when they first visit their doctor with symptoms that include vague abdominal pains and mild digestive disturbances. That is, cancer cells have spread into their abdominal cavity and metastasized to other parts of the body (so-called stage III and IV disease). The outlook for women diagnosed with stage III and IV disease, which are treated with a combination of surgery and chemotherapy, is very poor. Only 30% of women with stage III, and 5% with stage IV, are still alive five years after their cancer is diagnosed.
Why Was This Study Done?
If the cellular pathways that determine the biological behavior of ovarian cancer could be identified, it might be possible to develop more effective treatments for women with stage III and IV disease. One way to identify these pathways is to use gene expression profiling (a technique that catalogs all the genes expressed by a cell) to compare gene expression patterns in the ovarian cancers of women who survive for different lengths of time. Genes with different expression levels in tumors with different outcomes could be targets for new treatments. For example, it might be worth developing inhibitors of proteins whose expression is greatest in tumors with short survival times. In this study, the researchers develop an expression profile that is associated with overall survival in advanced-stage serous ovarian cancer (more than half of ovarian cancers originate in serous cells, epithelial cells that secrete a watery fluid). The researchers also assess the association of various cellular pathways and transcription factors (proteins that control the expression of other proteins) with survival in this type of ovarian carcinoma.
What Did the Researchers Do and Find?
The researchers analyzed the gene expression profiles of tumor samples taken from 157 patients with advanced stage serous ovarian cancer and used the “supervised principal components” method to build a predictor of overall survival from these profiles and patient survival times. This 86-gene predictor discriminated between patients with favorable and unfavorable outcomes (average survival times of 41 and 19 months, respectively). It also discriminated between groups of patients with these two outcomes in an independent dataset collected from 118 additional serous ovarian cancers. Next, the researchers used “functional class scoring” analysis to assess the association between pathway and transcription factor expression in the tumor samples and overall survival. Seventeen of 167 KEGG pathways (“wiring” diagrams of molecular interactions, reactions and relations involved in cellular processes and human diseases listed in the Kyoto Encyclopedia of Genes and Genomes) were associated with survival, 16 of which were confirmed in the independent dataset. Finally, 13 of 111 analyzed transcription factors were associated with overall survival in the tumor samples, 12 of which were confirmed in the independent dataset.
What Do These Findings Mean?
These findings identify an 86-gene overall survival gene expression profile that seems to predict overall survival for women with advanced serous ovarian cancer. However, before this profile can be used clinically, further validation of the profile and more robust methods for determining gene expression profiles are needed. Importantly, these findings also provide new clues about the genes, pathways and transcription factors that contribute to the clinical outcome of serous ovarian cancer, clues that can now be exploited in the search for new treatment strategies. Finally, these findings suggest that it might eventually be possible to tailor therapies to the needs of individual patients by analyzing which pathways are activated in their tumors and thus improve survival times for women with advanced ovarian cancer.
PMCID: PMC2634794  PMID: 19192944
3.  Rank Order Entropy: why one metric is not enough 
The use of Quantitative Structure-Activity Relationship models to address problems in drug discovery has a mixed history, generally resulting from the mis-application of QSAR models that were either poorly constructed or used outside of their domains of applicability. This situation has motivated the development of a variety of model performance metrics (r², PRESS r², F-tests, etc.) designed to increase user confidence in the validity of QSAR predictions. In a typical workflow scenario, QSAR models are created and validated on training sets of molecules using metrics such as Leave-One-Out or many-fold cross-validation methods that attempt to assess their internal consistency. However, few current validation methods are designed to directly address the stability of QSAR predictions in response to changes in the information content of the training set. Since the main purpose of QSAR is to quickly and accurately estimate a property of interest for an untested set of molecules, it makes sense to have a means at hand to correctly set user expectations of model performance. In fact, the numerical value of a molecular prediction is often less important to the end user than knowing the rank order of that set of molecules according to their predicted endpoint values. Consequently, a means for characterizing the stability of predicted rank order is an important component of predictive QSAR. Unfortunately, none of the many validation metrics currently available directly measures the stability of rank order prediction, making the development of an additional metric that can quantify model stability a high priority. To address this need, this work examines the stabilities of QSAR rank order models created from representative data sets, descriptor sets, and modeling methods that were then assessed using Kendall Tau as a rank order metric, upon which the Shannon Entropy was evaluated as a means of quantifying rank-order stability.
Random removal of data from the training set, also known as Data Truncation Analysis (DTA), was used as a means for systematically reducing the information content of each training set while examining both rank order performance and rank order stability in the face of training set data loss. The premise for DTA ROE model evaluation is that the response of a model to incremental loss of training information will be indicative of the quality and sufficiency of its training set, learning method, and descriptor types to cover a particular domain of applicability.
This process is termed a “rank order entropy” evaluation, or ROE. By analogy with information theory, an unstable rank order model displays a high level of implicit entropy, while a QSAR rank order model which remains nearly unchanged during training set reductions would show low entropy. In this work, the ROE metric was applied to 71 data sets of different sizes, and was found to reveal more information about the behavior of the models than traditional metrics alone. Stable, or consistently performing models, did not necessarily predict rank order well. Models that performed well in rank order did not necessarily perform well in traditional metrics. In the end, it was shown that ROE metrics suggested that some QSAR models that are typically used should be discarded. ROE evaluation helps to discern which combinations of data set, descriptor set, and modeling methods lead to usable models in prioritization schemes, and provides confidence in the use of a particular model within a specific domain of applicability.
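The combination of Kendall Tau and Shannon Entropy described above can be illustrated with a short sketch. This is not the authors' implementation: it assumes that each truncated training set yields one Kendall Tau value (predicted versus reference ordering) and that stability is summarized as the Shannon entropy of the binned Tau values. The function names, the bin count, and the example values are hypothetical.

```python
import math
from itertools import combinations

def kendall_tau(a, b):
    """Kendall tau-a between two equal-length score lists (tied pairs add zero)."""
    n = len(a)
    if n != len(b) or n < 2:
        raise ValueError("need two lists of equal length >= 2")
    total = 0
    for i, j in combinations(range(n), 2):
        prod = (a[i] - a[j]) * (b[i] - b[j])
        total += (prod > 0) - (prod < 0)   # +1 concordant, -1 discordant
    return total / (n * (n - 1) / 2)

def shannon_entropy(taus, bins=10, lo=-1.0, hi=1.0):
    """Shannon entropy (bits) of tau values binned over [lo, hi]."""
    counts = [0] * bins
    for t in taus:
        idx = min(int((t - lo) / (hi - lo) * bins), bins - 1)
        counts[idx] += 1
    n = len(taus)
    return -sum(c / n * math.log2(c / n) for c in counts if c)

# Hypothetical tau values from repeated training-set truncations.
stable_model = [0.82, 0.85, 0.88, 0.81, 0.86]     # all fall in one bin
unstable_model = [0.85, 0.35, -0.15, 0.55, 0.05]  # spread over five bins
print(shannon_entropy(stable_model), shannon_entropy(unstable_model))
```

Under these assumptions, a model whose rank order barely changes as data are removed concentrates its Tau values in one bin and scores low entropy, while a model whose ordering shifts from run to run scores high entropy, regardless of how good the average Tau is.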
PMCID: PMC3428235  PMID: 21875058
4.  A Biosurveillance-driven Home Score to Guide Strep Pharyngitis Treatment 
1. To derive and validate an accurate clinical prediction model (“home score”) to estimate a patient’s risk of group A streptococcal (GAS) pharyngitis before a health care visit based only on history and real-time local biosurveillance, and to compare its accuracy to traditional clinical prediction models composed of history and physical exam features. 2. To examine the impact of a home score on patient and public health outcomes.
GAS pharyngitis affects hundreds of millions of individuals globally each year, and over 12 million seek care in the United States annually for sore throat. Clinicians cannot differentiate GAS from other causes of acute pharyngitis based on the oropharynx exam, so consensus guidelines recommend use of clinical scores to classify GAS risk and guide management of adults with acute pharyngitis. When the clinical score is low, consensus guidelines agree patients should neither be tested nor treated for GAS. A prediction model that could identify very-low risk patients prior to an ambulatory visit could reduce low-yield, unnecessary visits for a most common outpatient condition. We recently showed that real-time biosurveillance can further identify patients at low-risk of GAS. With increasing emphasis on patient-centric health care and the well-documented barriers impeding clinicians’ incorporation of prediction models into medical practice, this presents an opportunity to create a patient-centric model for GAS pharyngitis based on history and recent local epidemiology. We refer to this model as the “home score,” because it is designed for use prior to a physical exam.
We analyzed data collected from 110,208 patients 3 years and older who presented with pharyngitis to a national retail health chain from 2006–08. Practitioners collected standardized historical and physical exam information based on algorithm-driven care, and all patients with pharyngitis were tested for GAS. We used a previously validated biosurveillance variable reflecting disease incidence called recent local proportion positive (RLPP), which represents the proportion of patients who tested GAS positive in a local market in the previous 14 days. To derive the “home score,” candidate variables were restricted to demographic factors, historical items and the RLPP, while physical exam variables (such as exudate) were excluded. Multivariate analytic techniques were used to identify predictors of GAS. For each home score (0–100), we calculated the percent of patients who tested positive, and we examined the relationship between the home score and GAS positivity. Standard metrics (sensitivity, specificity, positive and negative predictive value, and AUC) were used to compare the performance of the home score to standard scores. We computed the number of patients aged ≥15 years who, according to the home score, were at low risk for GAS, and therefore might avoid or delay a trip to a medical provider. Outcomes included the number of visits avoided and the number of additional missed GAS cases compared to the standard Centor score approach (do not test/do not treat if Centor score is 0–1). To facilitate comparison across different risk thresholds, we calculated outcomes for hypothetical cohorts of 1000 patients, and extrapolated these findings to estimate the impact on 12 million annual national pharyngitis visits.
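Two of the quantities named above lend themselves to a brief illustration: the RLPP and the standard accuracy metrics. The sketch below assumes that test records are available as (days before today, GAS-positive) pairs for one local market and that a score cutoff has already produced a 2x2 table of counts. The function names and all example numbers are hypothetical, not study data.

```python
def recent_local_proportion_positive(tests, window_days=14):
    """Proportion of GAS-positive tests in the previous `window_days` days.

    `tests` is an iterable of (days_ago, is_positive) pairs for one market.
    Returns None when no test falls inside the window.
    """
    recent = [positive for days_ago, positive in tests
              if 0 < days_ago <= window_days]
    return sum(recent) / len(recent) if recent else None

def diagnostic_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, PPV and NPV from 2x2 table counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }

# Hypothetical records: the test from 20 days ago falls outside the window.
market_tests = [(1, True), (5, False), (14, True), (20, True)]
print(recent_local_proportion_positive(market_tests))
print(diagnostic_metrics(tp=40, fp=10, tn=45, fn=5))
```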
The 3 best predictors were fever (OR 2.43, 95% CI 2.33–2.54), absence of cough (OR 1.71, 95% CI 1.63–1.80) and RLPP (OR 1.04, 95% CI 1.04–1.04 per unit change in RLPP). Using a home score cutoff of 0.10 to identify adults at low risk would save 230,000 ambulatory visits annually while missing only 8500 additional GAS cases. At a 0.20 cutoff, 2.9 million visits would be saved, and 320,000 more cases missed each year. There was a strong correlation between the percent testing positive and the home score (R² = 0.98). As the home score increases, there is a linear increase in the risk of GAS (slope = 1.02). The home score AUC was 0.66, approaching the Centor score (0.69) even without any physical exam information.
A biosurveillance-driven home score to guide treatment of strep pharyngitis could save millions of visits annually by identifying patients in the pre-visit setting who would be unlikely to be tested or treated.
PMCID: PMC3692866
biosurveillance; pharyngitis; retail health
5.  Residual Serum Monoclonal Protein Predicts Progression-Free Survival in Patients With Previously Untreated Multiple Myeloma 
Cancer  2010;116(3):640-646.
Currently used treatment response criteria in multiple myeloma (MM) are based in part on serum monoclonal protein (M-protein) measurements. A drawback of these criteria is that response is determined solely by the best level of M-protein reduction, without considering the serial trend. The authors hypothesized that metrics incorporating the serial trend of M-protein would be better predictors of progression-free survival (PFS).
Fifty-five patients with measurable disease at baseline (M-protein ≥1 g/dL) who received ≥4 cycles of treatment from 2 clinical trials in previously untreated MM were included. Three metrics based on the percentage of M-protein remaining relative to baseline (residual M-protein) were considered: metrics based on the number of times residual M-protein fell within prespecified thresholds, metrics based on area under the residual M-protein curve, and metrics based on the average residual M-protein reduction between Cycles 1 and 4. The predictive value of these metrics was assessed in Cox models using landmark analysis.
The average residual M-protein reduction was found to be significantly predictive of PFS (P = .02; hazard ratio, 0.37), in which a patient with a 10% lower average residual M-protein reduction from Cycle 1 to 4 was estimated to be at least 2.7× more likely to develop disease progression or die early. None of the other metrics was predictive of PFS. The concordance index for the average residual M-protein reduction was 0.63, compared with 0.56 for best response.
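The two figures in the first sentence are related by a reciprocal. As a worked check, and assuming (as the sentence implies) that the hazard ratio of 0.37 is expressed per 10-percentage-point higher average residual M-protein reduction, the hazard ratio for a patient with a 10% lower reduction is the inverse:

```latex
\[
\mathrm{HR}_{+10\%} = 0.37
\quad\Longrightarrow\quad
\mathrm{HR}_{-10\%} = \frac{1}{\mathrm{HR}_{+10\%}} = \frac{1}{0.37} \approx 2.7
\]
```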
The average residual M-protein reduction metric is promising and needs further validation. This exploratory analysis is the first step in the search for treatment-based trend metrics predictive of outcomes in MM.
PMCID: PMC2905541  PMID: 19924791
multiple myeloma; prediction; progression-free survival; response; serum monoclonal protein
6.  Assessing the performance of prediction models: a framework for some traditional and novel measures 
Epidemiology (Cambridge, Mass.)  2010;21(1):128-138.
The performance of prediction models can be assessed using a variety of different methods and metrics. Traditional measures for binary and survival outcomes include the Brier score to indicate overall model performance, the concordance (or c) statistic for discriminative ability (or area under the receiver operating characteristic (ROC) curve), and goodness-of-fit statistics for calibration.
Several new measures have recently been proposed that can be seen as refinements of discrimination measures, including variants of the c statistic for survival, reclassification tables, net reclassification improvement (NRI), and integrated discrimination improvement (IDI). Moreover, decision–analytic measures have been proposed, including decision curves to plot the net benefit achieved by making decisions based on model predictions.
We aimed to define the role of these relatively novel approaches in the evaluation of the performance of prediction models. For illustration we present a case study of predicting the presence of residual tumor versus benign tissue in patients with testicular cancer (n=544 for model development, n=273 for external validation).
We suggest that reporting discrimination and calibration will always be important for a prediction model. Decision-analytic measures should be reported if the predictive model is to be used for making clinical decisions. Other measures of performance may be warranted in specific applications, such as reclassification metrics to gain insight into the value of adding a novel predictor to an established model.
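For a binary outcome, three of the measures discussed above can be sketched in a few lines: the Brier score (overall performance), the c statistic (discrimination), and the net benefit at a chosen risk threshold (the quantity plotted in a decision curve). This is a minimal illustration using the usual textbook definitions, not code from the article. The function names and the four-patient example are hypothetical.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between predicted risk and 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)

def c_statistic(probs, outcomes):
    """Share of (event, non-event) pairs where the event has the higher
    predicted risk; ties count one half. Equals the area under the ROC curve."""
    events = [p for p, y in zip(probs, outcomes) if y == 1]
    non_events = [p for p, y in zip(probs, outcomes) if y == 0]
    score = sum((pe > pn) + 0.5 * (pe == pn)
                for pe in events for pn in non_events)
    return score / (len(events) * len(non_events))

def net_benefit(probs, outcomes, threshold):
    """Net benefit of treating everyone with predicted risk >= threshold:
    TP/n - FP/n * threshold / (1 - threshold)."""
    n = len(probs)
    tp = sum(1 for p, y in zip(probs, outcomes) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, outcomes) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)

# Hypothetical predictions for four patients.
probs = [0.9, 0.8, 0.3, 0.2]
outcomes = [1, 0, 1, 0]
print(brier_score(probs, outcomes),
      c_statistic(probs, outcomes),
      net_benefit(probs, outcomes, threshold=0.5))
```

Calibration is not covered by this sketch; it needs grouped or smoothed comparisons of predicted and observed risk rather than a single pairwise or per-patient summary.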
PMCID: PMC3575184  PMID: 20010215
7.  Comparison of continuous versus categorical tumor measurement-based metrics to predict overall survival in cancer treatment trials 
The categorical definition of response assessed via the Response Evaluation Criteria in Solid Tumors has documented limitations. We sought to identify alternative metrics for tumor response that improve prediction of overall survival.
Experimental Design
Individual patient data from three North Central Cancer Treatment Group trials (N0026, n=117; N9741, n=1109; N9841, n=332) were used. Continuous metrics of tumor size based on longitudinal tumor measurements were considered in addition to a trichotomized response (TriTR: Response vs. Stable vs. Progression). Cox proportional hazards models, adjusted for treatment arm and baseline tumor burden, were used to assess the impact of the metrics on subsequent overall survival, using a landmark analysis approach at 12-, 16- and 24-weeks post baseline. Model discrimination was evaluated using the concordance (c) index.
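The landmark approach and the concordance index can be sketched as follows. The landmark step keeps only patients still under follow-up at the landmark time and measures survival from that point; the c-index shown is Harrell's version, which counts a pair as usable only when the shorter follow-up time ends in an observed death. The trials' actual analysis code is not given in the abstract, so the function names, data layout, and example values here are hypothetical.

```python
def landmark_subset(times, events, landmark):
    """Keep patients still at risk at `landmark`; return (index, time from
    landmark, event flag) so that survival is measured from the landmark."""
    return [(i, t - landmark, e)
            for i, (t, e) in enumerate(zip(times, events)) if t > landmark]

def harrell_c_index(times, events, risk):
    """Harrell's c-index: among usable pairs, the fraction in which the
    patient who died earlier had the higher risk score (ties count 1/2)."""
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue                      # censored patients cannot anchor a pair
        for j in range(n):
            if times[i] < times[j]:
                usable += 1
                if risk[i] > risk[j]:
                    concordant += 1
                elif risk[i] == risk[j]:
                    concordant += 0.5
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable

# Hypothetical follow-up in weeks, death indicator, and a risk score.
times = [8, 20, 30, 52]
events = [1, 1, 1, 0]
risk = [0.9, 0.7, 0.4, 0.1]
at_risk = landmark_subset(times, events, landmark=12)   # drops the week-8 death
idx = [i for i, _, _ in at_risk]
print(harrell_c_index([t for _, t, _ in at_risk],
                      [e for _, _, e in at_risk],
                      [risk[i] for i in idx]))
```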
The overall best response rates for the three trials were 26%, 45%, and 25% respectively. While nearly all metrics were statistically significantly associated with overall survival at the different landmark time points, the c-indices for the traditional response metrics ranged from 0.59-0.65; for the continuous metrics from 0.60-0.66 and for the TriTR metrics from 0.64-0.69. The c-indices for TriTR at 12-weeks were comparable to those at 16- and 24-weeks.
Continuous tumor-measurement-based metrics provided no predictive improvement over traditional response based metrics or TriTR; TriTR had better predictive ability than best TriTR or confirmed response. If confirmed, TriTR represents a promising endpoint for future Phase II trials.
PMCID: PMC3195893  PMID: 21880789
continuous; tumor measurement; RECIST; prediction; survival
8.  Threshold Haemoglobin Levels and the Prognosis of Stable Coronary Disease: Two New Cohorts and a Systematic Review and Meta-Analysis 
PLoS Medicine  2011;8(5):e1000439.
Anoop Shah and colleagues performed a retrospective cohort study and a systematic review, and show evidence that in people with stable coronary disease there were threshold hemoglobin values below which mortality increased in a graded, continuous fashion.
Low haemoglobin concentration has been associated with adverse prognosis in patients with angina and myocardial infarction (MI), but the strength and shape of the association and the presence of any threshold has not been precisely evaluated.
Methods and findings
A retrospective cohort study was carried out using the UK General Practice Research Database. 20,131 people with a new diagnosis of stable angina and no previous acute coronary syndrome, and 14,171 people with first MI who survived for at least 7 days were followed up for a mean of 3.2 years. Using semi-parametric Cox regression and multiple adjustment, there was evidence of threshold haemoglobin values below which mortality increased in a graded continuous fashion. For men with MI, the threshold value was 13.5 g/dl (95% confidence interval [CI] 13.2–13.9); the 29.5% of patients with haemoglobin below this threshold had an associated hazard ratio for mortality of 2.00 (95% CI 1.76–2.29) compared to those with haemoglobin values in the lowest risk range. Women tended to have lower threshold haemoglobin values (e.g., for MI 12.8 g/dl; 95% CI 12.1–13.5) but the shape and strength of association did not differ between the genders, nor between patients with angina and MI. We did a systematic review and meta-analysis that identified ten previously published studies, reporting a total of only 1,127 endpoints, but none evaluated thresholds of risk.
There is an association between low haemoglobin concentration and increased mortality. A large proportion of patients with coronary disease have haemoglobin concentrations below the thresholds of risk defined here. Intervention trials would clarify whether increasing the haemoglobin concentration reduces mortality.
Editors' Summary
Coronary artery disease is the main cause of death in high-income countries and the second most common cause of death in middle- and low-income countries, accounting for 16.3%, 13.9%, and 9.4% of all deaths, respectively, in 2004. Many risk factors, such as high blood pressure and high blood cholesterol level, are known to be associated with coronary artery disease, and prevention and treatment of such factors remain one of the key strategies in the management of coronary artery disease. Recent studies have suggested that low hemoglobin may be associated with mortality in patients with coronary artery disease. Therefore, using blood hemoglobin level as a prognostic biomarker for patients with stable coronary artery disease may be of potential benefit, especially as measurement of hemoglobin is almost universal in such patients and there are available interventions that effectively increase hemoglobin concentration.
Why was This Study Done?
Much more needs to be understood about the relationship between low hemoglobin and coronary artery disease before hemoglobin levels can potentially be used as a clinical prognostic biomarker. Previous studies have been limited in their ability to describe the shape of this relationship (so it is uncertain whether there is a “best” hemoglobin threshold or a continuous graded relationship from “good” to “bad”), to assess gender differences, and to compare patients with angina with those who have experienced a previous myocardial infarction. To address these knowledge gaps, the researchers conducted a retrospective analysis of patients from a prospective observational cohort as well as a systematic review and meta-analysis (statistical analysis) of previous studies.
What Did the Researchers Do and Find?
The researchers conducted a systematic review and meta-analysis of previous studies and found ten relevant studies, but none evaluated thresholds of risk, only linear relationships.
The researchers carried out a new study using the UK's General Practice Research Database—a national research tool that uses anonymized electronic clinical records of a representative sample of the UK population, with details of consultations, diagnoses, referrals, prescriptions, and test results—as the basis for their analysis. They identified and collected information from two cohorts of patients: those with new onset stable angina and no previous acute coronary syndrome; and those with a first myocardial infarction (heart attack). For these patients, the researchers also looked at all values of routinely recorded blood parameters (including hemoglobin) and information on established cardiovascular risk factors, such as smoking. The researchers followed up patients using death of any cause as a primary endpoint and put this data into a statistical model to identify upper and lower thresholds of an optimal hemoglobin range beyond which mortality risk increased.
The researchers found that there was a threshold hemoglobin value below which mortality continuously increased in a graded manner. For men with myocardial infarction, the threshold value was 13.5 g/dl: 29.5% of patients had hemoglobin below this threshold and had a hazard ratio for mortality of 2.00 compared to those with hemoglobin values in the lowest risk range. Women had a lower threshold hemoglobin value than men: 12.8 g/dl for women with myocardial infarction, but the shape and strength of association did not differ between the genders, or between patients with angina and myocardial infarction.
What Do These Findings Mean?
These findings suggest that there are thresholds of hemoglobin that are associated with increased risk of mortality in patients with angina or myocardial infarction. A substantial proportion of patients (15%–30%) have a hemoglobin level that places them at markedly higher risk of death compared to patients with lowest risk hemoglobin levels and importantly, these thresholds are higher than clinicians might anticipate—and are remarkably similar to World Health Organization anemia thresholds of 12 g/dl for women and 13 g/dl for men. Despite the limitations of these observational findings, this study supports the rationale for conducting future randomized controlled trials to assess whether hemoglobin levels are causal and whether clinicians should intervene to increase hemoglobin levels, for example by oral iron supplementation.
9.  A Novel Model to Combine Clinical and Pathway-Based Transcriptomic Information for the Prognosis Prediction of Breast Cancer 
PLoS Computational Biology  2014;10(9):e1003851.
Breast cancer is the most common malignancy in women worldwide. With the increasing awareness of heterogeneity in breast cancers, better prediction of breast cancer prognosis is much needed for more personalized treatment and disease management. Towards this goal, we have developed a novel computational model for breast cancer prognosis by combining the Pathway Deregulation Score (PDS) based pathifier algorithm, Cox regression and L1-LASSO penalization method. We trained the model on a set of 236 patients with gene expression data and clinical information, and validated the performance on three diversified testing data sets of 606 patients. To evaluate the performance of the model, we conducted survival analysis of the dichotomized groups, and compared the areas under the curve based on the binary classification. The resulting prognosis genomic model is composed of fifteen pathways (e.g. P53 pathway) that had previously reported cancer relevance, and it successfully differentiated relapse in the training set (log rank p-value = 6.25e-12) and three testing data sets (log rank p-value<0.0005). Moreover, the pathway-based genomic models consistently performed better than gene-based models on all four data sets. We also find strong evidence that combining genomic information with clinical information improved the p-values of prognosis prediction by at least three orders of magnitude in comparison to using either genomic or clinical information alone. In summary, we propose a novel prognosis model that harnesses the pathway-based dysregulation as well as valuable clinical information. The selected pathways in our prognosis model are promising targets for therapeutic intervention.
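As a sketch of how Cox regression and L1-LASSO penalization fit together, the standard penalized partial-likelihood objective is shown below. The notation is assumed for illustration and is not taken from the paper: $x_i$ is the vector of pathway deregulation scores for patient $i$, $\delta_i$ indicates an observed relapse, $R(t_i)$ is the set of patients still at risk at time $t_i$, and $\lambda$ is the penalty weight.

```latex
\hat{\beta} = \arg\max_{\beta}
\left[
  \sum_{i:\,\delta_i = 1}
  \left(
    x_i^{\top}\beta - \log \sum_{j \in R(t_i)} \exp\!\left(x_j^{\top}\beta\right)
  \right)
  - \lambda \lVert \beta \rVert_1
\right]
```

The L1 term drives many coefficients to exactly zero, which is how a penalized fit of this kind can reduce a large set of candidate pathways to a short list. A per-patient risk score $x_i^{\top}\hat{\beta}$ can then be dichotomized to form the groups compared by the log-rank test.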
Author Summary
With the increasing awareness of heterogeneity in breast cancers, better prediction of breast cancer prognosis is much needed early on for more personalized treatment and management. Towards this goal we propose in this study a novel pathway-based prognosis prediction model, which emphasizes individualized pathway-based risk measurement using the pathway dysregulation score (PDS). In combination with L1-LASSO penalized feature selection and the Cox proportional hazards regression model, we have identified fifteen cancer-relevant pathways using the pathway-based genomic model that successfully differentiated relapse in the training set as well as in three diversified test sets. Moreover, given the debate over whether higher-order representative features, such as GO sets, pathways, and network modules, are superior to gene-level features in genomic models, we demonstrate that pathway-based genomic models consistently performed better than gene-based models in all four data sets. Last but not least, we show strong evidence that models that combine genomic information with clinical information improve prognosis prediction significantly, in comparison to models that use either genomic or clinical information alone.
PMCID: PMC4168973  PMID: 25233347
10.  Improving Breast Cancer Survival Analysis through Competition-Based Multidimensional Modeling 
PLoS Computational Biology  2013;9(5):e1003047.
Breast cancer is the most common malignancy in women and is responsible for hundreds of thousands of deaths annually. As with most cancers, it is a heterogeneous disease and different breast cancer subtypes are treated differently. Understanding the difference in prognosis for breast cancer based on its molecular and phenotypic features is one avenue for improving treatment by matching the proper treatment with molecular subtypes of the disease. In this work, we employed a competition-based approach to modeling breast cancer prognosis using large datasets containing genomic and clinical information and an online real-time leaderboard program used to speed feedback to the modeling team and to encourage each modeler to work towards achieving a higher ranked submission. We find that machine learning methods combined with molecular features selected based on expert prior knowledge can improve survival predictions compared to current best-in-class methodologies and that ensemble models trained across multiple user submissions systematically outperform individual models within the ensemble. We also find that model scores are highly consistent across multiple independent evaluations. This study serves as the pilot phase of a much larger competition open to the whole research community, with the goal of understanding general strategies for model optimization using clinical and molecular profiling data and providing an objective, transparent system for assessing prognostic models.
Author Summary
We developed an extensible software framework for sharing molecular prognostic models of breast cancer survival in a transparent collaborative environment and subjecting each model to automated evaluation using objective metrics. The computational framework presented in this study, our detailed post-hoc analysis of hundreds of modeling approaches, and the use of a novel cutting-edge data resource together represent one of the largest-scale systematic studies to date assessing the factors influencing the accuracy of molecular-based prognostic models in breast cancer. Our results demonstrate the ability to infer prognostic models with accuracy on par with or greater than that of previously reported studies, with significant performance improvements from state-of-the-art machine learning approaches trained on clinical covariates. Our results also demonstrate the difficulty of incorporating molecular data to achieve substantial performance improvements over clinical covariates alone. However, improvement was achieved by combining clinical feature data with intelligent selection of important molecular features based on domain-specific prior knowledge. We observe that ensemble models aggregating the information across many diverse models achieve among the highest scores of all models and systematically outperform individual models within the ensemble, suggesting a general strategy for leveraging the wisdom of crowds to develop robust predictive models.
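The idea of scoring submitted models and then aggregating them into an ensemble can be sketched as follows. The summary above does not name its scoring metric or its aggregation rule, so both are assumptions here: the concordance index is used as a common metric for survival predictions, and the ensemble is a plain average of per-patient risk scores. The function names are hypothetical.

```python
from itertools import combinations


def concordance_index(times, events, scores):
    """Fraction of comparable patient pairs ranked correctly.

    A pair is comparable when the patient with the shorter follow-up
    had an observed event. Higher score means higher predicted risk.
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # a = patient with the shorter time, b = the other one
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if times[a] == times[b] or not events[a]:
            continue  # tied times or censored first: not comparable
        comparable += 1
        if scores[a] > scores[b]:
            concordant += 1.0
        elif scores[a] == scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def ensemble_scores(model_scores):
    """Average per-patient risk scores across models (assumed rule)."""
    n_models = len(model_scores)
    n_patients = len(model_scores[0])
    return [sum(m[k] for m in model_scores) / n_models
            for k in range(n_patients)]


times = [2, 4, 6, 8]
events = [1, 1, 0, 1]  # 0 = censored
model_a = [4.0, 3.0, 2.0, 1.0]
model_b = [3.0, 4.0, 2.0, 1.0]
combined = ensemble_scores([model_a, model_b])
print(concordance_index(times, events, combined))
```

Averaging ranks instead of raw scores would be a reasonable alternative when models produce scores on different scales.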
PMCID: PMC3649990  PMID: 23671412
11.  Performance assessment for radiologists interpreting screening mammography 
Statistics in medicine  2007;26(7):1532-1551.
When interpreting screening mammograms radiologists decide whether suspicious abnormalities exist that warrant the recall of the patient for further testing. Previous work has found significant differences in interpretation among radiologists; their false-positive and false-negative rates have been shown to vary widely. Performance assessments of individual radiologists have been mandated by the U.S. government, but concern exists about the adequacy of current assessment techniques.
We use hierarchical modelling techniques to draw inferences about the interpretive performance of individual radiologists in screening mammography. While doing this we account for differences due to patient mix and radiologist attributes (for instance, years of experience or interpretive volume). We model at the mammogram level, and then use these models to assess radiologist performance. Our approach is demonstrated with data from mammography registries and radiologist surveys. For each mammogram, the registries record whether or not the woman was found to have breast cancer within one year of the mammogram; this criterion is used to determine whether the recall decision was correct.
We model the false-positive rate and the false-negative rate separately using logistic regression on patient risk factors and radiologist random effects. The radiologist random effects are, in turn, regressed on radiologist attributes such as the number of years in practice.
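One way to write down a two-level model of this kind for the false-positive rate is sketched below. The notation is assumed for illustration: $Y_{ij}$ is the recall decision for mammogram $i$ read by radiologist $j$, $D_{ij}$ indicates cancer found within one year, $x_{ij}$ holds patient risk factors, $z_j$ holds radiologist attributes, and the normal distribution for the random effect is a conventional choice that the abstract does not state.

```latex
\operatorname{logit}\,\Pr\!\left(Y_{ij} = 1 \mid D_{ij} = 0\right)
  = x_{ij}^{\top}\beta + u_j,
\qquad
u_j \sim \mathcal{N}\!\left(z_j^{\top}\gamma,\ \sigma^2\right)
```

An analogous model conditioned on $D_{ij} = 1$ with outcome $Y_{ij} = 0$ would describe the false-negative rate.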
Using these Bayesian hierarchical models we examine several radiologist performance metrics. The first is the difference between the false-positive or false-negative rate of a particular radiologist and that of a hypothetical 'standard' radiologist with the same attributes and the same patient mix. A second metric predicts the performance of each radiologist on hypothetical mammography exams with particular combinations of patient risk factors (which we characterize as 'typical', 'high-risk', or 'low-risk'). The second metric can be used to compare one radiologist to another, while the first metric addresses how the radiologist is performing compared to an appropriate standard. Interval estimates are given for the metrics, thereby addressing uncertainty.
The particular novelty in our contribution is to estimate multiple performance rates (sensitivity and specificity). One can even estimate a continuum of performance rates such as a performance curve or ROC curve using our models and we describe how this may be done. In addition to assessing radiologists in the original data set, we also show how to draw inferences about the performance of a new radiologist with new case mix, new outcome data, and new attributes without having to refit the model.
PMCID: PMC3152258  PMID: 16847870
hierarchical; mammography; sensitivity; specificity
12.  Shifting from Population-wide to Personalized Cancer Prognosis with Microarrays 
PLoS ONE  2012;7(1):e29534.
The era of personalized medicine for cancer therapeutics has taken an important step forward in making accurate prognoses for individual patients with the adoption of high-throughput microarray technology. However, microarray technology in cancer diagnosis or prognosis has been primarily used for the statistical evaluation of patient populations, and thus excludes inter-individual variability and patient-specific predictions. Here we propose a metric called clinical confidence that serves as a measure of prognostic reliability to facilitate the shift from population-wide to personalized cancer prognosis using microarray-based predictive models. The performance of sample-based models predicted with different clinical confidences was evaluated and compared systematically using three large clinical datasets studying the following cancers: breast cancer, multiple myeloma, and neuroblastoma. Survival curves for patients, with different confidences, were also delineated. The results show that the clinical confidence metric separates patients with different prediction accuracies and survival times. Samples with high clinical confidence were likely to have accurate prognoses from predictive models. Moreover, patients with high clinical confidence would be expected to live for a notably longer or shorter time if their prognosis was good or grim based on the models, respectively. We conclude that clinical confidence could serve as a beneficial metric for personalized cancer prognosis prediction utilizing microarrays. Ascribing a confidence level to prognosis with the clinical confidence metric provides the clinician an objective, personalized basis for decisions, such as choosing the severity of the treatment.
PMCID: PMC3266237  PMID: 22295060
13.  A Gene Expression Signature Predicts Survival of Patients with Stage I Non-Small Cell Lung Cancer 
PLoS Medicine  2006;3(12):e467.
Lung cancer is the leading cause of cancer-related death in the United States. Nearly 50% of patients with stages I and II non-small cell lung cancer (NSCLC) will die from recurrent disease despite surgical resection. No reliable clinical or molecular predictors are currently available for identifying those at high risk for developing recurrent disease. As a consequence, it is not possible to select those high-risk patients for more aggressive therapies and assign less aggressive treatments to patients at low risk for recurrence.
Methods and Findings
In this study, we applied a meta-analysis of datasets from seven different microarray studies on NSCLC for differentially expressed genes related to survival time (under 2 y and over 5 y). A consensus set of 4,905 genes from these studies was selected, and systematic bias adjustment in the datasets was performed by distance-weighted discrimination (DWD). We identified a gene expression signature consisting of 64 genes that is highly predictive of which stage I lung cancer patients may benefit from more aggressive therapy. Kaplan-Meier analysis of the overall survival of stage I NSCLC patients with the 64-gene expression signature demonstrated that the high- and low-risk groups are significantly different in their overall survival. Of the 64 genes, 11 are related to cancer metastasis (APC, CDH8, IL8RB, LY6D, PCDHGA12, DSP, NID, ENPP2, CCR2, CASP8, and CASP10) and eight are involved in apoptosis (CASP8, CASP10, PIK3R1, BCL2, SON, INHA, PSEN1, and BIK).
Our results indicate that gene expression signatures from several datasets can be reconciled. The resulting signature is useful in predicting survival of stage I NSCLC and might be useful in informing treatment decisions.
Meta-analysis of several lung cancer gene expression studies yields a set of 64 genes whose expression profile is useful in predicting survival of patients with early-stage lung cancer and possibly informing treatment decisions.
Editors' Summary
Lung cancer is the commonest cause of cancer-related death worldwide. Most cases are of a type called non-small cell lung cancer (NSCLC) and are mainly caused by smoking. Like other cancers, how NSCLC is treated depends on the “stage” at which it is detected. Stage IA NSCLCs are small and confined to the lung and can be removed surgically; patients with slightly larger stage IB tumors often receive chemotherapy after surgery. In stage II NSCLC, cancer cells may be present in lymph nodes near the tumor. Surgery plus chemotherapy is the usual treatment for this stage and for some stage III NSCLCs. However, in this stage, the tumor can be present throughout the chest and surgery is not always possible. For such cases and in stage IV NSCLC, where the tumor has spread throughout the body, patients are treated with chemotherapy alone. The stage at which NSCLC is detected also determines how well patients respond to treatment. Those who can be treated surgically do much better than those who can't. So, whereas only 2% of patients with stage IV lung cancer survive for 5 years after diagnosis, about 70% of patients with stage I or II lung cancer live at least this long.
Why Was This Study Done?
Even stage I and II lung cancers often recur and there is no accurate way to identify the patients in whom this will happen. If there were, these patients could be given aggressive chemotherapy, so the search is on for a “molecular signature” to help identify which NSCLCs are likely to recur. Unlike normal cells, cancer cells divide uncontrollably and can move around the body. These behavioral differences are caused by changes in their genetic material that alter their patterns of RNA transcription and protein expression. In this study, the researchers have investigated whether data from several microarray studies (a technique used to catalog the genes expressed in cells) can be pooled to construct a gene expression signature that predicts the survival of patients with stage I NSCLC.
What Did the Researchers Do and Find?
The researchers took the data from seven independent microarray studies (including a new study of their own) that recorded gene expression profiles related to survival time (less than 2 years and greater than 5 years) for stage I NSCLC. Because these studies had been done in different places with slightly different techniques, the researchers applied a statistical tool called distance-weighted discrimination to smooth out any systematic differences among the studies before identifying 64 genes whose expression was associated with survival. Most of these genes are involved in cell adhesion, cell motility, cell proliferation, and cell death, all processes that are altered in cancer cells. The researchers then developed a statistical model that allowed them to use the gene expression and survival data to calculate risk scores for nearly 200 patients in five of the datasets. When they separated the patients into high and low risk groups on the basis of these scores, the two groups were significantly different in terms of survival time. Indeed, the gene expression signature was better at predicting outcome than routine staging. Finally, the researchers validated the gene expression signature by showing that it predicted survival with more than 85% accuracy in two independent datasets.
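The step of splitting patients into high- and low-risk groups by score and comparing their survival can be sketched with a small Kaplan-Meier estimator. This is a minimal illustration, not the study's model: the cutoff, the risk scores, and the follow-up data below are all hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    events[k] is 1 for an observed death and 0 for a censored patient.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # deaths and censored patients leave the risk set
    return curve


def split_by_risk(scores, cutoff):
    """Indices of patients above the cutoff (high risk) and the rest."""
    high = [k for k, s in enumerate(scores) if s > cutoff]
    low = [k for k, s in enumerate(scores) if s <= cutoff]
    return high, low


# Hypothetical data: follow-up in years, event flags, and risk scores.
times = [1, 2, 3, 4, 5, 6]
events = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.2, 0.1, 0.3]
high, low = split_by_risk(scores, cutoff=0.5)
for name, group in (("high", high), ("low", low)):
    curve = kaplan_meier([times[k] for k in group], [events[k] for k in group])
    print(name, curve)
```

A formal comparison of the two curves would normally use a log-rank test, which is omitted here to keep the sketch short.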
What Do These Findings Mean?
The 64-gene expression signature identified here could help clinicians prepare treatment plans for patients with stage I NSCLC. Because it accurately predicts survival in patients with adenocarcinoma or squamous cell cancer (the two major subtypes of NSCLC), it potentially indicates which of these patients should receive aggressive chemotherapy and which can be spared this unpleasant treatment. Previous attempts to establish gene expression signatures to predict outcome have used data from small groups of patients and have failed when tested in additional patients. In contrast, this new signature seems to be generalizable. Nevertheless, its ability to predict outcomes must be confirmed in further studies before it is routinely adopted by oncologists for treatment planning.
PMCID: PMC1716187  PMID: 17194181
14.  Nuclear Receptor Expression Defines a Set of Prognostic Biomarkers for Lung Cancer 
PLoS Medicine  2010;7(12):e1000378.
David Mangelsdorf and colleagues show that nuclear receptor expression is strongly associated with clinical outcomes of lung cancer patients, and this expression profile is a potential prognostic signature for lung cancer patient survival time, particularly for individuals with early stage disease.
The identification of prognostic tumor biomarkers that also would have potential as therapeutic targets, particularly in patients with early stage disease, has been a long sought-after goal in the management and treatment of lung cancer. The nuclear receptor (NR) superfamily, which is composed of 48 transcription factors that govern complex physiologic and pathophysiologic processes, could represent a unique subset of these biomarkers. In fact, many members of this family are the targets of already identified selective receptor modulators, providing a direct link between individual tumor NR quantitation and selection of therapy. The goal of this study, which begins this overall strategy, was to investigate the association between mRNA expression of the NR superfamily and the clinical outcome for patients with lung cancer, and to test whether a tumor NR gene signature provided useful information (over available clinical data) for patients with lung cancer.
Methods and Findings
Using quantitative real-time PCR to study NR expression in 30 microdissected non-small-cell lung cancers (NSCLCs) and their pair-matched normal lung epithelium, we found great variability in NR expression among patients' tumor and non-involved lung epithelium, found a strong association between NR expression and clinical outcome, and identified an NR gene signature from both normal and tumor tissues that predicted patient survival time and disease recurrence. The NR signature derived from the initial 30 NSCLC samples was validated in two independent microarray datasets derived from 442 and 117 resected lung adenocarcinomas. The NR gene signature was also validated in 130 squamous cell carcinomas. The prognostic signature in tumors could be distilled to expression of two NRs, short heterodimer partner and progesterone receptor, as single gene predictors of NSCLC patient survival time, including for patients with stage I disease. Of equal interest, the studies of microdissected histologically normal epithelium and matched tumors identified expression in normal (but not tumor) epithelium of NGFIB3 and mineralocorticoid receptor as single gene predictors of good prognosis.
NR expression is strongly associated with clinical outcomes for patients with lung cancer, and this expression profile provides a unique prognostic signature for lung cancer patient survival time, particularly for those with early stage disease. This study highlights the potential use of NRs as a rational set of therapeutically tractable genes as theragnostic biomarkers, and specifically identifies short heterodimer partner and progesterone receptor in tumors, and NGFIB3 and MR in non-neoplastic lung epithelium, for future detailed translational study in lung cancer.
Editors' Summary
Lung cancer, the most common cause of cancer-related death, kills 1.3 million people annually. Most lung cancers are “non-small-cell lung cancers” (NSCLCs), and most are caused by smoking. Exposure to chemicals in smoke causes changes in the genes of the cells lining the lungs that allow the cells to grow uncontrollably and to move around the body. How NSCLC is treated and responds to treatment depends on its “stage.” Stage I tumors, which are small and confined to the lung, are removed surgically, although chemotherapy is also sometimes given. Stage II tumors have spread to nearby lymph nodes and are treated with surgery and chemotherapy, as are some stage III tumors. However, because cancer cells in stage III tumors can be present throughout the chest, surgery is not always possible. For such cases, and for stage IV NSCLC, where the tumor has spread around the body, patients are treated with chemotherapy alone. About 70% of patients with stage I and II NSCLC but only 2% of patients with stage IV NSCLC survive for five years after diagnosis; more than 50% of patients have stage IV NSCLC at diagnosis.
Why Was This Study Done?
Patient responses to treatment vary considerably. Oncologists (doctors who treat cancer) would like to know which patients have a good prognosis (are likely to do well) to help them individualize their treatment. Consequently, the search is on for “prognostic tumor biomarkers,” molecules made by cancer cells that can be used to predict likely clinical outcomes. Such biomarkers, which may also be potential therapeutic targets, can be identified by analyzing the overall pattern of gene expression in a panel of tumors using a technique called microarray analysis and looking for associations between the expression of sets of genes and clinical outcomes. In this study, the researchers take a more directed approach to identifying prognostic biomarkers by investigating the association between the expression of the genes encoding nuclear receptors (NRs) and clinical outcome in patients with lung cancer. The NR superfamily contains 48 transcription factors (proteins that control the expression of other genes) that respond to several hormones and to diet-derived fats. NRs control many biological processes and are targets for several successful drugs, including some used to treat cancer.
What Did the Researchers Do and Find?
The researchers analyzed the expression of NR mRNAs using “quantitative real-time PCR” in 30 microdissected NSCLCs and in matched normal lung tissue samples (mRNA is the blueprint for protein production). They then used an approach called standard classification and regression tree analysis to build a prognostic model for NSCLC based on the expression data. This model predicted both survival time and disease recurrence among the patients from whom the tumors had been taken. The researchers validated their prognostic model in two large independent lung adenocarcinoma microarray datasets and in a squamous cell carcinoma dataset (adenocarcinomas and squamous cell carcinomas are two major NSCLC subtypes). Finally, they explored the roles of specific NRs in the prediction model. This analysis revealed that the ability of the NR signature in tumors to predict outcomes was mainly due to the expression of two NRs—the short heterodimer partner (SHP) and the progesterone receptor (PR). Expression of either gene could be used as a single gene predictor of the survival time of patients, including those with stage I disease. Similarly, the expression of either nerve growth factor induced gene B3 (NGFIB3) or mineralocorticoid receptor (MR) in normal tissue was a single gene predictor of a good prognosis.
What Do These Findings Mean?
These findings indicate that the expression of NR mRNA is strongly associated with clinical outcomes in patients with NSCLC. Furthermore, they identify a prognostic NR expression signature that provides information on the survival time of patients, including those with early stage disease. The signature needs to be confirmed in more patients before it can be used clinically, and researchers would like to establish whether changes in mRNA expression are reflected in changes in protein expression if NRs are to be targeted therapeutically. Nevertheless, these findings highlight the potential use of NRs as prognostic tumor biomarkers. Furthermore, they identify SHP and PR in tumors and two NRs in normal lung tissue as molecules that might provide new targets for the treatment of lung cancer and new insights into the early diagnosis, pathogenesis, and chemoprevention of lung cancer.
PMCID: PMC3001894  PMID: 21179495
15.  Optimal Management of High-Risk T1G3 Bladder Cancer: A Decision Analysis 
PLoS Medicine  2007;4(9):e284.
Controversy exists about the most appropriate treatment for high-risk superficial (stage T1; grade G3) bladder cancer. Immediate cystectomy offers the best chance for survival but may be associated with an impaired quality of life compared with conservative therapy. We estimated life expectancy (LE) and quality-adjusted life expectancy (QALE) for both of these treatments for men and women of different ages and comorbidity levels.
Methods and Findings
We evaluated two treatment strategies for high-risk, T1G3 bladder cancer using a decision-analytic Markov model: (1) Immediate cystectomy with neobladder creation versus (2) conservative management with intravesical bacillus Calmette-Guérin (BCG) and delayed cystectomy in individuals with resistant or progressive disease. Probabilities and utilities were derived from published literature where available, and otherwise from expert opinion. Extensive sensitivity analyses were conducted to identify variables most likely to influence the decision. Structural sensitivity analyses modifying the base case definition and the triggers for cystectomy in the conservative therapy arm were also explored. Probabilistic sensitivity analysis was used to assess the joint uncertainty of all variables simultaneously and the uncertainty in the base case results. External validation of model outputs was performed by comparing model-predicted survival rates with independent published literature. The mean LE of a 60-y-old male was 14.3 y for immediate cystectomy and 13.6 y with conservative management. With the addition of utilities, the immediate cystectomy strategy yielded a mean QALE of 12.32 y and remained preferred over conservative therapy by 0.35 y. Worsening patient comorbidity diminished the benefit of early cystectomy but altered the LE-based preferred treatment only for patients over age 70 y and the QALE-based preferred treatment for patients over age 65 y. Sensitivity analyses revealed that patients over the age of 70 y or those strongly averse to loss of sexual function, gastrointestinal dysfunction, or life without a bladder have a higher QALE with conservative therapy. The results of structural or probabilistic sensitivity analyses did not change the preferred treatment option. Model-predicted overall and disease-specific survival rates were similar to those reported in published studies, suggesting external validity.
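The mechanics of a decision-analytic Markov model can be sketched with a small cohort simulation that accumulates life expectancy (LE) and quality-adjusted life expectancy (QALE). Everything numeric below is hypothetical: the states, transition probabilities, and utilities are placeholders and are not the values used in the study, and no half-cycle correction or discounting is applied.

```python
def markov_cohort(transition, utilities, start, cycles, dead="dead"):
    """Run a discrete-time Markov cohort model.

    transition[s][t] is the per-cycle probability of moving from s to t.
    Returns (LE, QALE) in cycle units, counted at the end of each cycle.
    """
    dist = {s: (1.0 if s == start else 0.0) for s in transition}
    le = 0.0
    qale = 0.0
    for _ in range(cycles):
        nxt = {s: 0.0 for s in transition}
        for s, p in dist.items():
            for t, q in transition[s].items():
                nxt[t] += p * q  # move the cohort fraction from s to t
        dist = nxt
        le += sum(p for s, p in dist.items() if s != dead)
        qale += sum(p * utilities[s] for s, p in dist.items())
    return le, qale


# Hypothetical three-state model with one-year cycles.
transition = {
    "disease_free": {"disease_free": 0.90, "progressed": 0.06, "dead": 0.04},
    "progressed": {"progressed": 0.70, "dead": 0.30},
    "dead": {"dead": 1.0},
}
utilities = {"disease_free": 0.9, "progressed": 0.6, "dead": 0.0}

le, qale = markov_cohort(transition, utilities, "disease_free", cycles=40)
print(round(le, 2), round(qale, 2))
```

Comparing two strategies then amounts to running the model once per strategy with its own transition probabilities and utilities, and comparing the resulting LE and QALE.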
Our model is, to our knowledge, the first of its kind in bladder cancer, and demonstrated that younger patients with high-risk T1G3 bladder cancer had a higher LE and QALE with immediate cystectomy. The decision to pursue immediate cystectomy versus conservative therapy should be based on discussions that consider patient age, comorbid status, and an individual's preference for particular postcystectomy health states. Patients over the age of 70 y or those who place high value on sexual function, gastrointestinal function, or bladder preservation may benefit from a more conservative initial therapeutic approach.
Using a Markov model, Shabbir Alibhai and colleagues develop a decision analysis comparing cystectomy with conservative treatment for high-risk superficial bladder cancer depending on patient age, comorbid conditions, and preferences.
Editors' Summary
Every year, about 67,000 people in the US develop bladder cancer. Like all cancers, bladder cancer arises when a single cell begins to grow faster than normal, loses its characteristic shape, and moves into surrounding tissues. Most bladder cancers develop from cells that line the bladder (“transitional” cells) and most are detected before they spread out of this lining. These superficial or T1 stage cancers can be removed by transurethral resection of bladder tumor (TURBT). The urologist (a specialist who treats urinary tract problems) passes a small telescope into the bladder through the urethra (the tube through which urine leaves the body) and removes the tumor. If the tumor cells look normal under a microscope (so-called normal histology), the cancer is unlikely to return; if they have lost their normal appearance, the tumor is given a “G3” histological grade, which indicates a high risk of recurrence.
Why Was This Study Done?
The best treatment for T1G3 bladder cancer remains controversial. Some urologists recommend immediate radical cystectomy, the surgical removal of the bladder, the urethra, and other nearby organs. This treatment often provides a complete cure but can cause serious short-term health problems and affects long-term quality of life. Patients often develop sexual dysfunction or intestinal (gut) problems and sometimes find it hard to live with a reconstructed bladder. The other recommended treatment is immunotherapy with bacillus Calmette-Guérin (BCG, bacteria that are also used to vaccinate against tuberculosis). Long-term survival is not always as good with this conservative treatment but it is less likely than surgery to cause short-term illness or to reduce quality of life. In this study, the researchers have used decision analysis (a systematic evaluation of the important factors affecting a decision) to determine whether immediate cystectomy or conservative therapy is the optimal treatment for patients with T1G3 bladder cancer. Decision analysis allowed the researchers to account for quality-of-life factors while comparing the health benefits of each treatment for T1G3 bladder cancer.
What Did the Researchers Do and Find?
Using a decision analysis model called a Markov model, the researchers calculated the months of life gained, and the quality of life expected to result, from each of the two treatments. To estimate the life expectancy (LE) associated with each treatment, the researchers incorporated the published probabilities of various outcomes of each treatment into their model. To estimate quality-adjusted life expectancy (QALE, the number of years of good quality life), they incorporated “utilities,” measures of relative satisfaction with outcomes. (A utility of 1 represents perfect health; death is assigned a value of 0, and outcomes considered less than ideal, but better than death, fall in between). For a sexually potent 60-year-old man with bladder cancer but no other illnesses, the average LE predicted by the model was nearly eight months longer with immediate cystectomy than with conservative treatment (both LEs predicted by this model matched those seen in clinical trials); the average QALE with cystectomy was 4.2 months longer than with conservative treatment. Having additional diseases decreased the benefit of immediate cystectomy but the treatment still gave a longer LE until the patient reached 70 years old, when conservative treatment became better. For QALE, this change in optimal treatment appeared at age 65. Finally, conservative treatment gave a higher QALE than immediate cystectomy for patients concerned about preserving sexual function or averse to living with intestinal problems or a reconstructed bladder.
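The way a Markov model turns outcome probabilities and utilities into LE and QALE can be sketched as follows. This is only an illustration: the three health states, the yearly transition probabilities, and the utility weights are made-up assumptions, not values from the study, and the sketch omits refinements such as half-cycle correction and discounting.

```python
def markov_expectancy(transitions, utilities, start, dead, cycles=200):
    """Return (LE, QALE), both in cycles, for a cohort that begins in `start`.

    transitions[s][t]: per-cycle probability of moving from state s to state t.
    utilities[s]: quality weight for one cycle spent in state s (death = 0).
    """
    dist = {state: 0.0 for state in transitions}
    dist[start] = 1.0
    le = qale = 0.0
    for _ in range(cycles):
        # Credit one cycle of (quality-adjusted) life to everyone still alive.
        le += sum(p for state, p in dist.items() if state != dead)
        qale += sum(p * utilities[state] for state, p in dist.items())
        # Advance the cohort by one cycle.
        nxt = {state: 0.0 for state in transitions}
        for state, p in dist.items():
            for target, prob in transitions[state].items():
                nxt[target] += p * prob
        dist = nxt
    return le, qale


# Hypothetical three-state example with yearly cycles (all values are made up).
transitions = {
    "well": {"well": 0.90, "impaired": 0.05, "dead": 0.05},
    "impaired": {"well": 0.00, "impaired": 0.85, "dead": 0.15},
    "dead": {"well": 0.00, "impaired": 0.00, "dead": 1.00},
}
utilities = {"well": 1.0, "impaired": 0.7, "dead": 0.0}
le, qale = markov_expectancy(transitions, utilities, start="well", dead="dead")
print(f"LE = {le:.2f} y, QALE = {qale:.2f} y")
```

Comparing two treatments would then amount to running the function once per strategy, each with its own transition table, and comparing the resulting LE and QALE values.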
What Do These Findings Mean?
As with all mathematical models, these results depend on the assumptions included in the model. In particular, because published probability and utility values are not available for some of the possible outcomes of the two treatments, the LE and QALE calculations could be inaccurate. Also, assigning numerical ratings to life experiences is generally something of a simplification, which could affect the reliability of the QALE (but not the LE) results. Nevertheless, these findings provide useful guidance for urologists trying to balance the benefits of immediate cystectomy or conservative treatment against the potential short-term and long-term effects of these treatments on patients' quality of life. Specifically, the results indicate that decisions on treatment for T1G3 bladder cancer should be based on a consideration of the patient's age and any coexisting disease coupled with detailed discussions with the patient about their attitudes regarding the possible health-related effects of cystectomy.
PMCID: PMC1989749  PMID: 17896857
16.  A simulation model of colorectal cancer surveillance and recurrence 
Approximately one-third of those treated curatively for colorectal cancer (CRC) will experience recurrence. No evidence-based consensus exists on how best to follow patients after initial treatment to detect asymptomatic recurrence. Here, a new approach for simulating surveillance and recurrence among CRC survivors is outlined, and development and calibration of a simple model applying this approach is described. The model’s ability to predict outcomes for a group of patients under a specified surveillance strategy is validated.
We developed an individual-based simulation model consisting of two interacting submodels: a continuous-time disease-progression submodel overlain by a discrete-time Markov submodel of surveillance and re-treatment. In the former, some patients develop recurrent disease which probabilistically progresses from detectability to unresectability, and which may produce early symptoms leading to detection independent of surveillance testing. In the latter submodel, patients undergo user-specified surveillance testing regimens. Parameters describing disease progression were preliminarily estimated through calibration to match five-year disease-free survival, overall survival at years 1–5, and proportion of recurring patients undergoing curative salvage surgery from one arm of a published randomized trial. The calibrated model was validated by examining its ability to predict these same outcomes for patients in a different arm of the same trial undergoing less aggressive surveillance.
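The interaction between continuous-time progression and scheduled surveillance can be illustrated with a deliberately reduced sketch. It is not the authors' model: the exponential timing distributions, every parameter value, and the function name are assumptions chosen for illustration, and symptomatic detection, re-treatment, and survival are left out. Only the one-third recurrence proportion echoes the figure quoted in the background.

```python
import math
import random


def simulate_salvage_fraction(n_patients, visit_interval, sensitivity,
                              p_recur=0.33, mean_onset=2.0, mean_window=1.0,
                              horizon=5.0, seed=0):
    """Fraction of recurrences found at a scheduled visit while still resectable.

    Times are in years. Recurrence becomes detectable at `onset` and turns
    unresectable after a further exponentially distributed window (assumed).
    """
    rng = random.Random(seed)
    recurrences = caught = 0
    for _ in range(n_patients):
        if rng.random() >= p_recur:
            continue  # this patient never recurs
        onset = rng.expovariate(1.0 / mean_onset)
        if onset >= horizon:
            continue  # recurrence falls outside the follow-up period
        recurrences += 1
        unresectable = onset + rng.expovariate(1.0 / mean_window)
        # First scheduled visit at or after the disease becomes detectable.
        k = max(1, math.ceil(onset / visit_interval))
        t = k * visit_interval
        while t < unresectable and t <= horizon:
            if rng.random() < sensitivity:
                caught += 1  # detected while curative surgery is still possible
                break
            k += 1
            t = k * visit_interval
    return caught / recurrences if recurrences else 0.0


for interval in (0.25, 0.5, 1.0):
    frac = simulate_salvage_fraction(5000, interval, sensitivity=0.7)
    print(f"visits every {interval} y: {frac:.2f} caught while resectable")
```

Calibration in this setting would mean adjusting parameters such as `mean_onset` and `mean_window` until simulated outcomes match those observed in a trial arm.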
Calibrated parameter values were consistent with generally observed recurrence patterns. Sensitivity analysis suggested probability of curative salvage surgery was most influenced by sensitivity of carcinoembryonic antigen assay and of clinical interview/examination (i.e. scheduled provider visits). In validation, the model accurately predicted overall survival (59% predicted, 58% observed) and five-year disease-free survival (55% predicted, 53% observed), but was less accurate in predicting curative salvage surgery (10% predicted; 6% observed).
Initial validation suggests the feasibility of this approach to modeling alternative surveillance regimens among CRC survivors. Further calibration to individual-level patient data could yield a model useful for predicting outcomes of specific surveillance strategies for risk-based subgroups or for individuals. This approach could be applied toward developing novel, tailored strategies for further clinical study. It has the potential to produce insights which will promote more effective surveillance—leading to higher cure rates for recurrent CRC.
PMCID: PMC4021538  PMID: 24708517
Colorectal cancer; Recurrence; Surveillance; Follow-up; Model
17.  Validation of the Modes of Transmission Model as a Tool to Prioritize HIV Prevention Targets: A Comparative Modelling Analysis 
PLoS ONE  2014;9(7):e101690.
The static Modes of Transmission (MOT) model predicts the annual fraction of new HIV infections acquired across subgroups (MOT metric), and is used to focus HIV prevention. Using synthetic epidemics via a dynamical model, we assessed the validity of the MOT metric for identifying epidemic drivers (behaviours or subgroups that are sufficient and necessary for HIV to establish and persist), and the potential consequence of MOT-guided policies.
Methods and Findings
To generate benchmark MOT metrics for comparison, we simulated three synthetic epidemics (concentrated, mixed, and generalized) with different epidemic drivers using a dynamical model of heterosexual HIV transmission. MOT metrics from generic and complex MOT models were compared against the benchmark, and to the contribution of epidemic drivers to overall HIV transmission (cumulative population attributable fraction over t years, PAFt). The complex MOT metric was similar to the benchmark, but the generic MOT underestimated the fraction of infections in epidemic drivers. The benchmark MOT metric identified epidemic drivers early in the epidemics. Over time, the MOT metric did not identify epidemic drivers. This was not due to simplified MOT models or biased parameters but occurred because the MOT metric (irrespective of the model used to generate it) underestimates the contribution of epidemic drivers to HIV transmission over time (PAF5–30). MOT-directed policies that fail to reach epidemic drivers could undermine long-term impact on HIV incidence, and achieve a similar impact as random allocation of additional resources.
Irrespective of how it is obtained, the MOT metric is not a valid stand-alone tool to identify epidemic drivers, and has limited additional value in guiding the prioritization of HIV prevention targets. Policy-makers should use the MOT model judiciously, in combination with other approaches, to identify epidemic drivers.
PMCID: PMC4090151  PMID: 25014543
18.  Quantitative Measurement of Melanoma Spread in Sentinel Lymph Nodes and Survival 
PLoS Medicine  2014;11(2):e1001604.
In this study, Klein and colleagues investigated the impact of minimal cancer sentinel lymph node spread and of increasing numbers of disseminated cancer cells on melanoma-specific survival. The authors found that cancer cell dissemination to the sentinel node is a quantitative risk factor for melanoma death and the best predictor of outcome was a model based on combined quantitative effects of DCCD, tumor thickness, and ulceration.
Please see later in the article for the Editors' Summary
Sentinel lymph node spread is a crucial factor in melanoma outcome. We aimed to define the impact of minimal cancer spread and of increasing numbers of disseminated cancer cells on melanoma-specific survival.
Methods and Findings
We analyzed 1,834 sentinel nodes from 1,027 patients with ultrasound node-negative melanoma who underwent sentinel node biopsy between February 8, 2000, and June 19, 2008, by histopathology including immunohistochemistry and quantitative immunocytology. For immunocytology we recorded the number of disseminated cancer cells (DCCs) per million lymph node cells (DCC density [DCCD]) after disaggregation and immunostaining for the melanocytic marker gp100. None of the control lymph nodes from non-melanoma patients (n = 52) harbored gp100-positive cells. We analyzed gp100-positive cells from melanoma patients by comparative genomic hybridization and found, in 45 of 46 patients tested, gp100-positive cells displaying genomic alterations. At a median follow-up of 49 mo (range 3–123 mo), 138 patients (13.4%) had died from melanoma. Increased DCCD was associated with increased risk for death due to melanoma (univariable analysis; p<0.001; hazard ratio 1.81, 95% CI 1.61–2.01, for a 10-fold increase in DCCD + 1). Even patients with a positive DCCD ≤3 had an increased risk of dying from melanoma compared to patients with DCCD = 0 (p = 0.04; hazard ratio 1.63, 95% CI 1.02–2.58). Upon multivariable testing DCCD was a stronger predictor of death than histopathology. The final model included thickness, DCCD, and ulceration (all p<0.001) as the most relevant prognostic factors, was internally validated by bootstrapping, and provided superior survival prediction compared to the current American Joint Committee on Cancer staging categories.
Cancer cell dissemination to the sentinel node is a quantitative risk factor for melanoma death. A model based on the combined quantitative effects of DCCD, tumor thickness, and ulceration predicted outcome best, particularly at longer follow-up. If these results are validated in an independent study, establishing quantitative immunocytology in histopathological laboratories may be useful clinically.
Editors' Summary
Because the skin contains many different cell types, there are many types of skin cancer. The most dangerous type—melanoma—develops when mutations occur in melanocytes, the cells that produce the pigment melanin. Less than 5% of skin cancers are melanomas, but melanoma causes most skin cancer deaths. Early signs of melanoma are a change in the appearance of a mole (a pigmented skin blemish) or the development of a new and unusual pigmented lesion. If these signs are noticed and the melanoma is diagnosed before it has spread from the skin into nearby lymph nodes and other tissues, surgery often provides a cure. For advanced melanomas, the outlook is generally poor, although novel therapies may prolong a patient's life.
Why Was This Study Done?
When a person is diagnosed with melanoma, it is important to “stage” the tumor. Knowing the extent and severity of the melanoma helps oncologists plan treatments and estimate their patients' likely outcomes. The detection of isolated melanoma cells in sentinel lymph nodes (the nodes to which cancer cells are most likely to spread from a primary tumor) is included in melanoma staging recommendations. However, finding rare tumor cells in sentinel lymph node biopsies by examining the tissue requires the analysis of many slides from each node removed from the patient and is extremely time-consuming. In this study, the researchers investigate the predictive value of a quantitative immunocytological assay that involves disaggregation of the sentinel node and detection of disseminated cancer cells (DCCs) by immunostaining for gp100 (a marker for melanoma cells). They also use this new assay to examine the effect of increasing numbers of DCCs on melanoma-specific survival.
What Did the Researchers Do and Find?
The researchers used routine histopathology and immunocytology to analyze 1,834 sentinel lymph nodes from 1,027 patients with melanoma who underwent sentinel lymph node biopsy at one German hospital. For immunocytology, the researchers recorded the number of gp100-positive cells per million lymph node cells (the DCC density). During follow-up, 138 patients (13.4%) died from melanoma. The results indicated that increased DCC density was associated with an increased risk of death due to melanoma. Specifically, every 10-fold increase in DCC density + 1 was associated with a near doubling of the risk of death from melanoma (a hazard ratio of 1.81). Even patients with three or fewer gp100-positive cells per million lymph node cells had an increased risk of dying from melanoma compared to patients with no gp100-positive cells (hazard ratio 1.63). When other predictors of outcome such as age and primary tumor location were taken into account, DCC density was a stronger predictor of death than histopathology. Finally, a survival model that included tumor thickness, tumor ulceration, and DCC density provided survival prediction superior to that of a model based on the current standard staging recommendations.
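To make the "10-fold increase in DCC density + 1" scale concrete, the short sketch below converts a DCC density into a hazard relative to a patient with no gp100-positive cells. It assumes, as the reported univariable hazard ratio implies, that the log-hazard is linear in log10(DCCD + 1); the function name is hypothetical, and the separately reported hazard ratio of 1.63 for DCCD ≤3 comes from a different, categorical comparison that this sketch does not reproduce.

```python
import math


def relative_hazard(dccd, hr_per_tenfold=1.81):
    """Hazard relative to DCCD = 0 when each 10-fold rise in DCCD + 1
    multiplies the hazard by `hr_per_tenfold` (univariable estimate)."""
    return hr_per_tenfold ** math.log10(dccd + 1)


for dccd in (0, 9, 99, 999):
    print(dccd, round(relative_hazard(dccd), 2))
```

On this scale, moving from DCCD = 0 to 9, from 9 to 99, or from 99 to 999 each corresponds to one 10-fold step in DCCD + 1.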
What Do These Findings Mean?
These findings show that quantification of cancer cell dissemination from melanomas to sentinel lymph nodes is feasible and can be combined with other characteristics of the primary tumor to provide an accurate prediction of outcomes for individual patients with melanoma. Notably, the new prediction model identifies a group of patients at high risk of progression for whom the current clinical standard underestimates the risk of death. These patients may benefit from adjuvant therapies, so the new analysis presented in this study may help to stratify patients for clinical trials. Importantly, quantitative immunocytology and the new model, although internally validated in this study, need to be validated in an independent group of patients before they can be considered for routine clinical use. If external validation is successful, quantitative immunocytology, which is much less labor-intensive than histopathology, has the potential to change the routine clinical care of patients with melanoma and probably with other solid tumors, conclude the researchers.
PMCID: PMC3928050  PMID: 24558354
19.  Combining Gene Signatures Improves Prediction of Breast Cancer Survival 
PLoS ONE  2011;6(3):e17845.
Several gene sets for prediction of breast cancer survival have been derived from whole-genome mRNA expression profiles. Here, we develop a statistical framework to explore whether combination of the information from such sets may improve prediction of recurrence and breast cancer specific death in early-stage breast cancers. Microarray data from two clinically similar cohorts of breast cancer patients are used as training (n = 123) and test set (n = 81), respectively. Gene sets from eleven previously published gene signatures are included in the study.
Principal Findings
To investigate the relationship between breast cancer survival and gene expression on a particular gene set, a Cox proportional hazards model is applied using partial likelihood regression with an L2 penalty to avoid overfitting and using cross-validation to determine the penalty weight. The fitted models are applied to an independent test set to obtain a predicted risk for each individual and each gene set. Hierarchical clustering of the test individuals on the basis of the vector of predicted risks results in two clusters with distinct clinical characteristics in terms of the distribution of molecular subtypes, ER, PR status, TP53 mutation status and histological grade category, and associated with significantly different survival probabilities (recurrence: p = 0.005; breast cancer death: p = 0.014). Finally, principal components analysis of the gene signatures is used to derive combined predictors used to fit a new Cox model. This model classifies test individuals into two risk groups with distinct survival characteristics (recurrence: p = 0.003; breast cancer death: p = 0.001). The latter classifier outperforms all the individual gene signatures, as well as Cox models based on traditional clinical parameters and the Adjuvant! Online for survival prediction.
Combining the predictive strength of multiple gene signatures improves prediction of breast cancer survival. The presented methodology is broadly applicable to breast cancer risk assessment using any new identified gene set.
PMCID: PMC3053398  PMID: 21423775
20.  Genomic-enabled prediction with classification algorithms 
Heredity  2014;112(6):616-626.
Pearson's correlation coefficient (ρ) is the most commonly reported metric of the success of prediction in genomic selection (GS). However, in real breeding ρ may not be very useful for assessing the quality of the regression in the tails of the distribution, where individuals are chosen for selection. This research used 14 maize and 16 wheat data sets with different trait–environment combinations. Six different models were evaluated by means of a cross-validation scheme (50 random partitions each, with 90% of the individuals in the training set and 10% in the testing set). The predictive accuracy of these algorithms for selecting individuals belonging to the best α=10, 15, 20, 25, 30, 35, 40% of the distribution was estimated using Cohen's kappa coefficient (κ) and an ad hoc measure, which we call relative efficiency (RE), which indicates the expected genetic gain due to selection when individuals are selected based on GS exclusively. We put special emphasis on the analysis for α=15%, because it is a percentile commonly used in plant breeding programmes (for example, at CIMMYT). We also used ρ as a criterion for overall success. The algorithms used were: Bayesian LASSO (BL), Ridge Regression (RR), Reproducing Kernel Hilbert Spaces (RHKS), Random Forest Regression (RFR), and Support Vector Regression (SVR) with linear (lin) and Gaussian kernels (rbf). The performance of regression methods for selecting the best individuals was compared with that of three supervised classification algorithms: Random Forest Classification (RFC) and Support Vector Classification (SVC) with linear (lin) and Gaussian (rbf) kernels. Classification methods were evaluated using the same cross-validation scheme but with the response vector of the original training sets dichotomised using a given threshold. 
For α=15%, SVC-lin presented the highest κ coefficients in 13 of the 14 maize data sets, with best values ranging from 0.131 to 0.722 (statistically significant in 9 data sets) and the best RE in the same 13 data sets, with values ranging from 0.393 to 0.948 (statistically significant in 12 data sets). RR produced the best mean for both κ and RE in one data set (0.148 and 0.381, respectively). Regarding the wheat data sets, SVC-lin presented the best κ in 12 of the 16 data sets, with outcomes ranging from 0.280 to 0.580 (statistically significant in 4 data sets) and the best RE in 9 data sets ranging from 0.484 to 0.821 (statistically significant in 5 data sets). SVC-rbf (0.235), RR (0.265) and RHKS (0.422) gave the best κ in one data set each, while RHKS and BL tied for the last one (0.234). Finally, BL presented the best RE in two data sets (0.738 and 0.750), RFR (0.636) and SVC-rbf (0.617) in one and RHKS in the remaining three (0.502, 0.458 and 0.586). The difference between the performance of SVC-lin and that of the rest of the models was not so pronounced at higher percentiles of the distribution. The behaviour of regression and classification algorithms varied markedly when selection was done at different thresholds, that is, κ and RE for each algorithm depended strongly on the selection percentile. Based on the results, we propose classification methods as a promising alternative for GS in plant breeding.
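The use of Cohen's κ for judging selection in the tail of the distribution can be sketched as follows. This is a minimal illustration under stated assumptions: individuals are labelled as belonging to the best α fraction by ranking (ties broken by position), the helper names are hypothetical, and the relative efficiency measure is not reproduced because its formula is not given in the abstract.

```python
def top_alpha_labels(values, alpha):
    """Label the best alpha fraction (largest values) as 1 and the rest as 0."""
    n_sel = max(1, round(alpha * len(values)))
    ranked = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    chosen = set(ranked[:n_sel])
    return [1 if i in chosen else 0 for i in range(len(values))]


def cohens_kappa(a, b):
    """Cohen's kappa for two binary label vectors of equal length.

    Undefined (division by zero) if both vectors put everyone in one class.
    """
    n = len(a)
    p_obs = sum(x == y for x, y in zip(a, b)) / n
    pa1, pb1 = sum(a) / n, sum(b) / n
    p_exp = pa1 * pb1 + (1 - pa1) * (1 - pb1)  # agreement expected by chance
    return (p_obs - p_exp) / (1 - p_exp)


# Hypothetical observed phenotypes and genomic predictions for ten lines.
observed = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
predicted = [9.5, 6.0, 9.0, 7.0, 5.5, 5.0, 4.5, 3.0, 2.5, 1.0]
kappa = cohens_kappa(top_alpha_labels(observed, 0.2),
                     top_alpha_labels(predicted, 0.2))
print(f"kappa at alpha = 20%: {kappa:.3f}")
```

For a classifier, the predicted labels would come directly from the model rather than from ranking predicted values.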
PMCID: PMC4023444  PMID: 24424163
genomic selection; maize; wheat; support vector machines
21.  Landmark Risk Prediction of Residual Life for Breast Cancer Survival 
Statistics in medicine  2013;32(20):3459-3471.
The importance of developing personalized risk prediction estimates has become increasingly evident in recent years. In general, patient populations may be heterogeneous and represent a mixture of different unknown subtypes of disease. When the source of this heterogeneity and resulting subtypes of disease are unknown, accurate prediction of survival may be difficult. However, in certain disease settings the onset time of an observable short term event may be highly associated with these unknown subtypes of disease and thus may be useful in predicting long term survival. One approach to incorporate short term event information along with baseline markers for the prediction of long term survival is through a landmark Cox model, which assumes a proportional hazards model for the residual life at a given landmark point. In this paper, we use this modeling framework to develop procedures to assess how a patient’s long term survival trajectory may change over time given good short term outcome indications along with prognosis based on baseline markers. We first propose time-varying accuracy measures to quantify the predictive performance of landmark prediction rules for residual life and provide resampling-based procedures to make inference about such accuracy measures. Simulation studies show that the proposed procedures perform well in finite samples. Throughout, we illustrate our proposed procedures using a breast cancer dataset with information on time to metastasis and time to death. In addition to baseline clinical markers available for each patient, a chromosome instability genetic score, denoted by CIN25, is also available for each patient and has been shown to be predictive of survival for various types of cancer. We provide procedures to evaluate the incremental value of CIN25 for the prediction of residual life and examine how the residual life profile changes over time.
This allows us to identify an informative landmark point, t0, such that accurate risk predictions of the residual life could be made for patients who survive past t0 without metastasis.
PMCID: PMC3744612  PMID: 23494768
landmark prediction; biomarkers; disease prognosis; predictive accuracy; risk prediction; survival analysis
22.  Gene Expression Classification of Colon Cancer into Molecular Subtypes: Characterization, Validation, and Prognostic Value 
PLoS Medicine  2013;10(5):e1001453.
Colon cancer (CC) pathological staging fails to accurately predict recurrence, and to date, no gene expression signature has proven reliable for prognosis stratification in clinical practice, perhaps because CC is a heterogeneous disease. The aim of this study was to establish a comprehensive molecular classification of CC based on mRNA expression profile analyses.
Methods and Findings
Fresh-frozen primary tumor samples from a large multicenter cohort of 750 patients with stage I to IV CC who underwent surgery between 1987 and 2007 in seven centers were characterized for common DNA alterations, including BRAF, KRAS, and TP53 mutations, CpG island methylator phenotype, mismatch repair status, and chromosomal instability status, and were screened with whole genome and transcriptome arrays. 566 samples fulfilled RNA quality requirements. Unsupervised consensus hierarchical clustering applied to gene expression data from a discovery subset of 443 CC samples identified six molecular subtypes. These subtypes were associated with distinct clinicopathological characteristics, molecular alterations, specific enrichments of supervised gene expression signatures (stem cell phenotype–like, normal-like, serrated CC phenotype–like), and deregulated signaling pathways. Based on their main biological characteristics, we distinguished a deficient mismatch repair subtype, a KRAS mutant subtype, a cancer stem cell subtype, and three chromosomal instability subtypes, including one associated with down-regulated immune pathways, one with up-regulation of the Wnt pathway, and one displaying a normal-like gene expression profile. The classification was validated in the remaining 123 samples plus an independent set of 1,058 CC samples, including eight public datasets. Furthermore, prognosis was analyzed in the subset of stage II–III CC samples. The subtypes C4 and C6, but not the subtypes C1, C2, C3, and C5, were independently associated with shorter relapse-free survival, even after adjusting for age, sex, stage, and the emerging prognostic classifier Oncotype DX Colon Cancer Assay recurrence score (hazard ratio 1.5, 95% CI 1.1–2.1, p = 0.0097). However, a limitation of this study is that information on tumor grade and number of nodes examined was not available.
We describe the first, to our knowledge, robust transcriptome-based classification of CC that improves the current disease stratification based on clinicopathological variables and common DNA markers. The biological relevance of these subtypes is illustrated by significant differences in prognosis. This analysis provides possibilities for improving prognostic models and therapeutic strategies. In conclusion, we report a new classification of CC into six molecular subtypes that arise through distinct biological pathways.
Please see later in the article for the Editors' Summary
Editors' Summary
Cancer of the large bowel (colorectal cancer) is the third most common cancer in men and the second most common cancer in women worldwide. Despite recent advances in the screening, diagnosis, and treatment of colorectal cancer, an estimated 608,000 people die every year from this form of cancer (8% of all cancer deaths). The prognosis and treatment options for colorectal cancer depend on five pathological stages (0–IV), each of which has a different treatment option and five-year survival rate, so it is important that the stage is correctly identified. Unfortunately, pathological staging fails to accurately predict recurrence (relapse) in patients undergoing surgery for localized colorectal cancer, which is a concern, as 10%–20% of patients with stage II and 30%–40% of those with stage III colorectal cancer develop recurrence.
Why Was This Study Done?
Previous studies have investigated whether there are any possible gene expression profiles (identified through microarray techniques) that can help predict prognosis of colorectal cancer, but so far, there have been no firm conclusions that can aid clinical practice. In this study, the researchers used genetic information from a French multicenter study to identify a standard, reproducible molecular classification based on gene expression analysis of colorectal cancer. The authors also assessed whether there were any associations between the identified molecular subtypes and clinical and pathological factors, common DNA alterations, and prognosis.
What Did the Researchers Do and Find?
The researchers used genetic information from a cohort of 750 patients with stage I to IV colorectal cancer who underwent surgery between 1987 and 2007 in seven centers in France. The researchers identified relevant clinical and pathological staging information for each patient from the medical records and calculated recurrence-free survival (the time from surgery to the first recurrence) for patients with stage II or III disease. In the genetic analysis, 566 tumor samples were suitable—443 were used in a discovery set, to create the classification, and the remainder were used in a validation set, to test the classification. The researchers also used information from eight public datasets to validate their findings.
Using these methods, the researchers classified the colon cancer samples into six molecular subtypes (based on gene expression data) and, on further analysis and validation, were able to distinguish the main biological characteristics and deregulated pathways associated with each subtype. Importantly, the researchers found that these six subtypes were associated with distinct clinical and pathological characteristics, molecular alterations, specific gene expression signatures, and deregulated signaling pathways. In the prognostic analysis based on recurrence-free survival, the researchers found that patients whose tumors were classified in one of two clusters (C4 and C6) had poorer recurrence-free survival than the other patients.
What Do These Findings Mean?
These findings suggest that it is possible to classify colorectal cancer into six robust molecular subtypes that might help identify new prognostic subgroups and could provide a basis for developing robust prognostic genetic signatures for stage II and III colorectal cancer and for identifying specific markers for the different subtypes that might be targets for future drug development. However, as this study was retrospective and did not include some known predictors of colorectal cancer prognosis, such as tumor grade and number of nodes examined, the significance and robustness of the prognostic classification requires further confirmation with large prospective patient cohorts.
PMCID: PMC3660251  PMID: 23700391
23.  Prioritizing CD4 Count Monitoring in Response to ART in Resource-Constrained Settings: A Retrospective Application of Prediction-Based Classification 
PLoS Medicine  2012;9(4):e1001207.
Luis Montaner and colleagues retrospectively apply a potential capacity-saving CD4 count prediction tool to a cohort of HIV patients on antiretroviral therapy.
Global programs of anti-HIV treatment depend on sustained laboratory capacity to assess treatment initiation thresholds and treatment response over time. Currently, there is no valid alternative to CD4 count testing for monitoring immunologic responses to treatment, but laboratory cost and capacity limit access to CD4 testing in resource-constrained settings. Thus, methods to prioritize patients for CD4 count testing could improve treatment monitoring by optimizing resource allocation.
Methods and Findings
Using a prospective cohort of HIV-infected patients (n = 1,956) monitored upon antiretroviral therapy initiation in seven clinical sites with distinct geographical and socio-economic settings, we retrospectively apply a novel prediction-based classification (PBC) modeling method. The model uses repeatedly measured biomarkers (white blood cell count and lymphocyte percent) to predict CD4+ T cell outcome through first-stage modeling and subsequent classification based on clinically relevant thresholds (CD4+ T cell count of 200 or 350 cells/µl). The algorithm correctly classified 90% (cross-validation estimate = 91.5%, standard deviation [SD] = 4.5%) of CD4 count measurements <200 cells/µl in the first year of follow-up; if laboratory testing is applied only to patients predicted to be below the 200-cells/µl threshold, we estimate a potential savings of 54.3% (SD = 4.2%) in CD4 testing capacity. A capacity savings of 34% (SD = 3.9%) is predicted using a CD4 threshold of 350 cells/µl. Similar results were obtained over the 3 y of follow-up available (n = 619). Limitations include a need for future economic healthcare outcome analysis, a need for assessment of extensibility beyond the 3-y observation time, and the need to assign a false positive threshold.
Our results support the use of PBC modeling as a triage point at the laboratory, lessening the need for laboratory-based CD4+ T cell count testing; implementation of this tool could help optimize the use of laboratory resources, directing CD4 testing towards higher-risk patients. However, further prospective studies and economic analyses are needed to demonstrate that the PBC model can be effectively applied in clinical settings.
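The triage accounting described above can be illustrated with a minimal sketch. It assumes that a predicted CD4 count is already available for each measurement; the first-stage model that produces those predictions is not reproduced here, and the function name `triage_summary` and the sample numbers are hypothetical. The sketch reports the share of truly low counts that the prediction flags and the share of laboratory tests that would be skipped if only flagged measurements were tested.

```python
from typing import Sequence, Tuple


def triage_summary(
    predicted: Sequence[float],
    observed: Sequence[float],
    threshold: float = 200.0,
) -> Tuple[float, float]:
    """Return (sensitivity, capacity_savings) for a CD4 triage rule.

    predicted: model-predicted CD4 counts (cells/µl), one per measurement.
    observed:  laboratory CD4 counts for the same measurements.
    A measurement is flagged for laboratory testing when its predicted
    count falls below the threshold (an illustrative rule, not
    necessarily the exact classification rule used in the study).
    """
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("predicted and observed must be non-empty and equal in length")

    flagged = [p < threshold for p in predicted]
    truly_low = [o < threshold for o in observed]

    n_low = sum(truly_low)
    caught = sum(1 for f, t in zip(flagged, truly_low) if f and t)

    # Share of truly low counts that the prediction correctly flags.
    sensitivity = caught / n_low if n_low else float("nan")
    # Share of measurements that would not be sent to the laboratory.
    capacity_savings = 1.0 - sum(flagged) / len(predicted)
    return sensitivity, capacity_savings


# Hypothetical data, for illustration only.
predicted_cd4 = [150, 180, 250, 400, 190, 500]
observed_cd4 = [160, 210, 240, 380, 150, 190]
sensitivity, savings = triage_summary(predicted_cd4, observed_cd4, threshold=200)
```

Raising the flagging threshold above the clinical threshold would trade capacity savings for fewer missed low counts, which relates to the need, noted among the limitations, to assign a false positive threshold.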
Editors' Summary
AIDS has killed nearly 30 million people since 1981, and about 34 million people (most of them living in low- and middle-income countries) are now infected with HIV, the virus that causes AIDS. HIV destroys immune system cells (including CD4 cells, a type of lymphocyte and one of the body's white blood cell types), leaving infected individuals susceptible to other infections. Early in the AIDS epidemic, most HIV-infected people died within ten years of infection. Then, in 1996, antiretroviral therapy (ART) became available, and for people living in affluent countries, HIV/AIDS became a chronic condition. However, ART was expensive, and for people living in developing countries, HIV/AIDS remained a fatal illness. In 2003, HIV was declared a global health emergency, and in 2006, the international community set itself the target of achieving universal access to ART by 2010. By the end of 2010, only 6.6 million of the estimated 15 million people in need of ART in developing countries were receiving ART.
Why Was This Study Done?
One factor that has impeded progress towards universal ART coverage has been the limited availability of trained personnel and laboratory facilities in many developing countries. These resources are needed to determine when individuals should start ART—the World Health Organization currently recommends that people start ART when their CD4 count drops below 350 cells/µl—and to monitor treatment responses over time so that viral resistance to ART is quickly detected. Although a total lymphocyte count can be used as a surrogate measure to decide when to start treatment, repeated CD4 cell counts are the only way to monitor immunologic responses to treatment, a level of monitoring that is rarely sustainable in resource-constrained settings. A method that optimizes resource allocation by prioritizing who gets tested might be one way to improve treatment monitoring. In this study, the researchers applied a new tool for prioritizing laboratory-based CD4 cell count testing in resource-constrained settings to patient data that had been previously collected.
What Did the Researchers Do and Find?
The researchers fitted a mixed-effects statistical model to repeated CD4 count measurements from HIV-infected individuals from seven sites around the world (including some resource-limited sites). They then used model-derived estimates to apply a mathematical tool for predicting—from a CD4 count taken at the start of treatment, and white blood cell counts and lymphocyte percentage measurements taken later—whether CD4 counts would be above 200 cells/µl (the original threshold recommended for ART initiation) and 350 cells/µl (the current recommended threshold) for up to three years after ART initiation. The tool correctly classified 91.5% of the CD4 cell counts that were below 200 cells/µl in the first year of ART. With this threshold, the potential savings in CD4 testing capacity was 54.3%. With a CD4 count threshold of 350 cells/µl, the potential savings in testing capacity was 34%. The results over a three-year follow-up were similar. When applied to six representative HIV-positive individuals, the tool correctly predicted all the CD4 counts above 200 cells/µl, although some individuals who had a predicted CD4 count of less than 200 cells/µl actually had a CD4 count above this threshold. Thus, none of these individuals would have been exposed to an undetected dangerous CD4 count, but the application of the tool would have saved 57% of the CD4 laboratory tests done during the first year of ART.
What Do These Findings Mean?
These findings support the use of this new tool—the prediction-based classification (PBC) algorithm—for predicting a drop in CD4 count below a clinically meaningful threshold in HIV-infected individuals receiving ART. Further studies are now needed to demonstrate the feasibility, clinical effectiveness, and cost-effectiveness of this approach, to find out whether the tool can be used over extended periods of time, and to investigate whether the accuracy of its predictions can be improved by, for example, adding in periodic CD4 testing. Provided these studies confirm its early promise, the researchers suggest that the PBC algorithm could be used as a “triage” tool to direct available laboratory testing capacity to high-priority individuals (those likely to have a dangerously low CD4 count). By optimizing the use of limited laboratory resources in this and other ways, the PBC algorithm could therefore help to maintain and expand ART programs in low- and middle-income countries.
PMCID: PMC3328436  PMID: 22529752
24.  A tumor DNA complex aberration index is an independent predictor of survival in breast and ovarian cancer 
Molecular Oncology  2015;9(1):115-127.
Complex focal chromosomal rearrangements in cancer genomes, also called “firestorms”, can be scored from DNA copy number data. The complex arm-wise aberration index (CAAI) is a score that captures DNA copy number alterations that appear as focal complex events in tumors, and has potential prognostic value in breast cancer. This study aimed to validate this DNA-based prognostic index in breast cancer and test for the first time its potential prognostic value in ovarian cancer. Copy number alteration (CNA) data from 1950 breast carcinomas (METABRIC cohort) and 508 high-grade serous ovarian carcinomas (TCGA dataset) were analyzed. Cases were classified as CAAI positive if at least one complex focal event was scored. Complex alterations were frequently localized on chromosome 8p (n = 159), 17q (n = 176) and 11q (n = 251). CAAI events on 11q were most frequent in estrogen receptor positive (ER+) cases and on 17q in estrogen receptor negative (ER−) cases. We found only a modest correlation between CAAI and the overall rate of genomic instability (GII) and number of breakpoints (r = 0.27 and r = 0.42, p < 0.001). Breast cancer specific survival (BCSS), overall survival (OS) and ovarian cancer progression free survival (PFS) were used as clinical end points in Cox proportional hazard model survival analyses. CAAI positive breast cancers (43%) had higher mortality: hazard ratio (HR) of 1.94 (95%CI, 1.62–2.32) for BCSS, and of 1.49 (95%CI, 1.30–1.71) for OS. Representations of the 70-gene and the 21-gene predictors were compared with CAAI in multivariable models and CAAI was independently significant with a Cox adjusted HR of 1.56 (95%CI, 1.23–1.99) for ER+ and 1.55 (95%CI, 1.11–2.18) for ER− disease. None of the expression-based predictors were prognostic in the ER− subset. We found that a model including CAAI and the two expression-based prognostic signatures outperformed a model including the 21-gene and 70-gene signatures but excluding CAAI. 
Inclusion of CAAI in the clinical prognostication tool PREDICT significantly improved its performance. CAAI positive ovarian cancers (52%) also had worse prognosis: HRs of 1.3 (95%CI, 1.1–1.7) for PFS and 1.3 (95%CI, 1.1–1.6) for OS. This study validates CAAI as an independent predictor of survival in both ER+ and ER− breast cancer and reveals a significant prognostic value for CAAI in high-grade serous ovarian cancer.
• The complex arm-wise aberration index (CAAI) captures focal complex DNA alterations.
• Compared with other indices of genomic instability, CAAI adds unique information.
• CAAI is validated as an independent prognostic marker in breast cancer (n = 1950).
• Prognostic value of CAAI is independent of the 70- and 21-gene classifiers.
• CAAI is a new independent prognostic marker in ovarian cancer.
PMCID: PMC4286124  PMID: 25169931
Breast cancer; Ovarian cancer; Prognostic markers; Biomarker; Genomics; Genomic instability; DNA copy number
Abbreviations: BCSS, Breast cancer specific survival; CAAI, Complex arm-wise aberration index; CNA, Copy number alterations; ER, Estrogen receptor; HR, Hazard ratio; HGSOC, High-grade serous ovarian cancer; MIP, Molecular inversion probe; OS, Overall survival; PFS, Progression free survival
25.  Using cross-validation to evaluate predictive accuracy of survival risk classifiers based on high-dimensional data 
Briefings in Bioinformatics  2011;12(3):203-214.
Developments in whole genome biotechnology have stimulated statistical focus on prediction methods. We review here methodology for classifying patients into survival risk groups and for using cross-validation to evaluate such classifications. Measures of discrimination for survival risk models include separation of survival curves, time-dependent ROC curves and Harrell’s concordance index. For high-dimensional data applications, however, computing these measures as re-substitution statistics on the same data used for model development results in highly biased estimates. Most developments in methodology for survival risk modeling with high-dimensional data have utilized separate test data sets for model evaluation. Cross-validation has sometimes been used for optimization of tuning parameters. In many applications, however, the data available are too limited for effective division into training and test sets and consequently authors have often either reported re-substitution statistics or analyzed their data using binary classification methods in order to utilize familiar cross-validation. In this article we have tried to indicate how to utilize cross-validation for the evaluation of survival risk models; specifically how to compute cross-validated estimates of survival distributions for predicted risk groups and how to compute cross-validated time-dependent ROC curves. We have also discussed evaluation of the statistical significance of a survival risk model and evaluation of whether high-dimensional genomic data adds predictive accuracy to a model based on standard covariates alone.
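A minimal sketch of cross-validated survival estimates for predicted risk groups follows. The idea it illustrates is that each patient is assigned to a risk group by a classifier fitted without that patient's fold, and Kaplan-Meier curves are then computed on the pooled assignments. The toy classifier (a median cutoff on a single covariate), all function names, and the sample data are assumptions made for illustration; in a high-dimensional application the `fit` step would repeat the entire model-building process, including any feature selection, inside each fold.

```python
from typing import Callable, List, Tuple

Row = Tuple[float, float, int]  # (covariate, follow-up time, event: 1 = death, 0 = censored)


def kfold_indices(n: int, k: int) -> List[List[int]]:
    """Deterministic folds: patient i is placed in fold i % k."""
    return [[i for i in range(n) if i % k == f] for f in range(k)]


def fit_median_cutoff(train: List[Row]) -> Callable[[Row], str]:
    """Toy stand-in for model development: learn a median cutoff from training rows."""
    values = sorted(r[0] for r in train)
    mid = len(values) // 2
    cutoff = values[mid] if len(values) % 2 else 0.5 * (values[mid - 1] + values[mid])
    return lambda row: "high" if row[0] > cutoff else "low"


def cross_validated_groups(
    rows: List[Row], fit: Callable[[List[Row]], Callable[[Row], str]], k: int
) -> List[str]:
    """Assign every patient a risk group using a classifier fitted without their fold."""
    n = len(rows)
    groups = [""] * n
    for test in kfold_indices(n, k):
        held_out = set(test)
        classify = fit([rows[i] for i in range(n) if i not in held_out])
        for i in test:
            groups[i] = classify(rows[i])
    return groups


def kaplan_meier(times: List[float], events: List[int]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each distinct event time."""
    curve = []
    surv = 1.0
    for t in sorted({u for u, e in zip(times, events) if e}):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e)
        surv *= 1.0 - deaths / at_risk
        curve.append((t, surv))
    return curve


# Hypothetical data, for illustration only.
rows = [(0.1, 30, 0), (0.9, 5, 1), (0.2, 25, 1), (0.8, 8, 1), (0.3, 28, 0), (0.7, 12, 1)]
groups = cross_validated_groups(rows, fit_median_cutoff, k=3)
curves = {
    g: kaplan_meier(
        [r[1] for r, gi in zip(rows, groups) if gi == g],
        [r[2] for r, gi in zip(rows, groups) if gi == g],
    )
    for g in ("low", "high")
}
```

Because the group labels depend on the outcome data through the fitted models, a standard log-rank test applied to these curves would not be valid as it stands; a permutation approach that repeats the whole cross-validation is one way to assess significance.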
PMCID: PMC3105299  PMID: 21324971
predictive medicine; survival risk classification; cross-validation; gene expression

