Identification of individuals at high risk for lung cancer should be of value to individuals, patients, clinicians, and researchers. Existing prediction models have only modest capabilities to classify persons at risk accurately.
Prospective data from 70 962 control subjects in the Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial (PLCO) were used in models for the general population (model 1) and for a subcohort of ever-smokers (N = 38 254) (model 2). Both models included age, socioeconomic status (education), body mass index, family history of lung cancer, chronic obstructive pulmonary disease, recent chest x-ray, smoking status (never, former, or current), pack-years smoked, and smoking duration. Model 2 also included smoking quit-time (time in years since ever-smokers permanently quit smoking). External validation was performed with 44 223 PLCO intervention arm participants who completed a supplemental questionnaire and were subsequently followed. Known available risk factors were included in logistic regression models. Bootstrap optimism-corrected estimates of predictive performance were calculated (internal validation). Nonlinear relationships for age, pack-years smoked, smoking duration, and quit-time were modeled using restricted cubic splines. All reported P values are two-sided.
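The bootstrap optimism correction mentioned above can be illustrated with a minimal sketch. It is not the PLCO analysis: `fit_mean_difference` is a hypothetical stand-in for the logistic regression with restricted cubic splines, the toy data are simulated, and all function names are assumptions made for illustration.

```python
import random

def auc(scores, labels):
    """Probability that a random case outscores a random non-case (ties count 1/2)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def fit_mean_difference(X, y):
    """HYPOTHETICAL stand-in for logistic regression: weight each feature by the
    difference between its mean in cases and its mean in non-cases."""
    k = len(X[0])
    cases = [x for x, label in zip(X, y) if label == 1]
    controls = [x for x, label in zip(X, y) if label == 0]
    w = [sum(x[j] for x in cases) / len(cases)
         - sum(x[j] for x in controls) / len(controls) for j in range(k)]
    return lambda x: sum(wj * xj for wj, xj in zip(w, x))

def optimism_corrected_auc(X, y, fit, n_boot=200, seed=0):
    """Apparent AUC minus the average optimism estimated over bootstrap resamples."""
    rng = random.Random(seed)
    n = len(y)
    model = fit(X, y)
    apparent = auc([model(x) for x in X], y)
    optimism = []
    while len(optimism) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        Xb, yb = [X[i] for i in idx], [y[i] for i in idx]
        if len(set(yb)) < 2:  # a resample needs both cases and non-cases
            continue
        boot_model = fit(Xb, yb)
        auc_boot = auc([boot_model(x) for x in Xb], yb)  # performance on the resample
        auc_orig = auc([boot_model(x) for x in X], y)    # same model on the original data
        optimism.append(auc_boot - auc_orig)
    return apparent - sum(optimism) / n_boot

def make_toy_data(n=60, seed=1):
    """Simulated data: one informative feature and one pure-noise feature."""
    rng = random.Random(seed)
    y = [i % 2 for i in range(n)]
    X = [[label + rng.gauss(0, 1), rng.gauss(0, 1)] for label in y]
    return X, y
```

Calling `optimism_corrected_auc(*make_toy_data(), fit_mean_difference)` returns the corrected estimate for the simulated data. Any fitting routine that returns a scoring function can be passed in place of the stand-in.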
During follow-up (median 9.2 years) of the control arm subjects, 1040 lung cancers occurred. During follow-up of the external validation sample (median 3.0 years), 213 lung cancers occurred. For models 1 and 2, the bootstrap optimism-corrected areas under the receiver operating characteristic curve were 0.857 and 0.805, and calibration slopes (model-predicted probabilities vs observed probabilities) were 0.987 and 0.979, respectively. In the external validation sample, models 1 and 2 had areas under the curve of 0.841 and 0.784, respectively. These models had high discrimination in women, men, whites, and nonwhites.
The PLCO lung cancer risk models demonstrate high discrimination and calibration.
Because existing risk prediction models for lung cancer were developed in white populations, they may not be appropriate for predicting risk among African-Americans. Therefore, a need exists to construct and validate a risk prediction model for lung cancer that is specific to African-Americans. We analyzed data from 491 African-Americans with lung cancer and 497 matched African-American controls to identify specific risks, incorporate them into a multivariable risk model for lung cancer, and estimate the 5-year absolute risk of lung cancer. We performed internal validation using data on additional cases and controls from the same ongoing multiracial/ethnic lung cancer case-control study from which the model-building data were obtained, and external validation using data from two different lung cancer studies in metropolitan Detroit. We also compared our African-American model with our previously developed risk prediction model for whites. The final risk model included smoking-related variables [smoking status, pack-years smoked, age at smoking cessation (former smokers), and number of years since smoking cessation (former smokers)], self-reported physician diagnoses of chronic obstructive pulmonary disease or hay fever, and exposures to asbestos or wood dusts. Our risk prediction model for African-Americans exhibited good discrimination [75% (95% confidence interval, 0.67−0.82)] for our internal data and moderate discrimination [63% (95% confidence interval, 0.57−0.69)] for the external data group, which is an improvement over the Spitz model for white subjects. Existing lung cancer prediction models may not be appropriate for predicting risk for African-Americans because (a) they were developed using white populations, (b) the level of risk is different for risk factors that African-Americans share with whites, and (c) unique group-specific risk factors exist for African-Americans.
This study developed and validated a risk prediction model for lung cancer that is specific to African-Americans and thus more precise in predicting their risks. These findings highlight the importance of conducting further ethnic-specific analyses of disease risk.
An independent cohort of acute lung injury (ALI) patients was used to evaluate the external validity of a simple prediction model for short-term mortality previously developed using data from ARDS Network (ARDSNet) trials.
Design, Setting, and Patients
Data for external validation were obtained from a prospective cohort study of ALI patients from 13 ICUs at four teaching hospitals in Baltimore, Maryland.
Measurements and Main Results
Of the 508 non-trauma ALI patients eligible for this analysis, 234 (46%) died in-hospital. Discrimination of the ARDSNet prediction model for in-hospital mortality, evaluated by the area under the receiver operating characteristic curve (AUC), was 0.67 for our external validation dataset versus 0.70 and 0.68 using APACHE II and the ARDSNet validation dataset, respectively. In evaluating calibration of the model, predicted versus observed in-hospital mortality for the external validation dataset was similar for both low risk (ARDSNet model score = 0) and high risk (score = 3 or 4+) patient strata. However, for intermediate risk (score = 1 or 2) patients, observed in-hospital mortality was substantially higher than predicted mortality (25.3% vs. 16.5% and 40.6% vs. 31.0% for score = 1 and 2, respectively). Sensitivity analyses limiting our external validation dataset to only those patients meeting the ARDSNet trial eligibility criteria, and to those who received mechanical ventilation in compliance with the ARDSNet ventilation protocol, did not substantially change the model’s discrimination or improve its calibration.
Evaluation of the ARDSNet prediction model using an external ALI cohort demonstrated similar discrimination of the model as was observed with the ARDSNet validation dataset. However, there were substantial differences in observed versus predicted mortality among intermediate risk ALI patients. The ARDSNet model provided reasonable, but imprecise, estimates of predicted mortality when applied to our external validation cohort of ALI patients.
respiratory distress syndrome; adult; statistical model; mortality determinants; prognosis; health status indicators; intensive care units
Prostate specific antigen (PSA) is widely used as a diagnostic biomarker for prostate cancer (PC). However, due to its low predictive performance, many patients without PC suffer from the harms of unnecessary prostate needle biopsies. The present study aims to evaluate the reproducibility and performance of a genetic risk prediction model in a Japanese population and estimate its utility as a diagnostic biomarker in a clinical scenario. We created a logistic regression model incorporating 16 SNPs that were significantly associated with PC in a genome-wide association study of a Japanese population using 689 cases and 749 male controls. The model was validated by two independent sets of Japanese samples comprising 3,294 cases and 6,281 male controls. The areas under the curve (AUC) of the model were 0.679, 0.655, and 0.661 for the samples used to create the model and the two validation sets, respectively. The AUCs were not significantly altered in samples with PSA 1–10 ng/ml. In the model, 24.2% and 9.7% of the patients had an odds ratio <0.5 (low risk) or >2 (high risk), respectively. Assuming the overall positive rate of prostate needle biopsies to be 20%, the positive biopsy rates were 10.7% and 42.4% for the low and high genetic risk groups, respectively. Our genetic risk prediction model for PC was highly reproducible, and its predictive performance was not influenced by PSA. The model has the potential to affect clinical decisions when it is applied to patients with gray-zone PSA, which should be confirmed in future clinical studies.
Rationale: Accurate, early identification of patients at risk for developing acute lung injury (ALI) provides the opportunity to test and implement secondary prevention strategies.
Objectives: To determine the frequency and outcome of ALI development in patients at risk and validate a lung injury prediction score (LIPS).
Methods: In this prospective multicenter observational cohort study, predisposing conditions and risk modifiers predictive of ALI development were identified from routine clinical data available during initial evaluation. The discrimination of the model was assessed with the area under the receiver operating characteristic curve (AUC). The risk of death from ALI was determined after adjustment for severity of illness and predisposing conditions.
Measurements and Main Results: Twenty-two hospitals enrolled 5,584 patients at risk. ALI developed a median of 2 (interquartile range 1–4) days after initial evaluation in 377 (6.8%; 148 ALI-only, 229 adult respiratory distress syndrome) patients. The frequency of ALI varied according to predisposing conditions (from 3% in pancreatitis to 26% after smoke inhalation). LIPS discriminated patients who developed ALI from those who did not with an AUC of 0.80 (95% confidence interval, 0.78–0.82). When adjusted for severity of illness and predisposing conditions, development of ALI increased the risk of in-hospital death (odds ratio, 4.1; 95% confidence interval, 2.9–5.7).
Conclusions: ALI occurrence varies according to predisposing conditions and carries an independently poor prognosis. Using routinely available clinical data, LIPS identifies patients at high risk for ALI early in the course of their illness. This model will alert clinicians about the risk of ALI and facilitate testing and implementation of ALI prevention strategies.
Clinical trial registered with www.clinicaltrials.gov (NCT00889772).
respiratory distress syndrome, adult; prevention; prediction model; acute respiratory failure
We developed a web-based, prognostic tool for extremity and trunk wall soft tissue sarcoma to predict 10-year sarcoma-specific survival. External validation was performed.
Patients referred during 1987–2002 to Helsinki University Central Hospital are included. External validation was obtained from the Lund University Hospital register. Cox proportional hazards models were fitted with the Helsinki data. The previously described model (SIN) includes size, necrosis, and vascular invasion. The extended model (SAM) includes the SIN factors and in addition depth, location, grade, and size on a continuous scale. Models were statistically compared according to accuracy (area under the ROC curve=AUC) of 10-year sarcoma-specific survival prediction.
The AUC of the SAM model in 10-year survival prediction in the Helsinki patient series was 0.81 as compared with 0.74 for the SIN model (P=0.0007). The corresponding AUCs in the external validation series were 0.77 for the SAM model and 0.73 for the SIN model (P=0.03). A web-based calculator for the SAM model is available at http://www.prognomics.org/sam.
Addition of grade, depth, and location as well as tumour size on a continuous scale significantly improved the accuracy of the prognostic model when compared with a model that includes only size, necrosis, and vascular invasion.
soft tissue sarcoma; prognosis; web-based; chemotherapy
Sensorineural hearing loss is the most common sequela in survivors of bacterial meningitis (BM). In the past we developed a validated prediction model to identify children at risk for post-meningitis hearing loss. It is known that host genetic variations, besides clinical factors, contribute to severity and outcome of BM. In this study it was determined whether host genetic risk factors improve the predictive abilities of an existing model regarding hearing loss after childhood BM.
Four hundred and seventy-one Dutch Caucasian childhood BM patients were genotyped for 11 single nucleotide polymorphisms (SNPs) in seven different genes involved in pathogen recognition. Genetic data were added to the original clinical prediction model, and performance of the new models was compared to the original model by likelihood ratio tests and the area under the curve (AUC) of the receiver operating characteristic curves.
Addition of TLR9-1237 SNPs and the combination of TLR2 + 2477 and TLR4 + 896 SNPs improved the clinical prediction model, but not significantly (increase of AUCs from 0.856 to 0.861 and from 0.856 to 0.875; p = 0.570 and 0.335, respectively). Other SNPs analysed were not linked to hearing loss.
Although addition of genetic risk factors did not significantly improve the clinical prediction model for post-meningitis hearing loss, AUCs of the pre-existing model remain high after addition of genetic factors. Future studies should evaluate whether more combinations of SNPs in larger cohorts have additional value to the existing prediction model for post-meningitis hearing loss.
Genetics; SNP; Risk; Prediction; Bacterial meningitis; Hearing loss; Child
A combination of biomarkers in a multivariate model may predict disease with greater accuracy than a single biomarker employed alone. We developed a non-linear method of multivariate analysis, weighted digital analysis (WDA), and evaluated its ability to predict lung cancer employing volatile biomarkers in the breath.
WDA generates a discriminant function to predict membership in disease vs no disease groups by determining weight, a cutoff value, and a sign for each predictor variable employed in the model. The weight of each predictor variable was the area under the curve (AUC) of the receiver operating characteristic (ROC) curve minus a fixed offset of 0.55, where the AUC was obtained by employing that predictor variable alone, as the sole marker of disease. The sign (±) was used to invert the predictor variable if a lower value indicated a higher probability of disease. When employed to predict the presence of a disease in a particular patient, the discriminant function was determined as the sum of the weights of all predictor variables that exceeded their cutoff values. The algorithm that generates the discriminant function is deterministic because parameters are calculated from each individual predictor variable without any optimization or adjustment. We employed WDA to re-evaluate data from a recent study of breath biomarkers of lung cancer, comprising the volatile organic compounds (VOCs) in the alveolar breath of 193 subjects with primary lung cancer and 211 controls with a negative chest CT.
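The construction described above can be sketched in a few lines. This is an illustrative reading of the abstract, not the authors' implementation: the abstract does not say how each cutoff value is chosen, so the sketch assumes the cutoff that maximises Youden's index (sensitivity + specificity - 1), and all function names are hypothetical. Weights are left exactly as AUC - 0.55, so a weak predictor can receive a slightly negative weight.

```python
def auc(values, labels):
    """AUC of a single predictor: P(case value > non-case value), ties count 1/2."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def youden_cutoff(values, labels):
    """ASSUMPTION: pick the cutoff maximising sensitivity + specificity - 1.
    Candidates are midpoints between consecutive distinct observed values."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    uniq = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(uniq, uniq[1:])] or uniq

    def youden(c):
        sens = sum(v > c for v, y in zip(values, labels) if y == 1) / n_pos
        spec = sum(v <= c for v, y in zip(values, labels) if y == 0) / n_neg
        return sens + spec - 1

    return max(candidates, key=youden)

def fit_wda(X, y, offset=0.55):
    """Return one (sign, cutoff, weight) triple per predictor column.
    Each triple is computed from that column alone, with no joint optimisation."""
    params = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        a = auc(column, y)
        sign = 1 if a >= 0.5 else -1      # invert if lower values indicate disease
        signed = [sign * v for v in column]
        weight = max(a, 1 - a) - offset   # AUC of the sign-corrected predictor minus offset
        params.append((sign, youden_cutoff(signed, y), weight))
    return params

def wda_score(params, row):
    """Discriminant: sum of weights of predictors that exceed their cutoff."""
    return sum(w for (s, c, w), v in zip(params, row) if s * v > c)
```

For example, `params = fit_wda(X, y)` followed by `wda_score(params, new_row)` gives the discriminant value for one subject; a threshold on that value would then separate predicted disease from no disease.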
The WDA discriminant function accurately identified patients with lung cancer in a model employing 30 breath VOCs (ROC curve AUC = 0.90; sensitivity = 84.5%, specificity = 81.0%). These results were superior to multi-linear regression analysis of the same data set (AUC = 0.74, sensitivity = 68.4%, specificity = 73.5%). WDA test accuracy did not vary appreciably with TNM (tumor, node, metastasis) stage of disease, and results were not affected by tobacco smoking (ROC curve AUC = 0.92 in current smokers, 0.90 in former smokers). WDA was a robust predictor of lung cancer: random removal of 1/3 of the VOCs did not reduce the AUC of the ROC curve by >10% (99.7% CI).
A test employing WDA of breath VOCs predicted lung cancer with accuracy similar to chest computed tomography. The algorithm identified dependencies that were not apparent with traditional linear methods. WDA appears to provide a useful new technique for non-linear multivariate analysis of data.
To estimate the likely number and predictive strength of cancer-associated single nucleotide polymorphisms (SNPs) that are yet to be discovered for seven common cancers.
From the statistical power of published genome-wide association studies, we estimated the number of undetected susceptibility loci and the distribution of effect sizes for all cancers. Assuming a log-normal model for risks and multiplicative relative risks for SNPs, family history (FH), and known risk factors, we estimated the area under the receiver operating characteristic curve (AUC) and the proportion of patients with risks above risk thresholds for screening. From additional prevalence data, we estimated the positive predictive value and the ratio of non–patient cases to patient cases (false-positive ratio) for various risk thresholds.
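To make the log-normal assumption concrete, the sketch below uses closed forms that follow from it when the disease is rare: if log relative risk in the population is normal with standard deviation sigma, the distribution in cases is shifted upward by sigma squared, which gives AUC = Phi(sigma / sqrt(2)). The abstract does not spell out its formulas, so these expressions, the rare-disease approximation, and the function names are assumptions for illustration.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and SD 1

def auc_lognormal(sigma):
    """AUC when log risk is N(mu, sigma^2) in non-cases and N(mu + sigma^2, sigma^2) in cases."""
    return STD_NORMAL.cdf(sigma / 2 ** 0.5)

def sigma_from_auc(auc):
    """Invert auc_lognormal: the SD of log risk implied by a given AUC."""
    return 2 ** 0.5 * STD_NORMAL.inv_cdf(auc)

def case_fraction_above_quantile(sigma, q):
    """Share of cases whose risk exceeds the q-th quantile of population risk
    (q = 0.9 corresponds to the highest decile)."""
    return 1 - STD_NORMAL.cdf(STD_NORMAL.inv_cdf(q) - sigma)

def false_positive_ratio(ppv):
    """Non-cases per case among those above the threshold, given the positive predictive value."""
    return (1 - ppv) / ppv
```

For instance, `case_fraction_above_quantile(sigma_from_auc(a), 0.9)` converts an AUC `a` into the share of cases expected in the top risk decile under these assumptions.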
Age-specific discriminatory accuracy (AUC) for models including FH and foreseeable SNPs ranged from 0.575 for ovarian cancer to 0.694 for prostate cancer. The proportions of patients in the highest decile of population risk ranged from 16.2% for ovarian cancer to 29.4% for prostate cancer. The corresponding false-positive ratios were 241 for colorectal cancer, 610 for ovarian cancer, and 138 or 280 for breast cancer in women age 50 to 54 or 40 to 44 years, respectively.
Foreseeable common SNP discoveries may not permit identification of small subsets of patients that contain most cancers. Usefulness of screening could be diminished by many false positives. Additional strong risk factors are needed to improve risk discrimination.
International Collaborative Effort on Chronic Obstructive Lung Disease: Exacerbation Risk Index Cohorts (ICE COLD ERIC) is a prospective cohort study with chronic obstructive pulmonary disease (COPD) patients from Switzerland and The Netherlands designed to develop and validate practical COPD risk indices that predict the clinical course of COPD patients in primary care. This paper describes the characteristics of the cohorts at baseline.
Material and methods
Standardized assessments included lung function, patient history, self-administered questionnaires, exercise capacity, and a venous blood sample for analysis of biomarkers and genetics.
A total of 260 Dutch and 151 Swiss patients were included. Median age was 66 years, 57% were male, 38% were current smokers, 55% were former smokers, and 76% had at least one and 40% had two or more comorbidities with cardiovascular disease being the most prevalent one. The use of any pulmonary and cardiovascular drugs was 84% and 66%, respectively. Although lung function results (median forced expiratory volume in 1 second [FEV1] was 59% of predicted) were similar across the two cohorts, Swiss patients reported better COPD-specific health-related quality of life (Chronic Respiratory Questionnaire) and had higher exercise capacity.
COPD patients in the ICE COLD ERIC study represent a wide range of disease severities and the prevalence of multimorbidity is high. The rich variation in these primary care cohorts offers good opportunities to learn more about the clinical course of COPD.
COPD; exacerbation; health-related quality of life; prediction; prognosis
Affordable early screening in subjects with high risk of lung cancer has great potential to improve survival from this deadly disease. We measured gene expression from lung tissue and peripheral whole blood (PWB) from adenocarcinoma cases and controls to identify dysregulated lung cancer genes that could be tested in blood to improve identification of at-risk patients in the future. Genome-wide mRNA expression analysis was conducted in 153 subjects (73 adenocarcinoma cases, 80 controls) from the Environment And Genetics in Lung cancer Etiology (EAGLE) study using PWB and paired snap-frozen tumor and non-involved lung tissue samples. Analyses were conducted using unpaired t-tests, linear mixed effects and ANOVA models. The area under the receiver operating characteristic curve (AUC) was computed to assess the predictive accuracy of the identified biomarkers. We identified 50 dysregulated genes in stage I adenocarcinoma versus control PWB samples (False Discovery Rate ≤0.1, fold change ≥1.5 or ≤0.66). Among them, eight (TGFBR3, RUNX3, TRGC2, TRGV9, TARP, ACP1, VCAN, and TSTA3) differentiated paired tumor versus non-involved lung tissue samples in stage I cases, suggesting a similar pattern of lung cancer-related changes in PWB and lung tissue. These results were confirmed in two independent gene expression analyses in a blood-based case-control study (n=212) and a tumor-non tumor paired tissue study (n=54). The eight genes discriminated patients with lung cancer from healthy controls with high accuracy (AUC=0.81, 95% CI=0.74–0.87). Our finding suggests the use of gene expression from PWB for the identification of early detection markers of lung cancer in the future.
microarray gene expression; peripheral blood; lung cancer; stage I
Smoking is a prominent risk factor for lung cancer. However, it is not an established prognostic factor for lung cancer in clinics. To date, no gene test is available for diagnostic screening of lung cancer risk or prognostication of clinical outcome in smokers. This study sought to identify a smoking associated gene signature in order to provide a more precise diagnosis and prognosis of lung cancer in smokers.
Methods and materials
An implication network based methodology was used to identify biomarkers by modeling crosstalk with major lung cancer signaling pathways. Specifically, the methodology contains the following steps: 1) identifying genes significantly associated with lung cancer survival; 2) selecting candidate genes which are differentially expressed in smokers versus non-smokers from the survival genes identified in Step 1; 3) from these candidate genes, constructing gene coexpression networks based on prediction logic for the smoker group and the non-smoker group, respectively; 4) identifying smoking-mediated differential components, i.e., the unique gene coexpression patterns specific to each group; and 5) from the differential components, identifying genes directly co-expressed with major lung cancer signaling hallmarks.
A smoking-associated 6-gene signature was identified for prognosis of lung cancer from a training cohort (n=256). The 6-gene signature could separate lung cancer patients into two risk groups with distinct post-operative survival (log-rank P < 0.04, Kaplan-Meier analyses) in three independent cohorts (n=427). The expression-defined prognostic prediction is strongly related to smoking association and smoking cessation (P < 0.02; Pearson’s Chi-squared tests). The 6-gene signature is an accurate prognostic factor (hazard ratio = 1.89, 95% CI: [1.04, 3.43]) compared to common clinical covariates in multivariate Cox analysis. The 6-gene signature also provides an accurate diagnosis of lung cancer with an overall accuracy of 73% in a cohort of smokers (n=164). The coexpression patterns derived from the implication networks were validated with interactions reported in the literature retrieved with STRING8, Ingenuity Pathway Analysis, and Pathway Studio.
The pathway-based approach identified a smoking-associated 6-gene signature that predicts lung cancer risk and survival. This gene signature has potential clinical implications in the diagnosis and prognosis of lung cancer in smokers.
implication networks based on prediction logic; gene coexpression networks based on formal logic; smoking; gene signature; lung cancer diagnosis and prognosis; signaling pathways
Lung cancer is the leading cause of cancer death in the United States, and the majority of diagnoses are made in former smokers. While avoidance of tobacco abuse and smoking cessation clearly will have the greatest impact on lung cancer development, effective chemoprevention could prove to be more effective than treatment of established disease. Chemoprevention is the use of dietary or pharmaceutical agents to reverse or inhibit the carcinogenic process and has been successfully applied to common malignancies other than lung. Despite previous studies in lung cancer chemoprevention failing to identify effective agents, our ability to determine higher risk populations and the understanding of lung tumor and pre-malignant biology continues to advance. Additional biomarkers of risk continue to be investigated and validated. The World Health Organization/International Association for the Study of Lung Cancer classification for lung cancer now recognizes distinct histologic lesions that can be reproducibly graded as precursors of non–small cell lung cancer. For example, carcinogenesis in the bronchial epithelium starts with normal epithelium and progresses through hyperplasia, metaplasia, dysplasia, and carcinoma in situ to invasive squamous cell cancer. Similar precursor lesions exist for adenocarcinoma, and these pre-malignant lesions are targeted by chemopreventive agents in current and future trials. At this time, chemopreventive agents can only be recommended as part of well-designed clinical trials, and multiple trials are currently in progress and additional trials are in the planning stages. This review will discuss the principles of chemoprevention, summarize the completed trials, and discuss ongoing and potential future trials with a focus on targeted pathways.
lung cancer; chemoprevention; premalignancy
Lung cancer is a complex polygenic disease. Although recent genome-wide association (GWA) studies have identified multiple susceptibility loci for lung cancer, most of these variants have not been validated in a Chinese population. In this study, we investigated whether a genetic risk score combining multiple.
Five single-nucleotide polymorphisms (SNPs) identified in previous GWA or large cohort studies were genotyped in 5068 Chinese case–control subjects. The genetic risk score (GRS) based on these SNPs was estimated by two approaches: a simple risk alleles count (cGRS) and a weighted (wGRS) method. The area under the receiver operating characteristic (ROC) curve (AUC) in combination with the bootstrap resampling method was used to assess the predictive performance of the genetic risk score for lung cancer.
Four independent SNPs (rs2736100, rs402710, rs4488809 and rs4083914) were found to be associated with a risk of lung cancer. The wGRS based on these four SNPs was a better predictor than cGRS. Using a liability threshold model, we estimated that these four SNPs accounted for only 4.02% of genetic variance in lung cancer. Smoking history contributed significantly to lung cancer risk (P < 0.001) [AUC = 0.619 (0.603-0.634)], and when combined with wGRS gave an AUC value of 0.639 (0.621-0.652) after adjustment for over-fitting. This model shows promise for assessing lung cancer risk in a Chinese population.
Our results indicate that although genetic variants related to lung cancer added only moderate discriminatory accuracy, they still improved the predictive ability of the assessment model in a Chinese population.
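The two scoring approaches can be sketched as follows. The abstract does not give the weighting scheme, so this sketch assumes the common convention of weighting each risk-allele count by the log odds ratio of its SNP; the odds ratios below are hypothetical placeholders, not values from the study.

```python
import math

# HYPOTHETICAL per-allele odds ratios, for illustration only (not study estimates).
HYPOTHETICAL_ODDS_RATIOS = {
    "rs2736100": 1.20,
    "rs402710": 1.10,
    "rs4488809": 1.15,
    "rs4083914": 1.30,
}

def cgrs(genotype):
    """Count-based score: total number of risk alleles (0, 1 or 2 per SNP)."""
    return sum(genotype.values())

def wgrs(genotype, odds_ratios):
    """Weighted score. ASSUMPTION: each risk-allele count is weighted by log(OR)."""
    return sum(math.log(odds_ratios[snp]) * count for snp, count in genotype.items())

# One person's risk-allele counts, keyed by SNP id.
example_genotype = {"rs2736100": 2, "rs402710": 1, "rs4488809": 0, "rs4083914": 1}
```

With equal odds ratios the two scores rank people identically; they diverge only when SNP effect sizes differ.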
Chinese; Cumulative risk; Genetic risk score; Lung cancer; Risk assessment
Gene promoter hypermethylation in sputum is a promising biomarker for predicting lung cancer. Identifying factors that predispose smokers to methylation of multiple gene promoters in the lung could impact strategies for early detection and chemoprevention. This study evaluated the hypothesis that double-strand break repair capacity and sequence variation in genes in this pathway are associated with a high methylation index in a cohort of current and former cancer-free smokers. A 50% reduction in the mean level of double-strand break repair capacity was seen in lymphocytes from smokers with a high methylation index, defined as ≥ 3 of 8 genes methylated in sputum, compared to smokers with no genes methylated. The classification accuracy for predicting risk for methylation was 88%. Single nucleotide polymorphisms within the MRE11A, CHEK2, XRCC3, DNA-Pkc, and NBN DNA repair genes were highly associated with the methylation index. A 14.5-fold increase in the odds of high methylation was seen for persons with ≥ 7 risk alleles of these genes. Promoter activity of the MRE11A gene, which plays a critical role in recognition of DNA damage and activation of ATM, was reduced in persons with the risk allele. Collectively, ours is the first population-based study to identify double-strand break DNA repair capacity and specific genes within this pathway as critical determinants for gene methylation in sputum, which is, in turn, associated with elevated risk for lung cancer.
promoter methylation; DNA double strand break; single nucleotide polymorphism; DNA repair capacity; association study
External validation of existing lung cancer risk prediction models is limited. Using such models in clinical practice to guide the referral of patients for computed tomography (CT) screening for lung cancer depends on external validation and evidence of predicted clinical benefit.
To evaluate the discrimination of the Liverpool Lung Project (LLP) risk model and demonstrate its predicted benefit for stratifying patients for CT screening by using data from 3 independent studies from Europe and North America.
Case–control and prospective cohort study.
Europe and North America.
Participants in the European Early Lung Cancer (EUELC) and Harvard case–control studies and the LLP population-based prospective cohort (LLPC) study.
5-year absolute risks for lung cancer predicted by the LLP model.
The LLP risk model had good discrimination in both the Harvard (area under the receiver-operating characteristic curve [AUC], 0.76 [95% CI, 0.75 to 0.78]) and the LLPC (AUC, 0.82 [CI, 0.80 to 0.85]) studies and modest discrimination in the EUELC (AUC, 0.67 [CI, 0.64 to 0.69]) study. The decision utility analysis, which incorporates the harms and benefit of using a risk model to make clinical decisions, indicates that the LLP risk model performed better than smoking duration or family history alone in stratifying high-risk patients for lung cancer CT screening.
The model cannot assess whether including other risk factors, such as lung function or genetic markers, would improve accuracy. Lack of information on asbestos exposure in the LLPC limited the ability to validate the complete LLP risk model.
Validation of the LLP risk model in 3 independent external data sets demonstrated good discrimination and evidence of predicted benefits for stratifying patients for lung cancer CT screening. Further studies are needed to prospectively evaluate model performance and evaluate the optimal population risk thresholds for initiating lung cancer screening.
Primary Funding Source
Roy Castle Lung Cancer Foundation.
Mammographic percent density (PD) is a strong risk factor for breast cancer, but there has been relatively little systematic evaluation of other features in mammographic images that might additionally predict breast cancer risk. We evaluated the association of a large number of image texture features with risk of breast cancer using a clinic-based case-control study of digitized film mammograms, all with screening mammograms prior to breast cancer diagnosis. The sample was split into training (123 cases, 258 controls) and validation (123 cases, 264 controls) datasets. Age and body mass index (BMI)-adjusted odds ratios (ORs) per standard deviation change in the feature, 95% confidence intervals, and the area under the receiver operating characteristic curve (AUC) were obtained using logistic regression. A bootstrap approach was used to identify the strongest features in the training dataset, and results for features that validated in the second half of the sample were reported using the full dataset. The mean age at mammography was 64.0 ± 10.2 years, and the mean time from mammography to breast cancer was 3.7 ± 1.0 years (range 2.0-5.9). PD was associated with breast cancer risk (OR=1.49; 1.25-1.78). The strongest features that validated from each of several classes (Markovian, run-length, Laws, wavelet and Fourier) showed ORs similar to that of PD and predicted breast cancer at a similar magnitude (AUC=0.58-0.60) as PD (AUC=0.58). All of these features were automatically calculated (unlike PD) and measure texture at a coarse scale. These features were moderately correlated with PD (r = 0.39-0.64), and after adjustment for PD, each of the features attenuated only slightly and retained statistical significance. However, simultaneous inclusion of these features in a model with PD did not significantly improve the ability to predict breast cancer.
mammographic density; computerized image analysis; breast cancer
It has been recognized that patients with non-small cell lung cancer who are lifelong never-smokers constitute a distinct clinical entity. The aim of this study was to assess clinical risk factors for survival among never-smokers with non-small cell lung cancer.
All consecutive patients diagnosed with non-small cell lung cancer between May 2005 and May 2009 (n = 285) were included. The clinical characteristics of never-smokers and ever-smokers (former and current) were compared using chi-squared or Student's t tests. Survival curves were calculated using the Kaplan-Meier method, and log-rank tests were used for survival comparisons. A Cox proportional hazards regression analysis was performed, adjusting for age (continuous variable), gender (female vs. male), smoking status (never- vs. ever-smoker), the Karnofsky Performance Status Scale (continuous variable), histological type (adenocarcinoma vs. non-adenocarcinoma), AJCC staging (early vs. advanced staging), and treatment (chemotherapy and/or radiotherapy vs. best supportive care).
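The Kaplan-Meier method mentioned above can be sketched in a few lines. This is a minimal product-limit estimator on made-up follow-up times, not the study data; the function names and the toy values are assumptions for illustration only.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times  : follow-up time for each patient
    events : 1 if death was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving
    return curve

def median_survival(curve):
    """First event time at which estimated survival drops to 0.5 or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached

# Hypothetical follow-up in months; 0 marks a censored observation.
toy_times = [2, 3, 3, 5, 8]
toy_events = [1, 1, 0, 1, 0]
toy_curve = kaplan_meier(toy_times, toy_events)
print(toy_curve, median_survival(toy_curve))
```

Censored patients leave the risk set without lowering the curve, which is what distinguishes this estimate from a simple proportion of survivors.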
Of the 285 non-small cell lung cancer patients, 56 patients were never-smokers. Univariate analyses indicated that the never-smoker patients were more likely to be female (68% vs. 32%) and have adenocarcinoma (70% vs. 51%). Overall median survival was 15.7 months (95% CI: 13.2 to 18.2). Never-smokers had better survival than ever-smokers. Never-smoker status, higher Karnofsky Performance Status, early staging, and treatment were independent and favorable prognostic factors for survival after adjusting for age, gender, and adenocarcinoma in multivariate analysis.
Epidemiological differences exist between never- and ever-smokers with lung cancer. Overall survival among never-smokers was found to be higher and independent of gender and histological type.
Lung neoplasm; Non-small cell lung cancer; Adenocarcinoma; Never-smoker; Smoking
A physiologically based pharmacokinetic (PBPK) model was developed that provides a comprehensive description of the kinetics of trichloroethylene (TCE) and its metabolites, trichloroethanol (TCOH), trichloroacetic acid (TCA), and dichloroacetic acid (DCA), in the mouse, rat, and human for both oral and inhalation exposure. The model includes descriptions of the three principal target tissues for cancer identified in animal bioassays: liver, lung, and kidney. Cancer dose metrics provided in the model include the area under the concentration curve (AUC) for TCA and DCA in the plasma, the peak concentration and AUC for chloral in the tracheobronchial region of the lung, and the production of a thioacetylating intermediate from dichlorovinylcysteine in the kidney. Additional dose metrics provided for noncancer risk assessment include the peak concentrations and AUCs for TCE and TCOH in the blood, as well as the total metabolism of TCE divided by the body weight. Sensitivity and uncertainty analyses were performed on the model to evaluate its suitability for use in a pharmacokinetic risk assessment for TCE. Model predictions of TCE, TCA, DCA, and TCOH concentrations in rodents and humans are in good agreement with a variety of experimental data, suggesting that the model should provide a useful basis for evaluating cross-species differences in pharmacokinetics for these chemicals. In the case of the lung and kidney target tissues, however, only limited data are available for establishing cross-species pharmacokinetics. As a result, PBPK model calculations of target tissue dose for lung and kidney should be used with caution.
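Two of the dose metrics named above, the peak concentration and the area under the concentration curve, can be approximated from a sampled concentration-time profile. The sketch below uses the trapezoidal rule on invented values; it does not reproduce the PBPK model, which would generate such profiles by solving its own equations, and all names and numbers here are hypothetical.

```python
def auc_trapezoid(times, concentrations):
    """Area under a sampled concentration-time curve (trapezoidal rule)."""
    area = 0.0
    for k in range(1, len(times)):
        width = times[k] - times[k - 1]
        mean_height = (concentrations[k] + concentrations[k - 1]) / 2
        area += width * mean_height
    return area

# Hypothetical plasma profile: time in hours, concentration in mg/L.
sample_times = [0, 1, 2, 4]
sample_conc = [0.0, 4.0, 3.0, 1.0]

dose_auc = auc_trapezoid(sample_times, sample_conc)  # mg*h/L
dose_peak = max(sample_conc)                         # mg/L
print(dose_auc, dose_peak)
```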
Lung cancer in never-smokers ranks as the seventh most common cause of cancer death worldwide, and the incidence of lung cancer in non-smoking Korean women appears to be steadily increasing. To identify the effect of genetic polymorphisms on lung cancer risk in non-smoking Korean women, we conducted a genome-wide association study of Korean female non-smokers with lung cancer. We analyzed 440,794 genotype data of 285 cases and 1,455 controls, and nineteen SNPs were associated with lung cancer development (P < 0.001). For external validation, the nineteen SNPs were tested in another sample set composed of 293 cases and 495 controls, and only rs10187911 on 2p16.3 was significantly associated with lung cancer development (dominant model, OR of TG or GG, 1.58, P = 0.025). We confirmed this SNP again in another replication set composed of 546 cases and 744 controls (recessive model, OR of GG, 1.32, P = 0.027). The OR and P value in the combined set were 1.37 and < 0.001 in the additive model, 1.51 and < 0.001 in the dominant model, and 1.54 and < 0.001 in the recessive model. The effect of this SNP was found to be consistent only in adenocarcinoma patients (1.36 and < 0.001 in the additive model, 1.49 and < 0.001 in the dominant model, and 1.54 and < 0.001 in the recessive model). Furthermore, after imputation with HapMap data, we found regional significance near rs10187911, and five SNPs showed P values lower than that of rs10187911 (rs12478012, rs4377361, rs13005521, rs12475464, and rs7564130). Therefore, we concluded that a region on chromosome 2 is significantly associated with lung cancer risk in Korean non-smoking women.
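The additive, dominant, and recessive models differ only in how a genotype is turned into a number before the odds ratio is estimated. A minimal sketch of that coding follows, taking G as the risk allele because the abstract reports the dominant OR for "TG or GG" and the recessive OR for "GG". The function names and the 2x2 counts are hypothetical and are not the study's data.

```python
def encode_genotype(genotype, model, risk_allele="G"):
    """Code a two-letter genotype under a given genetic model."""
    copies = genotype.count(risk_allele)
    if model == "additive":
        return copies            # 0, 1, or 2 copies of the risk allele
    if model == "dominant":
        return int(copies >= 1)  # TG or GG vs. TT
    if model == "recessive":
        return int(copies == 2)  # GG vs. TT or TG
    raise ValueError("unknown model: " + model)

def odds_ratio(exposed_cases, unexposed_cases,
               exposed_controls, unexposed_controls):
    """Unadjusted odds ratio from a 2x2 table."""
    return (exposed_cases * unexposed_controls) / (
        unexposed_cases * exposed_controls)

codes = {m: [encode_genotype(g, m) for g in ("TT", "TG", "GG")]
         for m in ("additive", "dominant", "recessive")}
print(codes)

# Hypothetical counts under the dominant coding (carriers vs. non-carriers).
example_or = odds_ratio(60, 40, 45, 55)
print(round(example_or, 2))
```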
Lung Neoplasms; Genome-Wide Association Study; Non-Smoking Women
Identifying people at higher risk of having squamous dysplasia, the precursor lesion for esophageal squamous cell carcinoma (ESCC), would allow targeted endoscopic screening.
We used multivariate logistic regression models to predict ESCC and dysplasia as outcomes. The ESCC model was based on data from the Golestan Case-Control Study (total n=871; cases=300), and the dysplasia model was based on data from a cohort of subjects from a GI clinic in Northeast Iran (total n=724; cases=26). In each of these analyses, we fit a model including all risk factors known in this region to be associated with ESCC. Individual risks were calculated using the linear combination of estimated regression coefficients and individual-specific values for covariates. We used cross-validation to determine the area under the curve (AUC) and to find the optimal cut points for each of the models.
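The individual risk calculation described above, a linear combination of regression coefficients and covariate values, can be sketched as follows. The inverse-logit step is the standard way to convert a logistic model's linear predictor into a probability; the coefficient values and covariate names below are invented for illustration and are not the fitted Golestan estimates.

```python
import math

def predicted_risk(intercept, coefficients, covariates):
    """Predicted probability from a fitted logistic regression model."""
    linear_predictor = intercept + sum(
        coefficients[name] * covariates[name] for name in coefficients)
    return 1 / (1 + math.exp(-linear_predictor))

# Hypothetical coefficients (log odds ratios) and one person's covariates.
toy_coefficients = {"age_per_decade": 0.4, "tobacco_use": 0.7}
toy_person = {"age_per_decade": 6.0, "tobacco_use": 1}

toy_risk = predicted_risk(-3.0, toy_coefficients, toy_person)
print(round(toy_risk, 3))
```

A person is then classified as high risk when the predicted probability exceeds the chosen cut point.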
The model had an area under the curve of 0.77 (95% CI: 0.74–0.80) to predict ESCC with 74% sensitivity and 70.4% specificity for the optimum cut point. The area under the curve was 0.71 (95% CI: 0.64–0.79) for dysplasia diagnosis, and the classification table optimized at 61.5% sensitivity and 69.5% specificity. In this population, the positive and negative predictive values for diagnosis of dysplasia were 6.8% and 97.8%, respectively.
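The low positive predictive value follows from the low prevalence of dysplasia in the clinic cohort (26 of 724 subjects). The sketch below applies Bayes' rule to the reported sensitivity and specificity. It yields values close to, but not identical to, the reported 6.8% and 97.8%; the abstract does not give the exact counts behind those figures, so this is only an approximate check with a hypothetical function name.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Positive and negative predictive values via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv

dysplasia_prevalence = 26 / 724  # cases / total in the dysplasia cohort
ppv, npv = predictive_values(0.615, 0.695, dysplasia_prevalence)
print(round(100 * ppv, 1), round(100 * npv, 1))
```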
Our models discriminated between ESCC cases and controls in about 77% of cases, and between individuals with and without squamous dysplasia in about 70% of cases. Using risk factors to predict individual risk of ESCC or squamous dysplasia still has limited application in clinical practice, but such models may be suitable for selecting high risk individuals in research studies, or increasing the pretest probability for other screening strategies.
Occupational medicine physicians are frequently asked to establish cancer causation in patients with both workplace and non-workplace exposures. This is especially difficult in cases involving beryllium for which the data on human carcinogenicity are limited and controversial. In this report we present the case of a 73-year-old former technician at a government research facility who was recently diagnosed with lung cancer. The patient is a former smoker who has worked with both beryllium and asbestos. He was referred to the University of California, San Francisco, Occupational and Environmental Medicine Clinic at San Francisco General Hospital for an evaluation of whether past workplace exposures may have contributed to his current disease. The goal of this paper is to provide an example of the use of data-based risk estimates to determine causation in patients with multiple exposures. To do this, we review the current knowledge of lung cancer risks in former smokers and asbestos workers, and evaluate the controversies surrounding the epidemiologic data linking beryllium and cancer. Based on this information, we estimated that the patient's risk of lung cancer from asbestos was less than his risk from tobacco smoke, whereas his risk from beryllium was approximately equal to his risk from smoking. Based on these estimates, the patient's workplace was considered a probable contributing factor to his development of lung cancer.
The APACHE score for critically ill patients has provided a method of predicting outcome using major physiologic variables. We hypothesized that a physiology score for stroke patients (Acute Physiology of Stroke Score (APSS)) when added to a validated clinical prediction model would improve outcome prediction.
The APSS was developed and validated using multivariable logistic regression. It was added to a previously validated clinical model to assess for increased AUC in predicting 3 month outcome.
The bootstrap validated bias-corrected AUC for just the APSS predicting alive/dead at discharge was 0.753. The clinical model AUC ranged from 0.77–0.88 and the addition of the APSS resulted in AUCs of 0.77–0.89.
These data suggest that the APSS is related to three month clinical outcome in ischemic stroke patients. However, the APSS adds no clinically relevant additional predictive value when added to our previously validated clinical prediction model.
Stroke; prediction; prognosis; outcome
To assess the predictive value of the European Randomized Screening of Prostate Cancer Risk Calculator (ERSPC-RC) and the Prostate Cancer Prevention Trial Risk Calculator (PCPT-RC) in the Korean population.
Materials and Methods
We retrospectively analyzed the data of 517 men who underwent transrectal ultrasound guided prostate biopsy between January 2008 and November 2010. Simple and multiple logistic regression analyses were performed to compare the results of prostate biopsy. Areas under the receiver operating characteristic curves (AUC-ROC) and calibration plots were prepared for further analysis to compare the risk calculators and other clinical variables.
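The AUC used to compare the calculators can be read as the probability that a randomly chosen man with cancer receives a higher predicted risk than a randomly chosen man without cancer. A minimal sketch of that pairwise definition follows, with ties counted as one half. The scores are invented, and the study's own software and method for comparing AUCs are not stated in the abstract.

```python
def auc_pairwise(scores_positive, scores_negative):
    """AUC as the share of case/non-case pairs ranked correctly."""
    concordant = 0.0
    for pos in scores_positive:
        for neg in scores_negative:
            if pos > neg:
                concordant += 1.0
            elif pos == neg:
                concordant += 0.5  # ties count as half
    return concordant / (len(scores_positive) * len(scores_negative))

# Hypothetical predicted risks for biopsy-positive and biopsy-negative men.
cancer_scores = [0.9, 0.6, 0.4]
no_cancer_scores = [0.5, 0.3, 0.2, 0.4]

toy_auc = auc_pairwise(cancer_scores, no_cancer_scores)
print(toy_auc)
```

An AUC of 0.5 corresponds to chance ranking and 1.0 to perfect separation, so the reported 77.4% versus 64.5% reflects a difference in how well the calculators rank patients, not in how well they are calibrated.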
Prostate cancer was diagnosed in 125 (24.1%) men. For prostate cancer prediction, the area under curve (AUC) of the ERSPC-RC was 77.4%. This result was significantly greater than the AUCs of the PCPT-RC and the prostate-specific antigen (PSA) (64.5% and 64.1%, respectively, p<0.01), but not significantly different from the AUC of the PSA density (PSAD) (76.1%, p=0.540). When the results of the calibration plots were compared, the ERSPC-RC plot was more constant than that of PSAD.
The ERSPC-RC was better than PCPT-RC and PSA in predicting prostate cancer risk in the present study. However, the difference in performance between the ERSPC-RC and PSAD was not significant. Therefore, the Western based prostate cancer risk calculators are not useful for urologists in predicting prostate cancer in the Korean population.
Korean; prostate cancer; biopsy; nomogram; validation study
The primary hypothesis to be tested in this study was that the diagnostic performance (as assessed by the area under the receiver operator characteristic curve, AUC) of a multianalyte panel to correctly identify women with ovarian cancer was significantly greater than that for CA-125 alone.
A retrospective, case–control study (phase II biomarker trial) was conducted that involved 362 plasma samples obtained from women with ovarian cancer (n = 150) and healthy controls (n = 212). A multivariate classification model was developed from a modelling cohort (n = 179) that incorporated five biomarkers of ovarian cancer: CA-125, C-reactive protein (CRP), serum amyloid A (SAA), interleukin 6 (IL-6), and interleukin 8 (IL-8). The performance of the model was evaluated using an independent validation cohort (n = 183) and compared with that of CA-125 alone.
The AUC for the biomarker panel was significantly greater than the AUC for CA-125 alone for a validation cohort (p < 0.01) and an early stage disease cohort (i.e. Stages I and II; p < 0.01). At a threshold of 0.3, the sensitivity and specificity of the multianalyte panel were 94.1 and 91.3%, respectively, for the validation cohort and 92.3 and 91.3%, respectively, for an early stage disease cohort.
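Sensitivity and specificity at a fixed threshold, such as the 0.3 used above, come from dichotomizing the model output and counting correct classifications in each group. The sketch below assumes that an output at or above the threshold counts as a positive call, which the abstract does not specify; the function name and the scores are hypothetical.

```python
def sensitivity_specificity(scores, labels, threshold=0.3):
    """Sensitivity and specificity when scores >= threshold are called positive.

    labels: 1 for ovarian cancer, 0 for healthy control.
    """
    true_pos = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    false_neg = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    true_neg = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    false_pos = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return (true_pos / (true_pos + false_neg),
            true_neg / (true_neg + false_pos))

# Hypothetical model outputs for three cases and three controls.
panel_scores = [0.90, 0.35, 0.10, 0.20, 0.50, 0.05]
panel_labels = [1, 1, 1, 0, 0, 0]

sens, spec = sensitivity_specificity(panel_scores, panel_labels)
print(round(sens, 3), round(spec, 3))
```

Lowering the threshold raises sensitivity at the cost of specificity, which is why a single operating point should be read alongside the AUC.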
The use of a panel of plasma biomarkers for the identification of women with ovarian cancer delivers a significant increase in diagnostic performance when compared to the performance of CA-125 alone.
Ovarian cancer; Diagnostic; Multivariate classification