“Discovery” research about molecular markers for diagnosis, prognosis, or prediction of response to therapy has frequently produced results that were not reproducible in subsequent studies. What are the reasons, and can observational cohorts be cultivated to provide strong and reliable answers to those questions?
Selected examples are used to illustrate: 1) What features of research design provide strength and reliability in observational studies about markers of diagnosis, prognosis, and response to therapy? 2) How can those design features be cultivated in existing observational cohorts, for example within RCTs, other existing observational research studies, or practice settings like HMOs?
Examples include a study of RNA expression profiles of tumor tissue to predict prognosis of breast cancer; a study of serum proteomics profiles to diagnose ovarian cancer; and a study of stool-based DNA assays to screen for colon cancer. Strengths and weaknesses of observational study design features are discussed, along with lessons about how features that help assure strength might be “cultivated” in the future.
Conclusions and Impact
By considering these examples and others, it may be possible to develop a process of “cultivating cohorts” - in on-going RCTs, observational cohort studies, and practice settings like HMOs - that have strong features of study design. Such an effort could produce sources of data and specimens to reliably answer questions about the use of molecular markers in diagnosis, prognosis, and response to therapy.
translational research; molecular markers; observational studies; molecular diagnosis; molecular prognosis
A vast range of disorders—from indolent to fast-growing lesions—are labelled as cancer. Therefore, we believe that several changes should be made to the approach to cancer screening and care, such as use of new terminology for indolent and precancerous disorders. We propose the term indolent lesion of epithelial origin, or IDLE, for those lesions (currently labelled as cancers) and their precursors that are unlikely to cause harm if they are left untreated. Furthermore, precursors of cancer or high-risk disorders should not have the term cancer in them. The rationale for this change in approach is that indolent lesions with low malignant potential are common, and screening brings indolent lesions and their precursors to clinical attention, which leads to overdiagnosis and, if unrecognised, possible overtreatment. To minimise that potential, new strategies should be adopted to better define and manage IDLEs. Screening guidelines should be revised to lower the chance of detection of minimal-risk IDLEs and inconsequential cancers with the same energy traditionally used to increase the sensitivity of screening tests. Changing the terminology for some of the lesions currently referred to as cancer will allow physicians to shift medicolegal notions and perceived risk to reflect the evolving understanding of biology, be more judicious about when a biopsy should be done, and organise studies and registries that offer observation or less invasive approaches for indolent disease. Emphasis on avoidance of harm while assuring benefit will improve screening and treatment of patients and will be equally effective in the prevention of death from cancer.
Claims about the diagnostic or prognostic accuracy of markers often prove disappointing when “discrimination” found between cancers versus normals is due to bias, a systematic difference between compared groups. This article describes a framework to help simplify and organize current problems in marker research by focusing on the role of specimens as a source of bias in observational research and using that focus to address problems and improve reliability. The central idea is that the “fundamental comparison” in research about markers (ie, the comparison done to assess whether a marker discriminates) involves two distinct processes that are “connected” by specimens. If subject selection (first process) creates baseline inequality between groups being compared, then laboratory analysis of specimens (second process) may erroneously find positive results. Although both processes are important, subject selection more fundamentally influences the quality of marker research, because it can hardwire bias into all comparisons in a way that cannot be corrected by any refinement in laboratory analysis. An appreciation of the separateness of these two processes—and placing investigators with appropriate expertise in charge of each—may increase the reliability of research about cancer biomarkers.
Policy makers will need to consider if it has one, not only as an adjunct to gFOBT screening, but also as a primary screening test
colorectal cancer; screening; faecal occult blood test
In 2012, the National Cancer Institute (NCI) engaged the scientific community to provide a vision for cancer epidemiology in the 21st century. Eight overarching thematic recommendations, with proposed corresponding actions for consideration by funding agencies, professional societies, and the research community emerged from the collective intellectual discourse. The themes are (i) extending the reach of epidemiology beyond discovery and etiologic research to include multilevel analysis, intervention evaluation, implementation, and outcomes research; (ii) transforming the practice of epidemiology by moving towards more access and sharing of protocols, data, metadata, and specimens to foster collaboration, to ensure reproducibility and replication, and accelerate translation; (iii) expanding cohort studies to collect exposure, clinical and other information across the life course and examining multiple health-related endpoints; (iv) developing and validating reliable methods and technologies to quantify exposures and outcomes on a massive scale, and to assess concomitantly the role of multiple factors in complex diseases; (v) integrating “big data” science into the practice of epidemiology; (vi) expanding knowledge integration to drive research, policy and practice; (vii) transforming training of 21st century epidemiologists to address interdisciplinary and translational research; and (viii) optimizing the use of resources and infrastructure for epidemiologic studies. These recommendations can transform cancer epidemiology and the field of epidemiology in general, by enhancing transparency, interdisciplinary collaboration, and strategic applications of new technologies. They should lay a strong scientific foundation for accelerated translation of scientific discoveries into individual and population health benefits.
big data; clinical trials; cohort studies; epidemiology; genomics; medicine; public health; technologies; training; translational research
A non-invasive blood test that could reliably detect early CRC or large adenomas would provide an important advance in colon cancer screening. The purpose of this study was to determine whether a serum proteomics assay could discriminate between persons with and without a large (≥1 cm) colon adenoma. To avoid problems of ‘bias’ that have affected many studies about molecular markers for diagnosis, specimens were obtained from a previously-conducted study of CRC etiology in which blood samples had been collected before the presence or absence of neoplasm had been determined by colonoscopy, helping to assure that biases related to differences in sample collection and handling would be avoided. Mass spectra of 65 unblinded serum samples were acquired using a nano-electrospray ionization source on a QSTAR-XL mass spectrometer. Classification patterns were developed using the ProteomeQuest® algorithm, performing measurements twice on each specimen, and then applied to a blinded validation set of 70 specimens. After removing 33 specimens that had discordant results, the “test group” comprised 37 specimens that had never been used in training. Although in the primary analysis no discrimination was found, a single post-hoc analysis, done after hemolyzed specimens had been removed, showed sensitivity of 78%, specificity of 53%, and an accuracy of 63% (95% CI: 53% to 72%). The results of this study, although preliminary, suggest that further study of serum proteomics, in a larger number of appropriate specimens, could be useful. They also highlight the importance of understanding sources of ‘noise’ and ‘bias’ in studies of proteomics assays.
serum profiling; screening; diagnosis; mass spectrometry; colon neoplasm
Quantifying risk of advanced proximal colorectal neoplasia might allow tailoring of colorectal cancer screening, with colonoscopy for those at high risk, and less invasive screening for very low-risk persons.
We analyzed findings from 10,124 consecutive adults age ≥ 50 years who underwent screening colonoscopy to the cecum between September 1995 and August 2008. We quantified the risk of advanced neoplasia (tubular adenoma ≥ 1 cm; a polyp with villous histology or high-grade dysplasia; or adenocarcinoma) both proximally (cecum to splenic flexure) and distally (descending colon to anus). Prevalence of advanced proximal neoplasia was quantified by age, gender and distal findings.
Mean (s.d.) age was 57.5 (6.0) years; 44% were women; 7835 (77%) had no neoplasia, and 1856 (18%) had one or more non-advanced adenomas. Overall, 433 subjects (4.3%) had advanced neoplasia (267 distally; 196 proximally; 30 both), 33 (0.33%) of which were adenocarcinoma (18 distal, 15 proximal). Risk of advanced proximal neoplasia increased with age decade (1.13%, 2.00%, and 5.26%, respectively; P=0.001) and was higher in men (relative risk [RR] =1.91; CI, 1.32–2.77). In women younger than 70 years, risk was 1.1% overall (vs. 2.2% in men; RR=1.98; CI, 1.42–2.76) and was 0.86% in those with no distal neoplasia (vs. 1.54% in men; RR=1.81; CI, 1.20–2.74).
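The counts reported above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch in Python, not part of the original study; the variable names and the `pct` helper are illustrative, and only the counts themselves come from the abstract.

```python
# Counts as reported in the abstract (10,124 screening colonoscopies).
total = 10124
no_neoplasia = 7835
non_advanced = 1856
adv_distal = 267
adv_proximal = 196
adv_both = 30

# Inclusion-exclusion: subjects with advanced neoplasia at either site
# are counted once, so those with both sites are subtracted once.
adv_any = adv_distal + adv_proximal - adv_both


def pct(count, denom=total):
    """Percentage of the cohort, rounded to one decimal place."""
    return round(100 * count / denom, 1)


print(adv_any)                                  # 433
print(pct(adv_any))                             # 4.3
print(no_neoplasia + non_advanced + adv_any == total)  # True
```

The three mutually exclusive categories (no neoplasia, non-advanced adenoma only, advanced neoplasia) sum to the full cohort, which is consistent with the percentages quoted.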
Risk of advanced proximal neoplasia is a function of age and gender. Women younger than age 70 have a very low risk, particularly those with no distal adenoma. Sigmoidoscopy with or without occult blood testing may be sufficient and even preferable for screening these subgroups.
Cancer screening; colonoscopy; colorectal cancer; colorectal neoplasm
After colon cancer screening, large numbers of persons discovered with colon polyps may receive post-polypectomy surveillance with multiple colonoscopy examinations over time. Decisions about surveillance interval are based in part on polyp size, histology, and number.
To determine physicians’ recommendations for post-polypectomy surveillance from their office charts.
Among 322 physicians performing colonoscopy in 126 practices in North Carolina, offices of 152 physicians in 55 practices were visited to extract chart data, for each physician, on 125 consecutive persons having colonoscopy in 2003. Subjects included persons with first-time colonoscopy and no positive family history or other indication beyond colonoscopy findings that might affect post-polypectomy surveillance recommendations. Data were extracted about demographics, reason for colonoscopy, family history, symptoms, bowel prep, extent of examination, and features of each polyp including location, size, and histology. Recommendations for post-polypectomy surveillance were noted.
Among 10,089 first-time colonoscopy examinations, hyperplastic polyps were found in 4.5% of subjects, in whom follow-up by 4–6 years was recommended in 24%, sooner than recommended in guidelines. Of the 6.6% of persons with only small adenomas, 35% were recommended to return in 1–3 years (sooner than recommended in some guidelines) and 77% by 6 years. Surveillance interval tended to be shorter if colon prep was less than “excellent.” Prep quality was not reported for 32% of examinations.
Surveillance intervals after polypectomy of low-risk polyps may be more aggressive than guidelines recommend. The quality of post-polypectomy surveillance might be improved by increased attention to guidelines, bowel prep, and reporting.
Colonoscopy screening; Colon cancer surveillance; Colonoscopy guidelines; Colonoscopy quality
Although bone mineral density (BMD) testing to screen for osteoporosis (BMD T score, −2.50 or lower) is recommended for women 65 years of age or older, there are few data to guide decisions about the interval between BMD tests.
We studied 4957 women, 67 years of age or older, with normal BMD (T score at the femoral neck and total hip, −1.00 or higher) or osteopenia (T score, −1.01 to −2.49) and with no history of hip or clinical vertebral fracture or of treatment for osteoporosis, followed prospectively for up to 15 years. The BMD testing interval was defined as the estimated time for 10% of women to make the transition to osteoporosis before having a hip or clinical vertebral fracture, with adjustment for estrogen use and clinical risk factors. Transitions from normal BMD and from three subgroups of osteopenia (mild, moderate, and advanced) were analyzed with the use of parametric cumulative incidence models. Incident hip and clinical vertebral fractures and initiation of treatment with bisphosphonates, calcitonin, or raloxifene were treated as competing risks.
The estimated BMD testing interval was 16.8 years (95% confidence interval [CI], 11.5 to 24.6) for women with normal BMD, 17.3 years (95% CI, 13.9 to 21.5) for women with mild osteopenia, 4.7 years (95% CI, 4.2 to 5.2) for women with moderate osteopenia, and 1.1 years (95% CI, 1.0 to 1.3) for women with advanced osteopenia.
Our data indicate that osteoporosis would develop in less than 10% of older, post-menopausal women during rescreening intervals of approximately 15 years for women with normal bone density or mild osteopenia, 5 years for women with moderate osteopenia, and 1 year for women with advanced osteopenia. (Funded by the National Institutes of Health.)
A panel of biomarkers may improve predictive performance over individual markers. Although many biomarker panels have been described for ovarian cancer, few studies used pre-diagnostic samples to assess the potential of the panels for early detection. We conducted a multi-site systematic evaluation of biomarker panels using pre-diagnostic serum samples from the Prostate, Lung, Colorectal, and Ovarian Cancer (PLCO) screening trial.
Using a nested case-control design, levels of 28 biomarkers were measured laboratory-blinded in 118 serum samples obtained before cancer diagnosis and 951 serum samples from matched controls. Five predictive models, each containing 6–8 biomarkers, were evaluated according to a pre-determined analysis plan. Three sequential analyses were conducted: blinded validation of previously established models (Step 1); simultaneous split-sample discovery and validation of models (Step 2); and exploratory discovery of new models (Step 3). Sensitivity, specificity, sensitivity at 98% specificity, and AUC were computed for the models and CA125 alone among 67 cases diagnosed within one year of blood draw and 476 matched controls. In Step 1, one model showed comparable performance to CA125, with sensitivity, specificity and AUC at 69.2%, 96.6% and 0.892, respectively. Remaining models had poorer performance than CA125 alone. In Step 2, we observed a similar pattern. In Step 3, a model derived from all 28 markers failed to show improvement over CA125.
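The performance measures named above (sensitivity at a fixed specificity, and AUC) can be illustrated with a short sketch. This is a generic Python illustration under stated assumptions, not the trial's analysis code: the function names are hypothetical, the scores are made-up values, and a sample is called positive when its score exceeds the threshold.

```python
def sensitivity_at_specificity(case_scores, control_scores, target_spec=0.98):
    """Best sensitivity among thresholds whose specificity >= target_spec.

    A score strictly above the threshold counts as a positive test.
    """
    thresholds = [float("-inf")] + sorted(set(case_scores) | set(control_scores))
    best = 0.0
    for t in thresholds:
        spec = sum(s <= t for s in control_scores) / len(control_scores)
        if spec >= target_spec:
            sens = sum(s > t for s in case_scores) / len(case_scores)
            best = max(best, sens)
    return best


def auc(case_scores, control_scores):
    """AUC as the probability that a random case outscores a random control
    (Mann-Whitney formulation; ties count one half)."""
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical model scores, for illustration only.
cases = [0.9, 0.8, 0.4, 0.2]
controls = [0.1, 0.15, 0.3, 0.5, 0.25]

print(sensitivity_at_specificity(cases, controls, target_spec=0.98))
print(auc(cases, controls))
```

With only five hypothetical controls, 98% specificity requires every control to test negative; in a real evaluation with hundreds of controls the threshold would allow a small number of false positives.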
Thus, biomarker panels discovered in diagnostic samples may not validate in pre-diagnostic samples; utilizing pre-diagnostic samples for discovery may be helpful in developing validated early detection panels.
Early Detection; Screening; Biomarkers; Validation; Study Design
Variations in mammography interpretations may have important clinical and economic implications. To evaluate international variability in mammography interpretation, we analyzed published reports from community-based screening programs from around the world.
A total of 32 publications were identified in MEDLINE that fit the study inclusion criteria. Data abstracted from the publications included features of the population screened, examination technique, and clinical outcomes, including the percentage of mammograms judged to be abnormal, positive predictive value of an abnormal mammogram (PPVA), positive predictive value of a biopsy performed (PPVB), and percentages of breast cancer patients with ductal carcinoma in situ (DCIS) and minimal disease (DCIS and/or tumor size ≤10 mm). North American screening programs were compared with those from other countries using meta-regression analysis. All statistical tests were two-sided.
Wide ranges were noted for the percentage of mammograms judged to be abnormal (1.2%–15.0%), for PPVA (3.4%–48.7%), for PPVB (5.0%–85.2%), for percentage diagnosed with DCIS (4.3%–68.1%), and for percentage diagnosed with minimal disease (14.0%–80.6%). The percentage of mammograms judged to be abnormal was 2–4 percentage points higher in North American screening programs than in programs from other countries, after adjusting for covariates such as percentage of women who were less than 50 years of age and calendar year in which the mammogram was performed. The percentage of mammograms judged to be abnormal had a negative association with PPVA and PPVB (both P <.001) and a positive association with the frequency of DCIS cases diagnosed (P = .008) and the number of DCIS cases diagnosed per 1000 screens (P = .024); no consistent relationship was observed with the proportion of breast cancer diagnoses reported as having minimal disease or the number of minimal disease cases diagnosed per 1000 screens.
North American screening programs appear to interpret a higher percentage of mammograms as abnormal than programs from other countries without evident benefit in the yield of cancers detected per 1000 screens, although an increase in DCIS detection was noted.
Early detection of cancer has held great promise and intuitive appeal in the medical community for well over a century. Its history developed in tandem with that of the periodic health examination, in which any deviations, subtle or glaring, from a clearly demarcated “normal” were to be rooted out, given the underlying hypothesis that diseases develop along progressive linear paths of increasing abnormalities. This model of disease development drove the logical deduction that early detection, by “breaking the chain” of cancer development, must be of benefit to affected individuals. In the latter half of the 20th century, researchers and guidelines organizations began to explicitly challenge the core assumptions underpinning many clinical practices. A move away from intuitive thinking began with the development of evidence-based medicine. One key method developed to explicitly quantify the overall risk-benefit profile of a given procedure was the analytic framework. The shift away from pure deductive reasoning and reliance on personal observation was driven, in part, by a rising awareness of critical biases in cancer screening that can mislead clinicians, including healthy volunteer bias, length-biased sampling, lead-time bias, and overdiagnosis. A new focus on the net balance of both benefits and harms when determining the overall worth of an intervention also arose: it was recognized that the potential downsides of early detection were frequently overlooked or discounted because screening is performed on basically healthy persons and initially involves relatively noninvasive methods. Although still inconsistently applied to early detection programs, policies, and belief systems in the United States, an evidence-based approach is essential to counteract the misleading (even potentially harmful) allure of intuition and individual observation.
The increasing availability of personal genomic tests has led to discussions about the validity and utility of such tests and the balance of benefits and harms. A multidisciplinary workshop was convened by the National Institutes of Health and the Centers for Disease Control and Prevention to review the scientific foundation for using personal genomics in risk assessment and disease prevention and to develop recommendations for targeted research. The clinical validity and utility of personal genomics is a moving target with rapidly developing discoveries but little translation research to close the gap between discoveries and health impact. Workshop participants made recommendations in five domains: (1) developing and applying scientific standards for assessing personal genomic tests; (2) developing and applying a multidisciplinary research agenda, including observational studies and clinical trials to fill knowledge gaps in clinical validity and utility; (3) enhancing credible knowledge synthesis and information dissemination to clinicians and consumers; (4) linking scientific findings to evidence-based recommendations for use of personal genomics; and (5) assessing how the concept of personal utility can affect health benefits, costs, and risks by developing appropriate metrics for evaluation. To fulfill the promise of personal genomics, a rigorous multidisciplinary research agenda is needed.
behavioral sciences; epidemiologic methods; evidence-based medicine; genetics; genetic testing; genomics; medicine; public health
Colorectal cancer (CRC) screening has been supported by strong research evidence and recommended in clinical practice guidelines for more than a decade. Yet screening rates in the United States remain low, especially relative to other preventable diseases such as breast and cervical cancer. To understand the reasons, the National Cancer Institute and Agency for Healthcare Research and Quality sponsored a review of CRC screening implementation in primary care and a program of research funded by these organizations. The evidence base for improving CRC screening supports the value of a New Model of Primary Care Delivery: 1. a team approach, in which responsibility for screening tasks is shared among other members of the practice, would help address physicians’ lack of time for preventive care; 2. information systems can identify eligible patients and remind them when screening is due; 3. involving patients in decisions about their own care may enhance screening participation; 4. monitoring practice performance, supported by information systems, can help target patients at increased risk because of family history or social disadvantage; 5. reimbursement for services outside the traditional provider-patient encounter, such as telephone and e-mail contacts, may foster enhanced screening delivery; 6. training opportunities in communication, cultural competence, and use of information technologies would improve provider competence in core elements of screening programs. Improvement in CRC screening rates largely depends on the efforts of primary care practices to implement effective systems and procedures for screening delivery. Active engagement and support of practices are essential for the enormous potential of CRC screening to be realized.
colorectal cancer; screening; primary care; prevention
As screening methods for colorectal cancer (CRC) are limited by uptake and adherence, further options are sought. A blood test might increase both, but none has yet been tested in a screening setting.
We prospectively assessed the accuracy of circulating methylated SEPT9 DNA (mSEPT9) for detecting CRC in a screening population.
Asymptomatic individuals ≥50 years old scheduled for screening colonoscopy at 32 US and German clinics voluntarily gave blood plasma samples before colon preparation. Using a commercially available assay, three independent blinded laboratories assayed plasma DNA of all CRC cases and a stratified random sample of other subjects in duplicate real-time PCRs. The primary outcome measures were standardised for overall sensitivity and specificity estimates.
A total of 7941 subjects (45% men, 55% women; mean age 60 years) enrolled. Results from 53 CRC cases and from 1457 subjects without CRC yielded a standardised sensitivity of 48.2% (95% CI 32.4% to 63.6%; crude rate 50.9%); for CRC stages I–IV, values were 35.0%, 63.0%, 46.0% and 77.4%, respectively. Specificity was 91.5% (95% CI 89.7% to 93.1%; crude rate 91.4%). Sensitivity for advanced adenomas was low (11.2%).
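The distinction between a crude and a standardised rate arises because only a stratified random sample of subjects without CRC was assayed. The sketch below shows one common way such standardisation can be done, weighting each stratum's rate by its share of the full cohort. It is an illustration under assumptions, not the study's method: the strata, their counts, and the function names are all hypothetical.

```python
# Hypothetical strata of subjects without CRC (counts invented for illustration).
# cohort_n: subjects in the full screening cohort belonging to the stratum
# tested:   subjects from the stratum whose plasma was actually assayed
# negative: assayed subjects with a negative test (true negatives)
strata = [
    {"name": "stratum A", "cohort_n": 1000, "tested": 200, "negative": 170},
    {"name": "stratum B", "cohort_n": 3000, "tested": 300, "negative": 285},
]


def crude_specificity(strata):
    """Pooled rate among assayed subjects, ignoring sampling fractions."""
    return sum(s["negative"] for s in strata) / sum(s["tested"] for s in strata)


def standardised_specificity(strata):
    """Stratum-specific rates weighted by each stratum's share of the cohort."""
    cohort_total = sum(s["cohort_n"] for s in strata)
    return sum(
        (s["cohort_n"] / cohort_total) * (s["negative"] / s["tested"])
        for s in strata
    )


print(round(crude_specificity(strata), 3))         # 0.91
print(round(standardised_specificity(strata), 3))  # 0.925
```

The two estimates differ whenever strata are sampled at different fractions and have different underlying rates, which is why both are reported.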
Our study using the blood based mSEPT9 test showed that CRC signal in blood can be detected in asymptomatic average risk individuals undergoing screening. However, the utility of the test for population screening for CRC will require improved sensitivity for detection of early cancers and advanced adenomas.
Clinical Trial Registration Number:
Colorectal Cancer Screening; Methylation; Colonoscopy; Colorectal Adenomas; Colorectal Neoplasm