Am J Epidemiol. 2015 July 1; 182(1): 35–38.
Published online 2015 June 6. doi:  10.1093/aje/kwv028
PMCID: PMC4479116

Invited Commentary: Clinical Utility of Prediction Models for Rare Outcomes—The Example of Pancreatic Cancer


Translating relative risk estimates into absolute risks is important in evaluating the potential clinical and public health relevance of etiologic discoveries. Predicting high absolute risk is challenging, particularly for rare endpoints such as pancreatic cancer. Recent efforts to develop risk prediction models for pancreatic cancer have found moderate risk levels for very small parts of the population. A new approach in which clinical symptoms and medication use are evaluated in addition to information on risk factors is presented by Risch et al. in this issue of the Journal (Am J Epidemiol. 2015;182(1):26–34). The authors estimated absolute risks based on the relative risks obtained from their case-control study. Their absolute risk estimates were higher than those from previous approaches but remained restricted to a very small proportion of the general population. In the present commentary, we address issues of absolute risk stratification (particularly for rare diseases), specific analytic methods, and how actionable information will differ based on the disease and possible intervention. We suggest that moving from cancer-specific models to broader models used to predict risk for multiple outcomes can make risk prediction for rare diseases more effective. When considering translational goals, it is important to estimate absolute risk at the early stages of etiologic research. The results can be sobering but allow focusing on the most promising goals.

Keywords: absolute risk, clinical relevance, rare endpoints, risk prediction, risk stratification, translational epidemiology

Recognition of a cancer before it can be diagnosed by clinical symptoms reduces morbidity and mortality rates. This central paradigm of cancer prevention and control has led to major efforts to identify risk markers that can improve early detection. However, true success stories are rare. The most successful cancer screening programs rely on identification and removal of cancer precursors, such as cervical intraepithelial neoplasias and colorectal polyps, to prevent invasive cancers from occurring.

Pancreatic cancer is a disease that could particularly benefit from a successful early detection strategy. It is a rare cancer that is typically detected after it has reached advanced stages. Overall, the 5-year survival rate is approximately 7%. The survival rate improves to 25% when the cancer is detected at an earlier localized stage (1).

So far, there is no test that has been successfully used to lower the rate of mortality from pancreatic cancer, and consequently there are no recommendations for pancreatic cancer screening despite many efforts to identify curable pancreatic cancer precursor lesions. Endoscopic ultrasound can be used to detect intraductal papillary mucinous neoplasms and mucinous cystic neoplasms, 2 conditions that can progress to pancreatic cancer. However, the majority of pancreatic cancers begin as microscopic (<5 mm) pancreatic intraepithelial neoplasias, which are not easily visible with current imaging techniques. Molecular markers in pancreatic fluid are being studied to see if they can be used to detect these neoplasias, but there are limited data on their performance (2).

A major challenge for successful early detection of pancreatic cancer is its low prevalence in the general population. For persons older than 65 years of age, the 5-year risk of pancreatic cancer is near 0.1% (1). In population-based screening, a marker-based test with 90% sensitivity and 95% specificity would theoretically achieve a positive predictive value of only about 2%. Increasing the prior risk (equal to the disease prevalence) by a factor of 10, to 1%, would raise the positive predictive value of the same test to about 15%. Therefore, identification of subsets of populations at increased risk of pancreatic cancer that could benefit from early detection has promise.
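
The two figures quoted above follow from Bayes' theorem applied to prevalence, sensitivity, and specificity. The short sketch below reproduces the arithmetic; the function and variable names are illustrative and do not come from the cited sources.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Probability of disease given a positive test (Bayes' theorem)."""
    true_pos = prevalence * sensitivity                    # diseased and test-positive
    false_pos = (1.0 - prevalence) * (1.0 - specificity)   # healthy but test-positive
    return true_pos / (true_pos + false_pos)


# General population older than 65 years: 5-year risk near 0.1%
ppv_general = positive_predictive_value(0.001, 0.90, 0.95)

# Subgroup with a 10-fold higher prior risk (1%)
ppv_enriched = positive_predictive_value(0.01, 0.90, 0.95)

print(f"PPV at 0.1% prevalence: {ppv_general:.1%}")
print(f"PPV at 1% prevalence:   {ppv_enriched:.1%}")
```

At a prevalence of 0.1%, nearly all positive results are false positives because the 5% false-positive rate applies to almost the entire population.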

Obvious target populations are individuals with genetic syndromes or a strong family history. Subjects with familial adenomatous polyposis, familial atypical multiple-mole melanoma syndrome, Peutz-Jeghers syndrome, or ataxia telangiectasia are at greater risk for pancreatic cancer. In addition, having 2 first-degree relatives diagnosed with pancreatic cancer increases a person's risk 10-fold compared with the risk in the general population. However, these conditions are rare and only account for a small portion of all pancreatic cancers (3).

Risk prediction models based on risk factor data and genetic susceptibility loci have been evaluated for the general population. The Pancreatic Cancer Cohort Consortium for Genome-Wide Association Studies (PanScan) developed a risk prediction model for pancreatic cancer that included smoking status, alcohol use, obesity, diabetes, ABO blood type, family history, and 3 genetic susceptibility loci (4). The model used risk estimates from the same population and did not externally validate risk prediction, thereby likely overestimating performance. Still, the risk stratification of the model was limited: Identification of only 0.3% of the population with a 5% or higher lifetime risk of pancreatic cancer has very limited clinical value.

In the current issue of the Journal, Risch et al. (5) derived absolute risk estimates for pancreatic cancer from logistic regression models that incorporated data from their pancreatic cancer case-control study. Their risk model differed from the previously published model and included Jewish ancestry, ABO blood group, smoking status, diabetes, pancreatitis, and use of proton-pump inhibitors. For the latter 4 variables, a time component was included (i.e., time since cessation, since diagnosis, or since first use). The authors presented some combinations of these risk factors in which the estimated absolute risk of pancreatic cancer in the next 5 years was approximately 10%, which is considerably higher than the absolute risk estimates from the previous risk model. However, these combinations were very rare in the general population: Fewer than 1% had 5-year estimated absolute risks of 5%–10%.

Some limitations of their study are noteworthy. The risk estimates underlying the risk model are based on a single case-control study. Survival bias is of concern for a rapidly fatal disease such as pancreatic cancer because risk prediction might differ for the most rapidly fatal cases, who cannot be enrolled in a case-control study. Recall bias could affect information on lifestyle factors, medication use, and other important variables. Notably, alcohol use and family history, 2 important contributors to the previous risk model, were not considered here. As with the previous risk model, the risk estimates were neither calibrated nor externally validated. To address this criticism, the authors present results from previous studies to demonstrate the plausibility of their risk estimates. In summary, their model likely overestimates the absolute risk of pancreatic cancer that would be estimated in an independent population.

The analysis of the timing of some of the exposures is very important for understanding the natural history of pancreatic cancer. However, the importance of recent diabetes and pancreatitis diagnoses, smoking cessation, and initiation of proton-pump inhibitor use shown by Risch et al. (5) poses a challenge for the prospective use of such variables. Through a complex interaction term, the model estimated high risks for those recently diagnosed and lower risks for those who have lived with a condition for years. However, at the time of initial diagnosis, it is impossible to know how long a patient will live with a condition. A better estimate of pancreatic cancer in relation to the onset of clinical symptoms could be derived from prospectively evaluating pancreatic cancer rates after the initial diagnosis of symptoms.

Despite these limitations and the possible overestimation of risk, Risch et al. (5) are moving in the right direction by applying their relative risk estimates to predict absolute risk of pancreatic cancer in the general population. The difficulty of predicting high absolute risk of pancreatic cancer serves as an example of the widespread challenge of how to reduce mortality from rare, deadly cancers using risk prediction and early detection.

Much current work in epidemiology focuses on finding risk markers that can be translated into clinical or public health practice. However, there is often a disconnect between the initial work on risk factors and the translational efforts to put them into use. Although one may get carried away by moderate relative risk estimates for rare diseases, moving from relative risk to absolute risk serves as an important reality check, as evidenced by the analysis of Risch et al. (5).

Absolute risk stratification is important for translational goals. A risk model or a biomarker separates the population with a certain prior risk into a range of posterior risks. A commonly used measure of risk stratification is the area under the receiver operating characteristic curve (AUC). The AUC shows discrimination between cases and controls over the whole range of risk marker thresholds used to define an elevated risk or a positive test result. The receiver operating characteristic curve is defined by the distribution of a risk marker in cases and controls. This can be measured as the distance between the risk marker's means among cases and controls, standardized by a function of the variation within cases and controls (referred to as δ) (6). The same δ and AUC values translate to very different absolute risks depending on the population disease prevalence. Figure 1 shows the positive predictive value, or posterior risk, that can be achieved with a risk marker for a range of disease prevalences from 0.1% to 1% (representing the risk of pancreatic cancer in the general population and among individuals with a family history). To achieve an absolute risk of 10%, very high specificity is required, even when the disease prevalence approaches 1%.

Figure 1.
Positive predictive value of a risk marker in relation to disease prevalence and test specificity. For a rare disease, the positive predictive value strongly depends on the test specificity and the disease prevalence. The plots show the positive predictive ...
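
As a rough illustration of the relationship between δ, the AUC, and the specificity needed for a given posterior risk, the sketch below makes two assumptions that are not stated in the text: the marker is normally distributed with equal variance in cases and controls, with δ expressed in units of that common standard deviation (so that AUC = Φ(δ/√2)), and sensitivity is fixed at 90%. The function names are hypothetical.

```python
import math


def auc_from_delta(delta):
    """AUC under an assumed equal-variance binormal model.

    AUC = Phi(delta / sqrt(2)), where Phi is the standard normal CDF and
    Phi(x) = 0.5 * (1 + erf(x / sqrt(2))), which simplifies to the line below.
    """
    return 0.5 * (1.0 + math.erf(delta / 2.0))


def required_specificity(prevalence, sensitivity, target_ppv):
    """Specificity needed to reach target_ppv, solved from the PPV formula."""
    false_pos_rate = (
        prevalence * sensitivity * (1.0 - target_ppv)
        / (target_ppv * (1.0 - prevalence))
    )
    return 1.0 - false_pos_rate


print(f"AUC for delta = 1: {auc_from_delta(1.0):.3f}")

# Specificity needed for a 10% posterior risk at an assumed 90% sensitivity
for prevalence in (0.001, 0.01):
    spec = required_specificity(prevalence, sensitivity=0.90, target_ppv=0.10)
    print(f"prevalence {prevalence:.1%}: specificity needed {spec:.4f}")
```

The same δ, and hence the same AUC, applies at every prevalence; only the specificity required to reach the target posterior risk changes.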

Importantly, risk stratification is only meaningful when the differences in risk lead to differences in clinical management or public health recommendations. Actionable risk levels are very different for different diseases, which highlights the importance of considering the specific clinical situation when evaluating absolute risk. Comparing AUCs and absolute risk estimates across different diseases can be very misleading, and in general, riskier or costlier interventions require higher absolute risks to be justified. A risk marker can also be used to rule out disease and provide reassurance that cancer risk is very low in the next few years. An example is human papillomavirus (HPV) testing for cervical cancer. HPV is a cause of most cervical cancers, and HPV DNA testing has very high sensitivity for the disease. The vast majority of the population is HPV-negative and therefore does not require repeat screening for several years (7).

In their article, Risch et al. (5) do not elaborate on the steps that would follow after application of their risk model in a population. There are no established risk thresholds for management of a higher risk of pancreatic cancer, but it is conceivable that endoscopic ultrasound or other diagnostic procedures could be used to evaluate individuals with a 5%–10% risk of pancreatic cancer.

The other important aspect of absolute risk prediction is the percentage of the population with high absolute risks that might benefit from interventions. As shown in the article, the group of subjects with potentially actionable risk is very small. Importantly, most pancreatic cancer cases do not occur in this small population. Therefore, individuals who do not belong to these risk groups gain essentially no reassurance that their pancreatic cancer risk is much lower than that of the general population. Thus, if there is any benefit of the risk prediction approach, it will be limited to the small group of subjects with increased risk.

An important question is at whom a pancreatic cancer risk assessment tool would be targeted. Individuals most concerned about their pancreatic cancer risk are likely those with a family history and those with chronic pancreatitis. Population-wide use of a risk assessment tool would likely have very little additional benefit. In the future, risk assessment may be done more broadly, simultaneously considering many different endpoints rather than focusing on individual rare outcomes. Many risk factors, such as smoking, are associated with multiple cancer types, and therefore collecting a slightly broader set of input data instead of using individual cancer risk models could lead to multiple risk assessment outputs with broader relevance for the general population. Using this approach, identifying small populations with higher risks of rare outcomes may still be effective, because there is a benefit for the general population due to risk assessment for multiple and more common endpoints.

One presumption of the article by Risch et al. is that the number of deaths from pancreatic cancer can be reduced by using risk prediction and early detection approaches. However, considering the dismal prognosis of pancreatic cancer and the high-risk intervention (major abdominal surgery), it is completely unclear whether the detection of increased risk in this small group justifies a general population-screening program that might reduce mortality.

Achieving a survival benefit for rare cancers by screening remains a major challenge; pancreatic cancer appears to be a particularly tough case. Risk models that are focused on rare cancer outcomes usually do not offer clinically meaningful risk stratification. However, even if small groups with a high enough absolute risk can be identified, it is a big leap to achieve detection early enough in the natural history to change the outcome of the disease.

In general, absolute risks allow realistic estimation of translational goals. Risch et al. (5) derived absolute risk estimates from etiologic studies by combining relative risk measures with population-based cancer incidence data from the Surveillance, Epidemiology, and End Results Program. The results are often very sobering, but they allow us to focus our work on the most promising efforts and abandon approaches that are likely to fail.
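
The general idea of combining relative risks with population incidence can be sketched as follows. This is a simplified illustration, not the method used by Risch et al. (5): it assumes a constant incidence rate, ignores competing causes of death, and uses made-up inputs (an incidence of 20 per 100,000 person-years and a single risk profile with a relative risk of 3 held by 10% of the population). All names are hypothetical.

```python
import math


def reference_rate(population_rate, profile_shares, profile_rrs):
    """Incidence rate in the reference profile (relative risk = 1).

    The population rate is a weighted average over profiles, so dividing by
    the population-average relative risk recovers the reference rate.
    """
    mean_rr = sum(share * rr for share, rr in zip(profile_shares, profile_rrs))
    return population_rate / mean_rr


def absolute_risk(ref_rate, relative_risk, years):
    """Cumulative risk over `years`, assuming a constant rate and no competing risks."""
    return 1.0 - math.exp(-ref_rate * relative_risk * years)


# Illustrative inputs only (assumptions, not values from the article)
population_rate = 20 / 100_000   # cases per person-year
shares = [0.9, 0.1]              # population share of each risk profile
rrs = [1.0, 3.0]                 # relative risk of each profile

ref = reference_rate(population_rate, shares, rrs)
risk_high = absolute_risk(ref, 3.0, years=5)
print(f"5-year absolute risk for the high-risk profile: {risk_high:.2%}")
```

Even a 3-fold relative risk leaves the 5-year absolute risk well below 1% in this illustration, which is the kind of sobering result that the translation from relative to absolute risk tends to produce for rare outcomes.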


Author affiliations: Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, Maryland (Nicolas Wentzensen, Ronald C. Eldridge).

Both authors contributed equally to this work.

The work was supported by the Intramural Research Program of the National Cancer Institute.

Conflict of interest: none declared.


1. Howlader N, Noone AM, Krapcho M, Garshell J, et al. SEER Cancer Statistics Review, 1975–2011. Bethesda, MD: National Cancer Institute. Based on November 2013 SEER data submission. Published April 2014. Accessed December 11, 2014.
2. Vincent A, Herman J, Schulick R, et al. Pancreatic cancer. Lancet. 2011;378(9791):607–620.
3. Raimondi S, Maisonneuve P, Lowenfels AB. Epidemiology of pancreatic cancer: an overview. Nat Rev Gastroenterol Hepatol. 2009;6(12):699–708.
4. Klein AP, Lindström S, Mendelsohn JB, et al. An absolute risk model to identify individuals at elevated risk for pancreatic cancer in the general population. PLoS One. 2013;8(9):e72311.
5. Risch HA, Yu H, Lu L, et al. Detectable symptomatology preceding the diagnosis of pancreatic cancer and absolute risk of pancreatic cancer diagnosis. Am J Epidemiol. 2015;182(1):26–34.
6. Wentzensen N, Wacholder S. From differences in means between cases and controls to risk stratification: a business plan for biomarker development. Cancer Discov. 2013;3(2):148–157.
7. Wentzensen N. Triage of HPV-positive women in cervical cancer screening. Lancet Oncol. 2013;14(2):107–109.
8. Division of Cancer Epidemiology and Genetics, National Cancer Institute. Risk stratification analysis options. Accessed December 11, 2014.

Articles from American Journal of Epidemiology are provided here courtesy of Oxford University Press