

Journal of Epidemiology
J Epidemiol. 2017 March; 27(3 Suppl): S71–S76.
Published online 2016 December 27. doi:  10.1016/
PMCID: PMC5350588

Risk prediction models for mortality in patients with cardiovascular disease: The BioBank Japan project



Cardiovascular disease (CVD) is a leading cause of death in Japan. The present study aimed to develop new risk prediction models for long-term risks of all-cause and cardiovascular death in patients with chronic phase CVD.


Among the subjects registered in the BioBank Japan database, 15,058 patients aged ≥40 years with chronic ischemic CVD (ischemic stroke or myocardial infarction) were divided randomly into a derivation cohort (n = 10,039) and a validation cohort (n = 5019). These subjects were followed up for a median of 8.55 years. Risk prediction models for all-cause and cardiovascular death were developed in the derivation cohort using Cox proportional hazards regression. Their performance in predicting the 5-year risk of mortality was evaluated in the validation cohort.


During the follow-up, all-cause and cardiovascular death events were observed in 2962 and 962 patients from the derivation cohort and 1536 and 481 from the validation cohort, respectively. Risk prediction models for all-cause and cardiovascular death were developed from the derivation cohort using ten traditional cardiovascular risk factors, namely, age, sex, CVD subtype, hypertension, diabetes, total cholesterol, body mass index, current smoking, current drinking, and physical activity. These models demonstrated modest discrimination (c-statistics, 0.703 for all-cause death; 0.685 for cardiovascular death) and good calibration (Hosmer-Lemeshow χ2-test, P = 0.17 and 0.15, respectively) in the validation cohort.


We developed and validated risk prediction models of all-cause and cardiovascular death for patients with chronic ischemic CVD. These models would be useful for estimating the long-term risk of mortality in chronic phase CVD.

Keywords: Risk prediction model, Ischemic stroke, Myocardial infarction, All-cause death, Cardiovascular death

1. Introduction

Survival after the onset of cardiovascular disease (CVD), such as stroke and myocardial infarction, has been prolonged during the past several decades as a result of improvements in medical technology and living environments.1 However, CVD remains one of the leading causes of mortality in Japan as well as other countries around the world. Risk prediction models would thus be useful to estimate the future risk of mortality and to plan a strategy for treatment and lifestyle modification in CVD patients in the chronic phase as well as in those in an acute phase. A number of risk prediction models have been developed to identify individuals at higher risk of short-term mortality (in-hospital mortality or death within 1 year) after acute events of CVD.2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 However, no previous studies have developed risk prediction models for long-term risk of mortality among patients in the chronic phase of CVD. The aim of the present study was to develop new risk prediction models for long-term risks of all-cause and cardiovascular death among patients in the chronic phase of ischemic CVD (ischemic stroke and myocardial infarction) using the data from the BioBank Japan (BBJ), a large-scale registry for common diseases in Japan.

2. Methods

2.1. Study participants

The BBJ Project was established with the cooperation of 12 medical institutes in Japan as a Leading Project of the Ministry of Education, Culture, Sports, Science and Technology of Japan. From June 2003 to March 2008, approximately 200,000 patients with any of 47 target common diseases were registered in the BBJ registry. The detailed study design and protocol were described elsewhere.17, 18, 19 In this registry, 29,065 patients were registered as having ischemic stroke, myocardial infarction or both. After excluding 269 patients aged ≤39 years at registration, 4303 who experienced an event of ischemic stroke or myocardial infarction within 90 days before the registration, 7594 with missing data in any of the risk factors for the present study, and 1841 without follow-up data, the remaining 15,058 patients aged ≥40 years with chronic CVD (7374 with ischemic stroke only, 7251 with myocardial infarction only, 433 with both ischemic stroke and myocardial infarction) were eligible for the present study. Here, patients with a recent CVD event were excluded because risk factor profiles were likely to be unstable in the acute phase of CVD.
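The exclusion flow above can be checked with simple arithmetic. The following is a minimal sketch that only re-adds the counts reported in this paper; the variable names are illustrative and not part of the BBJ data dictionary.

```python
# Counts as reported in the text (Sections 2.1 and 2.4).
registered = 29_065  # ischemic stroke, myocardial infarction, or both

exclusions = {
    "aged <=39 years at registration": 269,
    "CVD event within 90 days before registration": 4_303,
    "missing data in any risk factor": 7_594,
    "no follow-up data": 1_841,
}

eligible = registered - sum(exclusions.values())

# Breakdown of the eligible patients by CVD subtype.
subtypes = {
    "ischemic stroke only": 7_374,
    "myocardial infarction only": 7_251,
    "both": 433,
}

# Split reported in the statistical analysis section.
derivation, validation = 10_039, 5_019

print(eligible)                          # 15058
print(sum(subtypes.values()))            # 15058
print(derivation + validation)           # 15058
print(round(derivation / eligible, 3))   # 0.667, i.e. two-thirds
```

All three totals agree with the 15,058 eligible patients stated in the text.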

2.2. Risk factors

To develop risk prediction models, we selected ten conventional cardiovascular risk factors collected at registration, namely, age, sex, CVD subtype, hypertension, diabetes mellitus, total cholesterol, body mass index (BMI), current smoking, current drinking, and physical activity. Age at registration was categorized into five groups, namely, 40–49, 50–59, 60–69, 70–79, and ≥80 years. Ischemic stroke, myocardial infarction and diabetes were diagnosed by local physicians. Medical history of ischemic stroke and myocardial infarction was collected by medical coordinators participating in the BBJ Project. CVD subtype was classified into three groups, namely, ischemic stroke only, myocardial infarction only, and both ischemic stroke and myocardial infarction. Hypertension was defined as blood pressure ≥140/90 mmHg or use of antihypertensive medication at registration. Total cholesterol was measured at each hospital and categorized into four groups, namely, <180, 180–199, 200–219, and ≥220 mg/dL. BMI was calculated as body weight (in kilograms) divided by squared height (in meters) and classified into three groups, namely, <18.5, 18.5–24.9, and ≥25.0 kg/m2. Subjects engaging in any sports activity at least once a week were classified as physically active.
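The categorizations described above can be written down as a few small functions. This is a minimal sketch, not the authors' SAS code: the function names and category labels are hypothetical, values falling between the stated category bounds (for example, a total cholesterol of 199.5 mg/dL) are assigned to the lower category, and "≥140/90 mmHg" is assumed to mean systolic ≥140 or diastolic ≥90.

```python
def age_group(age_years: float) -> str:
    """Five age groups used in the models; the cohort is restricted to >=40 years."""
    if age_years < 40:
        raise ValueError("patients aged <=39 years were excluded")
    if age_years >= 80:
        return ">=80"
    lower = int(age_years) // 10 * 10
    return f"{lower}-{lower + 9}"


def cholesterol_group(total_chol_mg_dl: float) -> str:
    """Four total cholesterol groups (mg/dL)."""
    if total_chol_mg_dl < 180:
        return "<180"
    if total_chol_mg_dl < 200:
        return "180-199"
    if total_chol_mg_dl < 220:
        return "200-219"
    return ">=220"


def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight (kg) divided by squared height (m)."""
    return weight_kg / height_m ** 2


def bmi_group(bmi_value: float) -> str:
    """Three BMI groups (kg/m2)."""
    if bmi_value < 18.5:
        return "<18.5"
    if bmi_value < 25.0:
        return "18.5-24.9"
    return ">=25.0"


def has_hypertension(sbp: float, dbp: float, on_medication: bool) -> bool:
    """Assumed reading of the definition: SBP >= 140 or DBP >= 90 or treated."""
    return sbp >= 140 or dbp >= 90 or on_medication
```

For instance, a 55-year-old patient with a total cholesterol of 210 mg/dL and a BMI of 26.0 kg/m2 falls into the groups "50-59", "200-219", and ">=25.0".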

2.3. Follow-up surveys and study outcomes

The detailed protocols for follow-up surveys were described elsewhere.17, 19 Information on survival status, last date of follow-up or date of death, and cause of death was obtained from medical records in participating hospitals, resident registration in local governments, and vital statistics from the Ministry of Health, Labour and Welfare of Japan. Primary and secondary outcomes of the present study were all-cause death and cardiovascular death, respectively. Cause of death was recorded using the Tenth Revision of the International Classification of Diseases (ICD-10). Cardiovascular death was defined as a mortality event with an ICD-10 code of I00-I99 (diseases of the circulatory system).
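The outcome definition above amounts to a simple test on the ICD-10 code. The sketch below assumes that codes are stored as strings such as "I21.9" or "I63"; the function name is hypothetical and the storage format in the BBJ data is not described in this paper.

```python
import re

# ICD-10 chapter IX (diseases of the circulatory system) spans I00-I99,
# i.e. the letter "I" followed by any two digits.
_CIRCULATORY = re.compile(r"^I\d{2}")


def is_cardiovascular_death(icd10_code: str) -> bool:
    """Return True if the underlying cause of death is coded I00-I99."""
    code = icd10_code.strip().upper()
    return bool(_CIRCULATORY.match(code))


print(is_cardiovascular_death("I21.9"))  # True  (acute myocardial infarction)
print(is_cardiovascular_death("C34.1"))  # False (neoplasm chapter)
```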

2.4. Statistical analysis

Two-thirds of the study participants (n = 10,039) were randomly assigned, using computer-generated random numbers, to a derivation cohort for the development of risk prediction models, and the remaining one-third (n = 5019) were reserved as an independent validation cohort. In the derivation cohort, new risk prediction models for all-cause and cardiovascular death were developed using multivariable Cox proportional hazards models including the above-mentioned risk factors. In these models, observations were censored at the last date of follow-up or the date of death. The performance of each risk prediction model was tested in both the derivation and the validation cohorts. The ability of each model to discriminate patients who died within 5 years from surviving patients was evaluated using the c-statistic for survival analysis.20 Observed risk (based on the Kaplan-Meier product limit method) and predicted risk (based on the risk prediction model) of each outcome at 5 years were compared by ranking participants into deciles of predicted risk, and calibration of each model was evaluated in the validation cohort using a modified Hosmer-Lemeshow χ2-statistic with 9 degrees of freedom.21 In addition, the risk prediction model for each outcome was translated into a simple score sheet in the same way as described in the Framingham Heart Study.22 All statistical analyses were performed with SAS 9.3 (SAS Institute, Cary, NC). Two-sided values of P < 0.05 were considered statistically significant.
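To illustrate what the c-statistic measures, the following is a minimal sketch of a concordance index for right-censored survival data. It is a simplified Harrell-type pairwise estimator, not the SAS implementation or the exact estimator of reference 20 used in this study, and it does not truncate follow-up at 5 years. The function name and the toy data are hypothetical.

```python
def concordance_index(times, events, risks):
    """Fraction of usable patient pairs in which the patient who died
    earlier also had the higher predicted risk.

    times  : follow-up time for each patient
    events : 1 if the patient died at that time, 0 if censored
    risks  : predicted risk (or linear predictor) for each patient
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier death
        for j in range(n):
            if times[i] < times[j]:
                usable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # ties in predicted risk count as half
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable


# Toy data (hypothetical): four patients, the third one censored.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
print(concordance_index(times, events, [0.9, 0.7, 0.2, 0.4]))  # 1.0
print(concordance_index(times, events, [0.9, 0.3, 0.2, 0.4]))  # 0.8
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, so c-statistics of about 0.7 indicate that the model orders roughly 70% of comparable pairs correctly.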

2.5. Ethical considerations

The study protocol of the BBJ Project was approved by the research ethics committees at the University of Tokyo, RIKEN Yokohama Institute, Kyushu University, and other research institutes and cooperating hospitals participating in this project. All participants gave written informed consent.

3. Results

Baseline characteristics of the participants in the derivation cohort and the validation cohort are shown in Table 1. Mean age was 69 years and 72% were men in both cohorts. There were no clear differences in the characteristics of traditional risk factors between the two cohorts.

Table 1
Baseline characteristics of participants in the derivation cohort and the validation cohort.

The median (range) of the follow-up period was 8.55 (0.01–11.53) years for the derivation cohort and 8.54 (0.03–11.35) years for the validation cohort. During the follow-up, all-cause and cardiovascular death were observed in 2962 and 962 patients from the derivation cohort and in 1536 and 481 patients from the validation cohort, respectively.

The results from the multivariable Cox proportional hazards model for all-cause and cardiovascular death in the derivation cohort are summarized in Table 2. The risk of all-cause death increased significantly with aging and was higher in men than in women. Compared with those with a history of ischemic stroke only or myocardial infarction only, patients with a history of both ischemic stroke and myocardial infarction had significantly higher risk of all-cause death. Hypertension, diabetes, and current smoking were significant risk factors for all-cause death. In contrast, the levels of total cholesterol and BMI were both inversely associated with mortality risk. Current alcohol intake and physical exercise were protective factors for mortality. Similar associations were observed for cardiovascular death, although the risk estimates in the lowest cholesterol levels appeared to be marginally significant (P = 0.06).

Table 2
Multivariable-adjusted hazard ratios for all-cause and cardiovascular death in the derivation cohort.

The performance of the risk prediction models was then evaluated in each of the derivation and validation cohorts. The c-statistics for the 5-year risk of all-cause death were 0.710 (95% confidence interval [CI], 0.697–0.723) in the derivation cohort and 0.703 (95% CI, 0.686–0.721) in the validation cohort, and those for cardiovascular death were 0.698 (95% CI, 0.676–0.719) and 0.685 (95% CI, 0.654–0.715), respectively, indicating that both models had modest discrimination abilities. Fig. 1 shows the calibration plots comparing observed and predicted 5-year risks of all-cause and cardiovascular death. The modified Hosmer-Lemeshow χ2-statistics (degrees of freedom = 9) in the validation cohort were 12.9 (P = 0.17) for all-cause death and 13.3 (P = 0.15) for cardiovascular death, indicating that both models had good calibration performance.
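The reported P values can be reproduced from the χ2-statistics and 9 degrees of freedom. The sketch below uses the standard closed-form upper-tail probability of the chi-square distribution for odd degrees of freedom, so only the Python standard library is needed; the function name is hypothetical and this is not the authors' SAS code.

```python
import math


def chi2_sf_odd_df(x: float, df: int) -> float:
    """Upper-tail probability P(X > x) for a chi-square variable with odd df.

    Uses Q(x | df) = erfc(chi / sqrt(2))
                     + 2 * Z(chi) * sum_{r=1}^{(df-1)/2} chi^(2r-1) / (1*3*...*(2r-1)),
    where chi = sqrt(x) and Z is the standard normal density.
    """
    if df < 1 or df % 2 == 0:
        raise ValueError("this sketch only handles odd degrees of freedom")
    if x <= 0:
        return 1.0
    chi = math.sqrt(x)
    tail = math.erfc(chi / math.sqrt(2))
    density = math.exp(-x / 2) / math.sqrt(2 * math.pi)
    series = 0.0
    term = chi  # r = 1: chi^1 / 1
    for r in range(1, (df - 1) // 2 + 1):
        series += term
        term *= x / (2 * r + 1)  # next term: chi^(2r+1) / (1*3*...*(2r+1))
    return tail + 2 * density * series


p_all_cause = chi2_sf_odd_df(12.9, 9)
p_cardiovascular = chi2_sf_odd_df(13.3, 9)
print(round(p_all_cause, 2), round(p_cardiovascular, 2))
```

Both values round to the reported P = 0.17 and P = 0.15. Because neither falls below 0.05, the test does not detect a significant difference between observed and predicted risks, which is what is meant by good calibration here.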

Fig. 1
Observed and predicted 5-year risks of all-cause and cardiovascular death by deciles of risk in the validation cohort. Black bars indicate the observed 5-year risk and white bars indicate the predicted 5-year risk within each decile of predicted risk.

Tables 3 and 4 provide simple risk score sheets that can be used to estimate the 5-year risk of all-cause and cardiovascular death. For example, a 55-year-old man with a history of myocardial infarction, hypertension, diabetes, total cholesterol of 210 mg/dL, and BMI of 26.0 kg/m2, who currently smoked and was physically inactive but never drank alcoholic beverages, would have a total risk score of 10 for all-cause death and 8 for cardiovascular death (Table 3). His 5-year risks would be 8.2% for all-cause death and 4.5% for cardiovascular death (Table 4).

Table 3
Simple risk scores for predicting 5-year risks of all-cause and cardiovascular death.
Table 4
Predicted 5-year risks of all-cause and cardiovascular death according to the sum of the risk scores in Table 3.

4. Discussion

In the present study, we developed risk prediction models for all-cause death and for cardiovascular death using a large cohort of patients in the chronic phase of ischemic CVD. These models included ten traditional cardiovascular risk factors that are commonly used in clinical practice. They showed modest discrimination and good calibration in the independent validation cohort. Since CVD is one of the major causes of death in Japan, these models would be useful to identify patients in the chronic phase of CVD at high risk of mortality.

A number of risk prediction models have been developed to predict a future risk of CVD events. For example, the Framingham Risk Score23 is a well-known tool for predicting the long-term risk of CVD events based on a population-based prospective study. Some risk prediction models for CVD have also been developed using the data from population-based studies in Japan.24, 25 On the other hand, for mortality risk after CVD, the currently available models were developed primarily to predict relatively short-term risk of mortality after acute coronary syndrome2, 3, 4, 5, 6, 7, 8 or acute stroke.9, 10, 11, 12, 13, 14, 15, 16 No previous studies have developed models for predicting the long-term risk of mortality among patients in the chronic phase of CVD. To the best of our knowledge, therefore, the present study is the first to develop risk prediction models that could be used to predict long-term risks of all-cause and cardiovascular death for patients with chronic ischemic CVD.

Hypertension, diabetes, smoking, and physical inactivity are established risk factors for the development of atherosclerotic diseases.26 Therefore, these factors are likely to be associated with recurrence of events of stroke, coronary heart disease, or other CVDs, including heart failure, kidney failure, or peripheral artery diseases, resulting in a higher risk of mortality. Alcohol intake is known to be a protective factor against atherosclerotic diseases.26 On the other hand, while hypercholesterolemia and obesity are well-known risk factors for the development of CVD,26 the risk of mortality in the present study increased significantly in patients with lower cholesterol levels or with lower BMI. These paradoxical associations have also been reported in several prospective studies,27, 28, 29, 30, 31, 32, 33 and might be attributable to reverse causality. That is, patients with severe CVD or higher risk of mortality are likely to have poor nutrition or to be treated more strictly by lipid-lowering medication, which would lead to lower cholesterol and BMI levels. Therefore, lower cholesterol levels and lower BMI levels could be indicators for mortality in patients with chronic ischemic CVD.

Some limitations of the present study should be discussed. First, some important parameters associated with mortality risk were not included in the models. For example, measurements of severity, such as cardiac systolic function for patients with myocardial infarction and neurological symptoms for patients with ischemic stroke, and the subtype of ischemic stroke, which were likely to be associated with mortality risk, were not used in the present study. Unfortunately, these data were missing in a large number of patients since they were not always described in the medical records. This limitation may therefore partly explain the modest discrimination performance of our models (c-statistics, approximately 0.7). Second, the generalizability of our models may be uncertain, although their discrimination and calibration performances have been internally validated using split samples of a large-scale, multicenter, hospital-based cohort study. External validations using other cohorts with different background characteristics would be necessary to confirm the generalizability of our models. Third, since the BBJ is an observational study, we were not able to examine the effects of treatment intervention and lifestyle modification. Fourth, information on CVD recurrence and the incidence of other diseases during the follow-up was not taken into consideration in the present study. Finally, the diagnostic procedures for pre-existing CVD at registration were not standardized across the hospitals.

In conclusion, we developed and validated new risk prediction models for all-cause and cardiovascular death among chronic ischemic CVD patients using the BBJ data. These models would provide a useful guide to identify patients at high risk of mortality.


This study was supported by funding from the Ministry of Education, Culture, Sports, Science, and Technology (from 2003 to March 2015) and the Japan Agency for Medical Research and Development, AMED (since April 2015).


We express our gratitude to all participants of the BioBank Japan Project. We thank all medical coordinators of cooperative hospitals for collecting samples and clinical information, and Yasushi Yamashita and staff members of the BioBank Japan Project for administrative support. We also thank Dr. Kumao Toyoshima for his overall supervision of the BioBank Japan Project.


Members of medical institutions cooperating on the BioBank Japan Project who co-authored this paper include Shigeru Saito, Hideki Shimomura, Sinichi Higashiue and Kazuo Misumi (Tokushukai Hospitals); Shiro Minami, Masahiro Yasutake and Hitoshi Takano (Nippon Medical School); Kazunori Shimada, Hakuoh Konishi and Nobukazu Miyamoto (Juntendo University); Satoshi Asai, Mitsuhiko Moriyama and Yasuo Takahashi (Nihon University); Tomoaki Fujioka and Wataru Obara (Iwate Medical University); Seijiro Mori and Hideki Ito (Tokyo Metropolitan Institute of Gerontology); Satoshi Nagayama and Yoshio Miki (The Cancer Institute Hospital of JFCR); Akihide Masumoto and Akira Yamada (Aso Iizuka Hospital); Yasuko Nishizawa and Ken Kodama (Osaka Medical Center for Cancer and Cardiovascular Diseases); Yoshihisa Sugimoto and Takashi Ashihara (Shiga University of Medical Science); Yukihiro Koretsune and Sachiko Ikeda (National Hospital Organization, Osaka National Hospital); and Ryozo Yano (Fukujuji Hospital).


1. Hata J., Ninomiya T., Hirakawa Y. Secular trends in cardiovascular disease and its risk factors in Japanese: half-century data from the Hisayama Study (1961–2009) Circulation. 2013;128:1198–1205. [PubMed]
2. Eagle K.A., Lim M.J., Dabbous O.H. A validated prediction model for all forms of acute coronary syndrome: estimating the risk of 6-month postdischarge death in an international registry. JAMA. 2004;291:2727–2733. [PubMed]
3. Fox K.A., Dabbous O.H., Goldberg R.J., Pieper K.S., Eagle K.A., Van de Werf F. Prediction of risk of death and myocardial infarction in the six months after presentation with acute coronary syndrome: prospective multinational observational study (GRACE) BMJ. 2006;333:1091–1094. [PubMed]
4. Morrow D.A., Antman E.M., Charlesworth A. TIMI risk score for ST-elevation myocardial infarction: a convenient, bedside, clinical score for risk assessment at presentation: an Intravenous nPA for treatment of infarcting myocardium early II trial substudy. Circulation. 2000;102:2031–2037. [PubMed]
5. Boersma E., Pieper K.S., Steyerberg E.W. Predictors of outcome in patients with acute coronary syndromes without persistent ST-segment elevation: results from an international trial of 9461 patients. Circulation. 2000;101:2557–2567. [PubMed]
6. Morrow D.A., Antman E.M., Giugliano R.P. A simple risk index for rapid initial triage of patients with ST-elevation myocardial infarction: an InTIME II substudy. Lancet. 2001;358:1571–1575. [PubMed]
7. Lee K.L., Woodlief L.H., Topol E.J. Predictors of 30-day mortality in the era of reperfusion for acute myocardial infarction: results from an international trial of 41,021 patients. Circulation. 1995;91:1659–1668. [PubMed]
8. Antman E.M., Cohen M., Bernink P.J. The TIMI risk score for unstable angina/non-ST elevation MI: a method for prognostication and therapeutic decision making. JAMA. 2000;284:835–842. [PubMed]
9. Lee J., Morishima T., Kunisawa S. Derivation and validation of in-hospital mortality prediction models in ischaemic stroke patients using administrative data. Cerebrovasc Dis. 2013;35:73–80. [PubMed]
10. Smith E.E., Shobha N., Dai D. A risk score for in-hospital death in patients admitted with ischemic or hemorrhagic stroke. J Am Heart Assoc. 2013;2:e005207. [PubMed]
11. Counsell C., Dennis M., McDowall M., Warlow C. Predicting outcome after acute and subacute stroke: development and validation of new prognostic models. Stroke. 2002;33:1041–1047. [PubMed]
12. Smith E.E., Shobha N., Dai D. Risk score for in-hospital ischemic stroke mortality derived and validated within the Get with the Guidelines-Stroke Program. Circulation. 2010;122:1496–1504. [PubMed]
13. Weimar C., Ziegler A., König I.R., Diener H.C. Predicting functional outcome and survival after acute ischemic stroke. J Neurol. 2002;249:888–895. [PubMed]
14. Weimar C., König I.R., Kraywinkel K., Ziegler A., Diener H.C., German Stroke Study Collaboration. Age and National Institutes of Health Stroke Scale score within 6 hours after onset are accurate predictors of outcome after cerebral ischemia: development and external validation of prognostic models. Stroke. 2004;35:158–162. [PubMed]
15. Wang Y., Lim L.L., Levi C., Heller R.F., Fischer J. A prognostic index for 30-day mortality after stroke. J Clin Epidemiol. 2001;54:766–773. [PubMed]
16. Wang Y., Lim L.L., Heller R.F., Fisher J., Levi C.R. A prediction model of 1-year mortality for acute ischemic stroke patients. Arch Phys Med Rehabil. 2003;84:1006–1011. [PubMed]
17. Nagai A., Hirata M., Kamatani Y. Overview of the BioBank Japan project: study design and profiles. J Epidemiol. 2017;27:S2–S8. [PubMed]
18. Hirata M., Kamatani Y., Nagai A. Cross-sectional analysis of BioBank Japan clinical data: a large cohort of 200,000 patients with 47 common diseases. J Epidemiol. 2017;27:S9–S21. [PubMed]
19. Hirata M., Nagai A., Kamatani Y. Overview of BioBank Japan follow-up data in 32 diseases. J Epidemiol. 2017;27:S22–S28. [PubMed]
20. Pencina M.J., D'Agostino R.B. Overall C as a measure of discrimination in survival analysis: model specific population value and confidence interval estimation. Stat Med. 2004;23:2109–2123. [PubMed]
21. D'Agostino R.B., Nam B. Evaluation of the performance of survival analysis models: discrimination and calibration measures. Handbook of Statistics, vol. 23. Elsevier; Amsterdam, The Netherlands: 2004. pp. 1–25.
22. Sullivan L.M., Massaro J.M., D'Agostino R.B., Sr. Presentation of multivariate data for clinical use: the Framingham Study risk score functions. Stat Med. 2004;23:1631–1660. [PubMed]
23. D'Agostino R.B., Sr., Vasan R.S., Pencina M.J. General cardiovascular risk profile for use in primary care: the Framingham Heart Study. Circulation. 2008;117:743–753. [PubMed]
24. NIPPON DATA80 Research Group. Risk assessment chart for death from cardiovascular disease based on a 19-year follow-up study of a Japanese representative population: NIPPON DATA80. Circ J. 2006;70:1249–1255. [PubMed]
25. Arima H., Yonemoto K., Doi Y. Development and validation of a cardiovascular risk prediction model for Japanese: the Hisayama Study. Hypertens Res. 2009;32:1119–1122. [PubMed]
26. Sacco R.L., Benjamin E.J., Broderick J.P. Risk factors. Stroke. 1997;28:1507–1517. [PubMed]
27. Jacobs D., Blackburn H., Higgins M. Report of the conference on low blood cholesterol: mortality associations. Circulation. 1992;86:1046–1060. [PubMed]
28. Iso H., Naito Y., Kitamura A. Serum total cholesterol and mortality in a Japanese population. J Clin Epidemiol. 1994;47:961–969. [PubMed]
29. Matsuzaki M., Kita T., Mabuchi H. Large scale cohort study of the relationship between serum cholesterol concentration and coronary events with low-dose simvastatin therapy in Japanese patients with hypercholesterolemia. Circ J. 2002;66:1087–1095. [PubMed]
30. Kiyohara Y., Kubo M., Kato I. Ten-year prognosis of stroke and risk factors for death in a Japanese community: the Hisayama Study. Stroke. 2003;34:2343–2347. [PubMed]
31. Olsen T.S., Dehlendorff C., Petersen H.G., Andersen K.K. Body mass index and poststroke mortality. Neuroepidemiology. 2008;30:93–100. [PubMed]
32. Vemmos K., Ntaios G., Spengos K. Association between obesity and mortality after acute first-ever stroke: the obesity-stroke paradox. Stroke. 2011;42:30–36. [PubMed]
33. Doehner W., Schenkel J., Anker S.D., Springer J., Audebert H.J. Overweight and obesity are associated with improved survival, functional outcome, and stroke recurrence after acute stroke or transient ischaemic attack: observations from the TEMPiS trial. Eur Heart J. 2013;34:268–277. [PubMed]

Articles from Journal of Epidemiology are provided here courtesy of Elsevier