We sought to validate a recently published risk algorithm for incident atrial fibrillation (AF) in independent cohorts and other race/ethnic groups.
We evaluated the performance of a Framingham Heart Study (FHS)-derived risk algorithm modified for 5-year incidence of AF in the FHS (n=4764 participants) and two geographically and ethnically diverse cohorts: AGES (Age, Gene/Environment Susceptibility-Reykjavik Study, n=4238) and CHS (Cardiovascular Health Study, n=5410, of whom 874 [16.2%] were African Americans [AA]). Participants were aged 45–95 years. The risk algorithm included age, sex, body mass index, systolic blood pressure, electrocardiographic PR-interval, hypertension treatment, and heart failure.
We observed 1359 incident AF events in 100,074 person-years of follow-up. Unadjusted five-year event-rates differed by cohort (AGES 12.8 cases/1000 person-years; CHS whites 22.7 cases/1000 person-years; FHS 4.5 cases/1000 person-years) and race/ethnicity (CHS AA 18.4 cases/1000 person-years).
The strongest risk factors in all samples were age and heart failure. The relative risks for incident AF associated with risk factors were comparable across cohorts and race groups. After recalibration for baseline incidence and risk factor distribution, the Framingham algorithm performed reasonably well in all samples (AGES C-statistic 0.67, 95% confidence interval 0.64–0.71; CHS whites, 0.68, 0.66–0.70; CHS AA 0.66, 0.61–0.71). Risk factors combined in the algorithm explained between 47.0% (AGES) and 63.6% (FHS) of the population attributable risk.
Risk of incident AF in community-dwelling whites and AA can be assessed reliably by routinely available and potentially modifiable clinical variables. Seven risk factors accounted for up to 64% of risk.
The prevalence and incidence of atrial fibrillation (AF) have been increasing over the last several decades.1,2 The improved assessment of risk for incident AF was formulated as one of the major goals of a recently convened National Heart, Lung, and Blood Institute workshop.3 A risk algorithm based on readily available clinical variables for 10-year incidence of AF in Framingham Heart Study (FHS) participants (http://www.framinghamheartstudy.org/risk/index.html) has been published.4 Transportability to independent cohorts and other ethnicities with different incidence rates and distributions of risk factors has to be shown before general recommendations for the use of the risk algorithm can be given. In particular, in African Americans (AA), a paradoxically low prevalence of AF has consistently been reported despite a high risk factor burden.5,6 Thus, it is important to understand how the classical risk factors for AF combined in a risk prediction algorithm are associated with risk in African Americans.
We tested a risk algorithm for AF incidence developed in the Framingham Heart Study in two large, independent community-based cohorts from the US (Cardiovascular Health Study, CHS) and Europe (Age, Gene/Environment Susceptibility-Reykjavik Study, AGES). In CHS we had the opportunity to examine risk factor prevalence and association with incident AF in whites and AA. An accurate risk assessment tool is necessary to address the increasing burden of AF in the community by facilitating the identification of individuals at increased absolute risk to potentially target for intervention trials. With the current project we intended to take the second step of a risk algorithm implementation: the validation of the risk function in samples independent of the derivation cohort.
Overall, we examined data from n=14,412 individuals (AGES, n=4238; CHS, n=5410; and FHS, n=4764); FHS was the derivation sample. Participants were excluded if they were aged <45 or >95 years at baseline, had prevalent AF at baseline, or were missing data for any of the following risk factors: age, sex, body mass index, systolic blood pressure, treatment for hypertension, electrocardiographic PR interval, or history of heart failure. All studies were approved by institutional review boards from the participating institutions. All participants provided written informed consent.
Data from the Age, Gene/Environment Susceptibility Reykjavik Study (AGES) were based on men and women recruited between 2002 and 2006 (N=5,764). The participants were the survivors of the Reykjavik Study, which was conducted between 1967 and 1996. All men and women (N=30,795) living in the greater Reykjavik area, and born in 1907–1935, were selected into the Reykjavik Study cohort and a random sample invited (5/6 of the cohort). The response rate was 71% (N=19,381).7
Standard examination protocols and questionnaires were used in the AGES study. Clinic visits included anthropometry, blood pressure measurement, electrocardiogram, and measures of different physical and cognitive function domains. The diagnosis of heart failure was based on hospital discharge records. Physical examination for valvular heart disease was not performed. Information on vital events and cardiovascular disease has been recorded continuously since study inception, supplemented by registries of vital status and cardiovascular disease and by hospital records coded according to the International Classification of Diseases (ICD), ninth revision, and the International Statistical Classification of Diseases and Related Health Problems, tenth revision. Prevalent AF or atrial flutter was diagnosed at the AGES visit or by ICD-9 and ICD-10 codes on hospital admission before the AGES visit. Incident AF was identified by hospital admission ICD-10 code.
The Cardiovascular Health Study (CHS) is an observational cohort study of risk factors for coronary disease and stroke in the elderly.8 In 1989–1990, four field centers recruited a total of 5201 people 65 years of age or older from Medicare eligibility lists in four communities (Forsyth County, NC; Sacramento County, CA; Washington County, MD; Pittsburgh, PA) in the United States. To enhance minority representation, in 1992–1993, 687 AA participants were recruited in three of the four field centers.
Participants had annual examinations including assessment of cardiovascular risk factors, prior cardiovascular disease, medications, height, weight, seated blood pressure, and a 12-lead electrocardiogram through 1999. At the baseline examination in 1989–1990 only, cardiac murmur was assessed and recorded as any diastolic or systolic murmur. Racial/ethnic origin was by self-report. Because most of the AA participants were recruited in 1992–1993, they did not have cardiac auscultation at baseline. A history of heart failure at baseline was defined by signs, symptoms, clinical tests, physician diagnosis, and/or medical therapy.9
As in the derivation sample4 for the AF risk algorithm, participants from the middle-aged to elderly white Framingham Heart Study (FHS) Original (examination cycles 11, n=2955, and 17, n=2179) and Offspring (examination cycles 1, n=5124, and 3, n=3873) cohorts were eligible. Standardized physician-administered questionnaires provided information on risk factors, medications and health behaviors. Anthropometric measures, blood pressures and 12-lead electrocardiograms were taken at every FHS clinic visit. Valvular heart disease was diagnosed by physician-auscultated grade ≥3 out of 6 systolic or any diastolic murmur. Information on cardiovascular outcomes and medications was updated by regular questionnaires during FHS clinic visits and biennial health updates. In case of cardiovascular events, including heart failure, outpatient charts and hospital discharge records were collected and underwent adjudication by Framingham physicians based on previously published clinical criteria.10 Ten-year follow-up information was used to derive the previously published AF risk algorithm.4
In the three cohorts, incident AF was diagnosed on the date that atrial fibrillation or atrial flutter was first present on study electrocardiogram, at the date of hospital admission if an ICD-9 or ICD-10 discharge diagnosis code for AF or atrial flutter was assigned, or in FHS only, if sufficient evidence was available for AF based on hospital or general practitioner records, according to expert opinion. Prevalent atrial fibrillation was based on the diagnosis of atrial fibrillation at or prior to the baseline examination. AF ascertainment took place between 2002–2008 in AGES, 1989–2005 in CHS and 1968–1992 in FHS.
Risk factor selection was based on the recently-published AF risk algorithm developed in the FHS and included age, sex, body mass index, systolic blood pressure, hypertension treatment, electrocardiographic PR-interval, and prevalent heart failure.4 Descriptive statistics were produced for each cohort (and each racial group) considered separately. Five-year (and 10-year in Framingham and CHS) estimates of AF rates were produced for each cohort using the Kaplan Meier method.
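As an illustration of the Kaplan-Meier method mentioned above, the following minimal Python sketch computes a product-limit survival curve and the cumulative incidence at a fixed horizon. The function names and the toy follow-up data are hypothetical; they are not taken from the study, and the analyses reported here were run in SAS and Stata.

```python
from typing import List, Tuple


def kaplan_meier(times: List[float], events: List[int]) -> List[Tuple[float, float]]:
    """Return (time, survival) pairs at each distinct event time.

    times  : follow-up time for each participant
    events : 1 if AF was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            # Product-limit step: multiply by the conditional survival at t.
            survival *= 1.0 - n_events / n_at_risk
            curve.append((t, survival))
        n_at_risk -= n_leaving
    return curve


def cumulative_incidence(curve: List[Tuple[float, float]], horizon: float) -> float:
    """Return 1 - S(horizon), reading S from the step function."""
    s = 1.0
    for t, surv in curve:
        if t > horizon:
            break
        s = surv
    return 1.0 - s


# Toy data (years of follow-up, event indicator); purely illustrative.
toy_times = [1, 2, 2, 3, 4, 5]
toy_events = [1, 0, 1, 0, 1, 0]
toy_curve = kaplan_meier(toy_times, toy_events)
print(toy_curve)
print(cumulative_incidence(toy_curve, horizon=3))
```

Participants censored at the same time as an event are counted as still at risk at that time, which is the usual convention.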
Due to shorter follow-up periods in AGES, we re-estimated the model in FHS to assess risk factors for incident AF over 5 years and truncated FHS follow-up at 5 years; death and event-free follow-up of more than 5 years were censoring events. In CHS and FHS, if follow-up was available for more than 11 years, multiple 5-year intervals for individuals were included if all of the inclusion and none of the exclusion criteria were met. It has been shown that pooled Cox models can be assumed to be robust.11 The proportionality of hazards assumption was not violated.
Model performance was examined in several steps. First, a Cox proportional hazards function was estimated within each cohort, relating incident AF (over 5 years of follow-up) to the following risk factors derived from the published Framingham risk algorithm:4 age, age², male sex, body mass index, current treatment for hypertension, PR-interval, history of heart failure, male sex*age², and age*history of heart failure. Two terms in the original FHS risk function (valvular heart disease and age*valvular heart disease) were not included in this Cox model because data on valvular heart disease at baseline were missing for all AGES participants and for most CHS AA participants. We therefore calculated a model based on Framingham data relating 5-year incidence of AF to risk factors excluding valvular heart disease. In a second step, this Framingham 5-year AF risk function (including the beta estimates, baseline incidence, and risk factor mean values from the new Framingham risk function) was applied to each cohort to produce estimates of 5-year risk of AF in each cohort. In the final step, we accounted for the respective cohort’s baseline survival and risk factor means to improve model fit in an adjusted model.12
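The second and third steps can be sketched with the standard form of a Cox-based risk function, in which predicted risk is 1 − S0(t) raised to the power exp(Σ β·(x − mean)). The Python sketch below assumes that form; all coefficients, means, and baseline survival values are made-up placeholders rather than estimates from this study. Recalibration here simply means keeping the derivation betas while substituting the target cohort's own baseline survival and risk factor means.

```python
import math
from typing import Dict


def predicted_risk(x: Dict[str, float],
                   betas: Dict[str, float],
                   means: Dict[str, float],
                   baseline_survival: float) -> float:
    """Risk = 1 - S0 ** exp(sum of beta * (x - mean)) for one individual."""
    linear_predictor = sum(betas[k] * (x[k] - means[k]) for k in betas)
    return 1.0 - baseline_survival ** math.exp(linear_predictor)


# Hypothetical values for illustration only (not study estimates).
betas = {"age": 0.08, "heart_failure": 1.0}
derivation_means = {"age": 60.0, "heart_failure": 0.01}
derivation_s0 = 0.98   # assumed 5-year AF-free survival at the derivation means
cohort_means = {"age": 75.0, "heart_failure": 0.03}
cohort_s0 = 0.93       # assumed 5-year AF-free survival at the cohort means

person = {"age": 80.0, "heart_failure": 1.0}

# Step 2: apply the derivation function unchanged.
unadjusted = predicted_risk(person, betas, derivation_means, derivation_s0)
# Step 3: same betas, but the cohort's own means and baseline survival.
recalibrated = predicted_risk(person, betas, cohort_means, cohort_s0)
print(unadjusted, recalibrated)
```

By construction, an individual sitting exactly at the cohort means receives a recalibrated risk equal to one minus the cohort's baseline survival, which is what anchors the predictions to the cohort's own incidence.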
In the different cohorts and separately for whites and AA in CHS, model discrimination was estimated by C statistics. Calibration was assessed by agreement between predicted and observed 5-year event rates in deciles of predicted risk using a modified Hosmer-Lemeshow chi-square statistic for survival analysis.13
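For readers unfamiliar with the C statistic for survival data, the following Python sketch implements Harrell's concordance index, one common estimator. Whether this exact variant was used in the present analyses is an assumption; the function name and toy data are hypothetical.

```python
from typing import List


def harrell_c(times: List[float], events: List[int], risk: List[float]) -> float:
    """Fraction of usable pairs in which the earlier event has the higher risk.

    A pair (i, j) is usable when participant i had an event and a strictly
    shorter follow-up time than participant j. Tied risks count as half.
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue
        for j in range(n):
            if times[i] < times[j]:
                usable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    if usable == 0:
        raise ValueError("no usable pairs: C statistic is undefined")
    return concordant / usable


# Toy data: higher predicted risk goes with earlier events.
toy_times = [1, 2, 3, 4]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.7, 0.5, 0.1]
print(harrell_c(toy_times, toy_events, toy_risk))
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, so the values of 0.66 to 0.68 reported in this study indicate moderate discrimination.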
Population attributable risk was calculated for the risk factors combined in the risk algorithm using the previously applied risk categories of <5% (referent), 5–10% and >10% risk. We used the approach described by Hanley et al. to derive the population attributable fraction.14 Statistical analyses were conducted using SAS version 9.1 (SAS Institute: Cary, North Carolina) and Stata version 10 (StataCorp, College Station, TX). A two-sided P<0.05 was assumed to show statistical significance.
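As a rough illustration of how an attributable fraction can be computed across several risk categories, the sketch below uses a widely used case-based formula: the sum, over categories, of the share of cases in that category multiplied by (RR − 1)/RR. That this matches the exact computation performed here is an assumption, and the category shares and relative risks are invented for the example.

```python
from typing import Dict


def attributable_fraction(case_share: Dict[str, float],
                          relative_risk: Dict[str, float]) -> float:
    """Case-based population attributable fraction over risk categories.

    case_share    : proportion of all cases falling in each category
    relative_risk : relative risk of each category versus the referent
                    (the referent has relative risk 1 and contributes 0)
    """
    return sum(case_share[c] * (relative_risk[c] - 1.0) / relative_risk[c]
               for c in case_share)


# Hypothetical inputs for the three predicted-risk categories.
toy_case_share = {"<5%": 0.4, "5-10%": 0.3, ">10%": 0.3}
toy_relative_risk = {"<5%": 1.0, "5-10%": 2.0, ">10%": 4.0}
print(attributable_fraction(toy_case_share, toy_relative_risk))
```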
Secondary analyses were performed using 10-year follow-up intervals in CHS whites only, and included the valvular heart disease variable defined by cardiac murmur at baseline to provide a direct comparison to the original FHS AF risk function. In exploratory analyses in FHS, we also examined whether a simpler, easier-to-interpret model without the interaction terms (interactions for age and sex) performed equivalently to the published Framingham risk algorithm. In addition, we explored whether there were non-linear associations with age using a general additive model and spline functions.
The number of individuals excluded because of prevalent AF was n=927 (AGES, n=568; CHS whites, n=144; CHS AA, n=13; FHS, n=202). Further reasons for exclusion by cohort are detailed in Supplementary Figure 1. Baseline characteristics are provided in Table 1 for the study cohorts with a total of 22,088 observations on 14,412 participants over 100,074 person-years of follow-up; in CHS, AA and whites are presented separately. AGES and CHS were similar in mean age, but the mean age in FHS was approximately 15 years younger. The number of events observed during 5-year follow-up intervals was 226 in AGES, 832 in CHS whites, 126 in CHS AA, and 175 in Framingham. The percentage of males ranged from 35.8% in CHS AA to 44.6% in FHS. The unadjusted prevalence of most of the risk factors was highest in CHS AA (e.g. high body mass index, heart failure, long PR-interval) except for systolic blood pressure and hypertension treatment, which were highest in the older AGES cohort. The unadjusted 5-year AF incidence rates differed across cohorts with the highest incidence observed in CHS whites (22.7 cases/1000 person-years) and lowest incidence rate in FHS (4.5 cases/1000 person-years). Cumulative incidence rates across cohorts by age are displayed in Figure 1.
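The event counts above can be checked against the overall total with a few lines of arithmetic. The Python sketch below sums the per-cohort events and computes a pooled crude rate per 1000 person-years; the pooled rate is a derived figure for orientation only, since it mixes cohorts with very different age structures and ascertainment periods.

```python
# Incident AF events during 5-year follow-up intervals, as reported in the text.
events_by_cohort = {"AGES": 226, "CHS whites": 832, "CHS AA": 126, "FHS": 175}
total_person_years = 100_074


def crude_rate_per_1000(n_events: int, person_years: float) -> float:
    """Crude incidence rate expressed per 1000 person-years."""
    return 1000.0 * n_events / person_years


total_events = sum(events_by_cohort.values())
pooled_rate = crude_rate_per_1000(total_events, total_person_years)
print(total_events)           # 1359
print(round(pooled_rate, 2))  # 13.58
```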
Age- and sex-adjusted Cox models (Table 2) revealed a similar strength of association of the different risk factors across cohorts and race. The strongest single risk factors were age with an approximately 2-fold increase in AF incidence per decade, and prevalent heart failure with an almost three-fold higher risk. In AA, male sex and PR-interval did not reach statistical significance, but showed point estimates comparable to the results in whites.
The best-performing Cox models (in terms of calibration and discrimination), which used the covariates established by Framingham but were fitted to each study’s own data, reached a C-statistic of 0.68 in all cohorts (Table 3). As expected, discrimination and calibration were best in the FHS derivation sample. In the replication cohorts, Cox models using the Framingham risk function without modifications exhibited lower discrimination statistics but were improved by recalibration for the higher baseline incidence rates and mean risk factor distributions. The C-statistic point estimates after adjustment were 0.67 (AGES), 0.68 (CHS whites), and 0.66 (CHS AA), and were similar to the C-statistics for the model developed from each study’s own data. There was no statistical difference between the C-statistics of CHS whites and AA (P=0.47). Calibration was good in AGES and CHS AA. In CHS whites, the unadjusted χ² statistic was high (456.0), but improved after adjustment for the study’s means of risk factors and baseline survival.
Compared to individuals in the lowest risk category (<5% 5-year risk of AF), participants in the category with >10% risk of developing AF had an up to 7.5-fold higher risk and contributed between 27.9% (AGES) and 50.7% (CHS whites) of the population attributable risk (Table 4). The seven risk factors combined constituted between 47.0% (AGES) and 63.6% (FHS) of the population attributable risk of AF in whites and 58.8% in AA.
For 10-year AF incidence, Cox proportional hazards regression coefficients of the respective studies are provided in Supplementary Table 2, and discrimination and calibration statistics in Supplementary Table 3.
The elimination of the age-squared term and the interactions of the original model in FHS did not influence the discrimination statistics but slightly reduced the calibration of the model. Optimal model fit was achieved by leaving the interactions in the function. We tested the shape of association of age with AF and did not observe a significant non-linear term in FHS data (Supplementary Figure 2).
We validated a risk function for the prediction of incident AF, which was originally developed in the middle-aged to elderly Framingham cohort of white Americans, in two large independent studies from the US and Europe. The risk algorithm worked reasonably well for 5-year risk prediction after calibration for the underlying event rates. We were able to extend these findings to AA. The hazard ratios for specific risk factors were comparable across cohorts. Discrimination of the Framingham AF risk model was consistent across groups and calibration was satisfactory after adjustment. The risk algorithm may thus provide a tool applicable across a broad range of individuals at risk for AF.
The AF incidence observed across studies showed differences that may be explained by several factors. First, the age structure varied across cohorts, with FHS being the youngest cohort. In secondary analyses we investigated whether the relation of age with incident AF deviates from linearity, but failed to discover non-linear associations over the age range (45–95 years) examined. Second, the years during which AF was ascertained differed by cohort: AF was ascertained between the 1960s and early 1990s in FHS, but during 1989–2008 in CHS and AGES. There may be secular trends in the diagnosis and coding of AF that favor increased recognition of this arrhythmia in more recent years. Third, CHS and Framingham had more vigorous ascertainment of AF cases than AGES, which relied on hospital discharge diagnoses, perhaps leading to greater misclassification of AF cases. Finally, we demonstrated lower AF incidence in AA compared with their white CHS counterparts of the same age distribution, a finding that is in accordance with prior observations of lower AF prevalence in AA.15,16
Previous replication attempts of Framingham risk scores for coronary heart disease events revealed good reproducibility, both in similarly-structured as well as less-comparable cohorts and different ethnic groups.12,17,18 Due to differing cohort characteristics and baseline event rates in other samples, recalibration is usually necessary to achieve better model fit, as was observed in the current analysis. The age distribution in FHS and the other two cohorts also provides a likely explanation for the difference in discrimination observed across cohorts. Recalibration and adjustment for baseline survival in the respective cohorts is another way of accounting for differences in baseline characteristics of the samples. In secondary analyses, we examined whether a more parsimonious model without the interaction terms for age and sex would simplify the risk function. The elimination of these additional terms did not change the discrimination ability of the model, as was expected,12 but reduced the calibration performance. For this reason, we recommend leaving the age-squared and interaction terms in the algorithm. Overall, the algorithm performed well with good calibration and discrimination underlining the central role of risk factors such as age, sex, elevated blood pressure, and heart failure.16,19,20
We were able to confirm the role of electrocardiographic PR-interval as an AF risk factor. Atrial conduction defects have been suggested to constitute precursors of a reduced threshold for AF21,22 and the knowledge on abnormalities in atrial electrical activity may help to better understand the pathophysiology of imminent AF.23
Important from the perspective of primary prevention is that risk factors such as body mass index, high blood pressure, and heart failure are modifiable or treatable and thus accessible to intervention. They may thus provide direct targets for prevention of AF or, at least, the delay of disease onset.
Although not all risk factors reached statistical significance in age- and sex-adjusted models due to a small sample size and resulting wide confidence intervals, the point estimates for the hazard ratios in AA were similar to those in whites. The risk algorithm performed similarly in both races. The distribution of risk factors for AF in AA was similar or even higher for unfavorable risk factors compared to whites, confirming earlier reports.15,24 For example, hypertension as one of the major predictors of AF in whites was more frequent in AA15,24 and revealed hazard ratios for AF comparable to whites as shown by our data but did not translate into a higher AF incidence.
Regarding the risk factor associations and distribution, our data suggest differences in incidence rate of AF in AA rather than a completely different set of variables that account for AF risk. However, additional factors may be responsible for differences of AF risk between races and need to be identified and evaluated. Genetic association studies, for example, may show whether there is a genetically-determined predisposition to AF beyond classical AF risk factors.
The comparable strength of risk factors in different ethnicities emphasizes their central importance and potential direct role in the disease process. Similar risk factors in both sexes and different ethnicities may facilitate risk communication and the development of uniform concepts of prevention. Similar to other risk algorithms, the risk score may help to identify individuals at high risk for AF and at the same time provide a starting-point for active prevention since some of the clinical risk factors included in the algorithm are modifiable. Whether the risk function can be applied effectively for the identification of participants for clinical intervention trials needs to be examined.
We acknowledge several limitations to our study. Observed differences in AF incidence beyond different age ranges and real incidence differences in the cohorts may be due to secular trends in the diagnosis and coding of AF and to differences in AF adjudication and intensity of collection of follow-up data. The performance of the risk algorithm indicates that the risk function seems to be robust against minor systematic misclassifications and real sample-specific differences.
Unfortunately, information on valvular heart disease, one of the strongest risk factors for atrial fibrillation, was not available in AGES and in most of the CHS AA. Reliance on physical examination for heart murmur (versus echocardiography) may have led to misclassification of valvular heart disease in both CHS whites and FHS. Whereas in FHS significant valvular heart disease was considered in a graded fashion, heart murmur was classified as present-versus-absent in CHS. The different classification reduces the comparability of the two studies, as is evident in the different prevalence and smaller hazard ratio related to cardiac murmur in CHS. Severe valvular heart disease is uncommon (<5% prevalence) in the community. Although its diagnosis is associated with a high relative risk, the population attributable risk is low, which may help to explain why the risk algorithm achieved similar accuracy to the FHS function in the replication samples even without the valvular heart disease variable. Similarly, the definition of heart failure and thus the baseline prevalence differed between cohorts. Rigorously-adjudicated heart failure events in CHS and Framingham compared with hospital discharge diagnoses in AGES may have led to somewhat different relations between heart failure and AF.9 Again, at the community level, the prevalence of heart failure was low and the slightly different definitions did not impair discrimination and calibration markedly. Overall, only the prospective application of the risk algorithm and the development of effective strategies to prevent AF will provide support for the utility of the risk function.
The utility of a risk prediction algorithm ultimately depends on several factors, including whether or not: 1) the algorithm accurately classifies individual risk; 2) effective preventive therapies for AF are available; and 3) targeting preventive therapies to level of risk improves outcome in a cost-effective way. The present study is an effort to develop a robust transportable prediction instrument. Prior to demonstrating improved outcomes, the risk prediction instrument may be useful to identify high-risk individuals for primary prevention trials, or as a screen to identify whether putative biological or genetic markers aid in risk stratification over and above easily assessed clinical factors.
We have demonstrated that an individual’s absolute AF risk can reliably be assessed in independent, community-based samples of different age structure and ethnic background based on easily-accessible clinical variables. It needs to be shown whether the application of the risk algorithm and the knowledge of the relative importance of the potentially modifiable risk factors can reduce the number of incident AF cases.
Susan Heckbert and all co-authors had full access to all of the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis. Susan Heckbert had final responsibility for the decision to submit for publication.
Sources of funding
The AGES-Reykjavik study was funded by National Institutes of Health contract N01-AG-12100, the National Institute on Aging Intramural Research Program, Hjartavernd (the Icelandic Heart Association), and the Althingi (the Icelandic Parliament).
The CHS research reported in this article was supported by N01-HC-85079 through N01-HC-85086, N01-HC-35129, N01 HC-15103, N01 HC-55222, U01 HL080295, R01 HL068986, and R01 HL087652 from the National Heart, Lung, and Blood Institute, with additional contribution from the National Institute of Neurological Disorders and Stroke. A full list of principal CHS investigators and institutions can be found at http://www.chs-nhlbi.org/pi.htm. FHS research was supported by NIH/NHLBI contract N01-HC-25195 and NIH grants AG028321, AG029451; HL092577; RC1 HL101056; R01 NS 17950. NIH Research career award 2K24 HL04334; Deutsche Forschungsgemeinschaft (German Research Foundation) Research Fellowship SCHN 1149/1-1.
There are no conflicts of interest to be reported by the authors.