Can J Cardiol. 2009 July; 25(7): e232–e235.
PMCID: PMC2723032

Multiple imputation for missing cardiac magnetic resonance imaging data: Results from the Multi-Ethnic Study of Atherosclerosis (MESA)

Abstract

BACKGROUND:

Cardiac magnetic resonance imaging (MRI) is a non-invasive technique used to accurately and reproducibly measure biological parameters such as left ventricular mass. However, some subjects either refuse or are unable to complete testing, and the impact of excluding these missing data from predictive models is unknown.

METHODS:

Multiple imputation was applied to cardiac MRI data that were previously analyzed using a complete case approach. The model variables – 10 traditional cardiovascular risk factors and five sociodemographic variables – were used as a basis for imputation. Men and women were imputed separately. The primary focus was assessing the change in the cardiovascular predictors of left ventricular geometry and systolic function.

RESULTS:

Although 27% of participants were missing cardiac MRI data, multiple imputation returned results similar to those of a complete case analysis. These results were robust to the inclusion of additional variables in the imputation analysis beyond the model variables. The degree of variance explained by the models increased marginally, but with multiple imputation the statistical inference was altered for only two of the 53 predictors tested.

DISCUSSION:

The results suggest that the cardiac MRI data in the Multi-Ethnic Study of Atherosclerosis (MESA) do not substantively change when missing data are handled using multiple imputation. Future analyses of cardiac MRI data may consider the complete case approach to be adequate despite the high rate of missing data in this population.

Keywords: Comparison of methods, Complete case, Magnetic resonance imaging, Multiple imputation

Cardiac magnetic resonance imaging (MRI) is a noninvasive technique used to accurately and reproducibly measure biological parameters such as left ventricular (LV) mass (1). The Multi-Ethnic Study of Atherosclerosis (MESA) (2) described the first large-scale application of cardiac MRI in a multiethnic cohort study.

However, a nontrivial number of participants did not undergo an MRI for reasons such as claustrophobia, metal objects in the body or refusal to undergo the procedure (1). Of the 6814 MESA participants, 5004 (73%) completed the cardiac MRI procedure and had technically adequate data. This level of missing data is above the threshold at which missing data in other medical tests have been shown to introduce biased results (3). Therefore, it is important to assess whether the missing data have an impact on the results obtained from cardiac MRI data.

When a substantial proportion of the data is missing, the most robust current approach is multiple imputation, which tends to outperform alternative methods (4–7). However, in some cases, the reasons for missing data may be unrelated to the study outcome or to the predictors of the outcome. In these cases, in which the data are missing completely at random, less sophisticated approaches, such as complete case analysis, may be equally valid (5).
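The distinction between missingness that is completely random and missingness that is related to the outcome can be illustrated with a small simulation. The sketch below is not part of the MESA analysis: the data-generating model (y = 2x + noise), the 27% deletion rate and the function names are all assumptions chosen for illustration. It fits a complete case regression after deleting outcomes under two mechanisms.

```python
import random

def ols_slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

def complete_case_slope(mechanism, n=20000, seed=0):
    """Simulate y = 2x + noise, delete outcomes, refit on complete cases."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0, 1)
        y = 2.0 * x + rng.gauss(0, 1)
        if mechanism == "mcar":
            # Missingness is unrelated to x or y (hypothetical 27% rate).
            missing = rng.random() < 0.27
        elif mechanism == "outcome_dependent":
            # Missingness depends on the value that would have been measured.
            missing = y > 1.0
        else:
            raise ValueError(mechanism)
        if not missing:
            xs.append(x)
            ys.append(y)
    return ols_slope(xs, ys)

slope_mcar = complete_case_slope("mcar")
slope_dependent = complete_case_slope("outcome_dependent")
print(f"MCAR: {slope_mcar:.2f}, outcome-dependent: {slope_dependent:.2f}")
```

In this toy setting the complete case slope stays close to the true value of 2 when deletion is completely random, and is attenuated when deletion depends on the outcome itself. Real data rarely reveal which mechanism applies, which is why a comparison against multiple imputation is informative.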

The goal of the present study was to compare the associations of traditional cardiovascular risk factors with cardiac MRI measures when estimated with and without multiple imputation to account for missing data.

METHODS

MESA was a longitudinal, population-based study of 6814 men and women 45 to 84 years of age, from four distinct ethnic groups (African American, Asian, Caucasian and Hispanic). These participants, all without a clinically recognized cardiovascular disease (CVD) at baseline, underwent evaluations of demographics, risk factors and degree of subclinical CVD (2). The MESA study intended to detect risk factors that predict the transition from subclinical CVD to overt CVD in a multiethnic cohort.

The study was based on all 6814 MESA participants at baseline, of whom 5004 had documented cardiac MRI data. Of the 5004 participants with cardiac MRI data, 4888 (98%) had no missing information on cardiac risk factors and were therefore suitable for a complete case analysis. There were no partially missing cardiac MRI data: all participants with any cardiac MRI measure had all measures available. The number of participants with missing data other than MRI data was so low (2%) that all missing data techniques should provide equivalent estimates (3). The cohort analyzed in the present study was previously described (1), as were the imaging techniques used (8).
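The sample flow reported above can be checked with a few lines of arithmetic. The counts are taken from the text; the variable names are illustrative only.

```python
# Counts reported in the text.
total_participants = 6814
with_mri = 5004
complete_cases = 4888

# Derived quantities.
missing_mri = total_participants - with_mri
pct_with_mri = round(100 * with_mri / total_participants)
pct_missing_mri = round(100 * missing_mri / total_participants)
pct_complete_among_mri = round(100 * complete_cases / with_mri)
pct_missing_covariates = round(100 * (with_mri - complete_cases) / with_mri)

print(missing_mri, pct_with_mri, pct_missing_mri,
      pct_complete_among_mri, pct_missing_covariates)
```

The derived figures agree with the reported 73% completion rate, the 27% of participants missing cardiac MRI data, and the 98% and 2% split among those with MRI data.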

To make the present study’s results as comparable as possible to previous work, the models of Heckbert et al (1), who applied a complete case approach to cardiac MRI data, were used. Heckbert et al estimated the demographic and cardiovascular predictors of LV geometry and function. Using the same models allowed the results obtained from multiple imputation to be contrasted directly with the previously published complete case results. All MRI outcome data for LV mass, LV end-diastolic volume, LV stroke volume, LV ejection fraction and cardiac output were assessed for normality to determine whether a transformation of the data was required to support using a multivariate normal approach to the imputation.

All models of LV geometry and function were estimated using linear regression, following the approach of the initial report by Heckbert et al (1). Imputation was conducted separately by sex because there was some evidence of effect modification. Imputing separately by sex implicitly included all interactions with sex in the imputation model. The inclusion of effect modification was rejected by Heckbert et al (1) after a Bonferroni correction for multiple testing (9). However, over-specification of an imputation model may improve parameter estimates (10) and, in general, it is common to use a more richly specified imputation model than the final analytical model (7,10).
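The effect of imputing separately by sex can be sketched with a deliberately simplified stand-in for the study's method. The code below uses single stochastic regression imputation within each stratum, not the multivariate normal Markov chain Monte Carlo procedure used in the study, and the data, slopes, field names and function names are all hypothetical. Because each stratum gets its own regression, a predictor's relationship with the outcome is free to differ by sex in the imputed values.

```python
import random

def fit_line(xs, ys):
    """Return intercept, slope and residual SD of a simple linear regression."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    return intercept, slope, (rss / (n - 2)) ** 0.5

def impute_by_stratum(records, rng):
    """Fill missing 'y' using a regression fitted within each 'sex' stratum."""
    completed = []
    for sex in sorted({r["sex"] for r in records}):
        stratum = [r for r in records if r["sex"] == sex]
        observed = [r for r in stratum if r["y"] is not None]
        a, b, sd = fit_line([r["x"] for r in observed],
                            [r["y"] for r in observed])
        for r in stratum:
            was_missing = r["y"] is None
            # Draw from the stratum-specific regression plus residual noise.
            y = a + b * r["x"] + rng.gauss(0, sd) if was_missing else r["y"]
            completed.append({"sex": sex, "x": r["x"], "y": y,
                              "imputed": was_missing})
    return completed

# Hypothetical data: the x-y slope differs by sex (3 for men, 1 for women).
rng = random.Random(1)
TRUE_SLOPE = {"M": 3.0, "F": 1.0}
records = []
for sex in ("M", "F"):
    for _ in range(2000):
        x = rng.gauss(0, 1)
        y = TRUE_SLOPE[sex] * x + rng.gauss(0, 1)
        records.append({"sex": sex, "x": x,
                        "y": None if rng.random() < 0.27 else y})

completed = impute_by_stratum(records, rng)
imputed_slope = {}
for sex in ("M", "F"):
    rows = [r for r in completed if r["sex"] == sex and r["imputed"]]
    imputed_slope[sex] = fit_line([r["x"] for r in rows],
                                  [r["y"] for r in rows])[1]
print(imputed_slope)
```

In this toy example the imputed values in each stratum follow that stratum's own slope. A single pooled regression without a sex interaction would instead pull both groups toward a common slope.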

The primary multiple imputation was based on the same variables that were used as possible predictors of the MRI outcome data, which included the following sociodemographic variables and cardiovascular risk factors: age, sex, race/ethnicity, clinic site, height, systolic blood pressure, diastolic blood pressure, current smoking, alcohol use, exercise, body mass index, low-density lipoprotein cholesterol, high-density lipoprotein cholesterol, impaired fasting glucose and diabetes. An expanded multiple imputation model was then considered, which also included family history of heart attack or stroke, high school education or less, the Framingham risk score for heart disease, waist circumference, serum and urine creatinine, blood glucose, heart rate, urine albumin, triglycerides, common carotid intima-media thickness, an emotional and social support index, and the Spielberger trait anger (11) and anxiety (12) scales. Finally, all classes of medications (n=45) used at baseline by at least 1% of the study population were considered in the imputation model.

The multiple imputation was performed using SAS version 9.1.3 (SAS Institute Inc, USA). Mixed-chain imputation was used with 1000 burn-in iterations and the Markov chain Monte Carlo option. Time plots and autoregression plots were assessed as diagnostic checks. Results were imputed separately for men and women, and included all variables (including outcome variables) in the imputation step (13). Ten imputed datasets were used to ensure that the effect estimates were not overly inaccurate due to Monte Carlo variability (14).
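Estimates from several imputed datasets are conventionally combined with Rubin's rules, and the loss of precision from using a finite number of imputations can be approximated by Rubin's relative efficiency formula. The sketch below illustrates both. It is not the SAS code used in the study, the input numbers are invented for illustration, and using 0.27 as the fraction of missing information is an assumption (the fraction of missing information is not the same quantity as the fraction of missing data).

```python
import math
from statistics import mean, variance

def pool_rubin(estimates, variances):
    """Pool m point estimates and their squared standard errors."""
    m = len(estimates)
    q_bar = mean(estimates)              # pooled point estimate
    w_bar = mean(variances)              # within-imputation variance
    b = variance(estimates)              # between-imputation variance
    total = w_bar + (1 + 1 / m) * b      # total variance
    if b > 0:
        df = (m - 1) * (1 + w_bar / ((1 + 1 / m) * b)) ** 2
    else:
        df = math.inf
    return q_bar, math.sqrt(total), df

def relative_efficiency(fraction_missing_info, m):
    """Efficiency of m imputations relative to infinitely many."""
    return 1 / (1 + fraction_missing_info / m)

# Invented coefficient estimates from 10 imputed datasets, each with SE 0.2.
estimates = [1.0, 1.2, 0.8, 1.1, 0.9, 1.0, 1.2, 0.8, 1.1, 0.9]
variances = [0.04] * 10

pooled_estimate, pooled_se, pooled_df = pool_rubin(estimates, variances)
efficiency_m10 = relative_efficiency(0.27, 10)
efficiency_m3 = relative_efficiency(0.27, 3)
print(pooled_estimate, pooled_se, pooled_df, efficiency_m10, efficiency_m3)
```

The pooled standard error is larger than any single within-imputation standard error because it also carries the between-imputation variability. Under the stated assumption, ten imputations recover roughly 97% of the efficiency of an infinite number, which is consistent with the choice of ten datasets to limit Monte Carlo variability.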

RESULTS

Of the 1810 participants with missing cardiac MRI data, data were missing due to ineligibility (28%; usually because of metallic fragment, implant or device), inability (55%; usually because of claustrophobia), refusal (12%), mechanical problem with the scanner (2%) or unknown (4%) (1).

The distribution of risk factors suggested that the participants who did not complete a cardiac MRI scan were at higher cardiovascular risk (Table 1). The results for the mean of the cardiac MRI variables were similar between the complete case and multiple imputation approaches (Table 2) with less than a 10% difference between the approaches for almost all estimates. This similarity continued even when the number of variables used in the imputation was dramatically increased. Small differences were seen in LV mass in men when the results were pooled by sex and no longer considered effect modification by sex in the imputation.

TABLE 1
Estimates of cardiovascular risk factors based on cardiac magnetic resonance imaging (MRI): Results from the Multi-Ethnic Study of Atherosclerosis (MESA) from 2000 to 2002

TABLE 2
Means for cardiac magnetic resonance imaging variables for men and women at baseline in the Multi-Ethnic Study of Atherosclerosis (MESA) from 2000 to 2002

The effect of imputation on the predictors of LV geometry and function was small (Tables 3 and 4). There was a small increase in the amount of variance explained by the model for four of the five measures of LV geometry and function. Of the 53 predictors tested across the five outcome variables, only two changes in inference occurred when missing data were handled using multiple imputation (diastolic blood pressure was no longer significantly associated with LV mass and systolic blood pressure squared became a significant predictor of LV ejection fraction). Both changes in inference were in marginal predictors and the point estimates were similar (Tables 3 and 4).

TABLE 3
Multivariable analysis of traditional cardiovascular risk factors in relation to left ventricular (LV) mass and volume

TABLE 4
Multivariable analysis of traditional cardiovascular risk factors in relation to left ventricular (LV) ejection fraction and cardiac output

DISCUSSION

Multiple imputation is the gold standard approach for handling missing data (5) but it involves dealing with additional analytical complexity. A complete case analysis can be more transparent and has the advantage of relying solely on the data actually observed. In the present example, an analysis using multiple imputation did not change any of the inference around the predictors of LV mass, meaning that the main goal of the previous analysis could be met with a complete case approach (1). It is important to note that the similarity of the estimates was preserved even when the number of variables used in the imputation model was greatly increased compared with the variables used in the statistical models, and different types of variables (eg, psychological) were added.

These findings are important because the covariates in Table 1 suggest that there is a relationship between cardiovascular health and missing cardiac MRI data. If having missing data is also related to LV geometry and function, this may have introduced confounding between the predictors and the outcome. However, as shown in Table 2, the imputed values are quite similar to the measured values and this is verified by the very small changes in the estimates seen in Tables 3 and 4. Verifying this, however, is a key element in confirming the validity of the complete case approach for cardiac MRI data (1).

The results of the present study are similar to those found in the Cardiovascular Health Study (CHS) (15) in which LV mass (measured using electrocardiography) was missing for 35% of participants at baseline. Consistent with the CHS data, the associations in the present study were preserved whether using complete case analysis or multiple imputation as an approach for handling missing data. This lack of difference between approaches is supported by other studies that have found particular substantive examples in which multiple imputation provided similar results to less sophisticated approaches (16–18).

However, it is critical to remember that the results of multiple imputation tend to be less biased than less sophisticated approaches to handling missing data (3–7). Therefore, when the complete case approach and multiple imputation show differences, the estimates from the imputed datasets are likely to be closer to the true values for the parameters (4). Usually, even when the missing data are not random, the estimates obtained from multiple imputation, while still biased, are less biased than other approaches (5). This suggests that the imputed estimates should be used when an important difference is observed and particular substantive areas should be checked to determine whether the approach for handling missing data may have important effects on the results.

The use of psychological scales in the imputation indirectly tested the hypothesis that psychological reasons for refusing a cardiac MRI could be a candidate for a hidden common cause of both missing data and LV geometry or function. However, the inclusion of these additional variables did not result in any advantage in the imputation because it yielded the same results as the imputation based solely on the sociodemographic and cardiovascular risk factors included directly in the model.

Limitations

The present study has a number of important limitations. First, these results assume that there is no unknown or unmeasured variable that predicts both missing data and the outcome. While the estimates were robust to the inclusion of many additional candidate variables, it is always possible that none of these variables can serve as a proxy for this unknown variable. However, this is a common assumption of all missing data techniques (13).

Second, all members of the MESA cohort either had subclinical CVD or were disease-free at baseline. It is not clear whether these results can be generalized to populations with more severe CVD. In particular, the participants who were unable to undergo an MRI due to devices (1) may have been at greater risk for the outcome if the device was implanted to treat a clinically important manifestation of CVD.

Finally, it is possible that further differences could emerge in a longitudinal setting as opposed to the cross-sectional setting of the present study. Participants may withdraw from the study for reasons related to refusal to undergo a cardiac MRI; this is especially possible because low-compliers may be more likely to refuse to undergo a procedure and are also more likely to suffer adverse outcomes (19). We may also have observed different results if we had used different exposures to predict the results of the cardiac MRI because there may have been information in other variables that could have improved these estimates. However, attempts to further refine the variables used in the imputation in the present study (such as using finer categories for smoking and education level) did not result in detectable differences in the estimates.

CONCLUSION

The present study provides important confirmatory evidence that the inferences derived from the results of cardiac MRI data are broadly correct when missing data are handled using a complete case approach given the exposure variables considered.

Footnotes

FUNDING: This research was supported by contracts N01-HC-95159 through N01-HC-95165 and N01-HC-95169 from the National Heart, Lung, and Blood Institute. The authors thank the other investigators, the staff and the participants of the MESA study for their valuable contributions. A full list of participating MESA investigators and institutions can be found at <www.mesa-nhlbi.org>.

REFERENCES

1. Heckbert SR, Post W, Pearson GD, et al. Traditional cardiovascular risk factors in relation to left ventricular mass, volume, and systolic function by cardiac magnetic resonance imaging: The Multiethnic Study of Atherosclerosis. J Am Coll Cardiol. 2006;48:2285–92.
2. Bild DE, Bluemke DA, Burke GL, et al. Multi-Ethnic Study of Atherosclerosis: Objectives and design. Am J Epidemiol. 2002;156:871–81.
3. Barzi F, Woodward M. Imputations of missing values in practice: Results from imputations of serum cholesterol in 28 cohort studies. Am J Epidemiol. 2004;160:34–45.
4. Greenland S, Finkle WD. A critical look at methods for handling missing covariates in epidemiologic regression analyses. Am J Epidemiol. 1995;142:1255–64.
5. Schafer JL, Graham JW. Missing data: Our view of the state of the art. Psychol Methods. 2002;7:147–77.
6. van der Heijden GJ, Donders AR, Stijnen T, Moons KG. Imputation of missing values is superior to complete case analysis and the missing-indicator method in multivariable diagnostic research: A clinical example. J Clin Epidemiol. 2006;59:1102–9.
7. Rubin DB. Multiple imputation after 18+ years. J Am Stat Assoc. 1996;91:473–89.
8. Natori S, Lai S, Finn JP, et al. Cardiovascular function in multi-ethnic study of atherosclerosis: Normal values by age, sex, and ethnicity. AJR Am J Roentgenol. 2006;186(6 Suppl 2):S357–65.
9. Bender R, Lange S. Adjusting for multiple testing – when and how? J Clin Epidemiol. 2001;54:343–9.
10. Meng XL. Multiple-imputation inferences with uncongenial sources of input (with discussion). Statist Sci. 1994;10:538–73.
11. Spielberger CD, Johnson EH, Russell SF, Crane RJ, Jacobs GA, Worden TJ. The experience and expression of anger: Construction and validation of an anger expression scale. In: Chesney MA, Rosenman RH, editors. Anger and Hostility in Cardiovascular and Behavioral Medicine. Washington: Hemisphere; 1985.
12. Spielberger CD. Manual for the State-Trait Anxiety Inventory. Palo Alto: Consulting Psychologists Press; 1983.
13. Moons KG, Donders RA, Stijnen T, Harrell FE Jr. Using the outcome for imputation of missing predictor values was preferred. J Clin Epidemiol. 2006;59:1092–101.
14. Schafer JL. Multiple imputation: A primer. Stat Methods Med Res. 1999;8:3–15.
15. Arnold AM, Kronmal RA. Multiple imputation of baseline data in the Cardiovascular Health Study. Am J Epidemiol. 2003;157:74–84.
16. Engels JM, Diehr P. Imputation of missing longitudinal data: A comparison of methods. J Clin Epidemiol. 2003;56:968–76.
17. Bono C, Ried LD, Kimberlin C, Vogel B. Missing data on the Center for Epidemiologic Studies Depression Scale: A comparison of 4 imputation techniques. Res Social Adm Pharm. 2007;3:1–27.
18. Kang T, Kraft P, Gauderman WJ, Thomas D; Framingham Heart Study. Multiple imputation methods for longitudinal blood pressure measurements from the Framingham Heart Study. BMC Genet. 2003;4(Suppl 1):S43.
19. Horwitz RI, Horwitz SM. Adherence to treatment and health outcomes. Arch Intern Med. 1993;153:1863–8.