Within-person variability in measured values of multiple risk factors can bias their associations with disease. The multivariate regression calibration approach can correct for such measurement error and has been applied to studies in which true values or independent repeat measurements of the risk factors are observed on a subsample. We extend the multivariate regression calibration techniques to a meta-analysis framework where multiple studies provide independent repeat measurements and information on disease outcome. We consider the cases where some or all studies have repeat measurements, and compare study-specific, averaged and empirical Bayes estimates of regression calibration parameters. Additionally we allow for binary covariates (e.g. smoking status) and for uncertainty and time trends in the measurement error corrections. Our methods are illustrated using a subset of individual participant data from prospective long-term studies in the Fibrinogen Studies Collaboration to assess the relationship between usual levels of plasma fibrinogen and the risk of coronary heart disease, allowing for measurement error in plasma fibrinogen and several confounders.
Many epidemiological studies aim to estimate the association between potential risk factors and the likelihood of disease. Because risk factors are usually measured with error and fluctuate within individuals, analyses that use only single measurements of the risk factors produce biased estimates of any aetiological association between average (or “usual”) risk factor levels and disease. This bias is caused by any or all of (i) measurement error, (ii) short-term within-person variation and (iii) longer-term within-person variation (e.g. due to behaviour changes). We assume there exist “usual levels” of the risk factors which represent the true exposure of interest, and we describe any difference between measured levels and usual levels as “measurement error”. Measurement error in exposures leads to underestimation of exposure-disease associations (regression dilution bias) [2,3,4], while measurement error in confounders leads to residual confounding.
Various methods have been proposed to estimate the effect of measurement error in multiple covariates and to correct the disease associations estimated from single observations of the risk factors [1,6,7]. For continuous risk factors, the true regression coefficients in the disease model may be estimated by multiplying the vector of observed regression coefficients by the inverse of a correction matrix. The correction matrix comprises coefficients from the linear regression of the true risk factors on the observed values: we call this the “regression calibration” model. A second approach replaces the observed risk factors in the disease model with conditional expectations of the true values given the observed values, which are predicted from the regression calibration model. The regression calibration model may be estimated directly if true and observed measures are available for some subjects, or indirectly if repeat observations are available for some or all subjects. In the latter case, the regression calibration model is the regression of an unbiased repeat measurement on the first measurement.
The Correction Matrix and Conditional Expectation approaches are mathematically equivalent in simple cases and have underlying assumptions that (i) the errors in different repeat measurements are independent of each other and of the true value and (ii) knowledge of usual levels completely captures the risk of disease associated with the risk factors (rather than change or spikes in risk factors). Under these assumptions, regression calibration exactly corrects for measurement error when the disease model is a linear regression or a Poisson regression; however, in general, it is an approximate method [10,11].
Whereas regression calibration approaches have been applied in the literature for the analysis of single studies, less work has been done in the context of individual participant data (IPD) meta-analysis of multiple studies. Earlier work [2,12,13] has focused on methods for correcting for univariate measurement error in meta-analyses. The Prospective Studies Collaboration estimated a usual risk factor level from a non-linear time-dependent regression calibration model that pooled all studies. The Fibrinogen Studies Collaboration (FSC) estimated study- and time-specific regression dilution ratios (RDRs) which were then combined allowing for within and between study heterogeneity. Both ignored error in confounders. Here, we extend the multivariate regression calibration technique to a meta-analysis framework. We consider the cases where either some or all studies have repeat measurements. First we define and compare methods using study-specific, averaged and empirical Bayes estimates of regression calibration parameters. Second, we take account of the imprecision in the measurement error corrections by adapting methods used in multiple imputation. Third, we allow for measurement error time trends by estimating time-dependent usual levels of the risk factors which can be entered into a time-dependent disease model. We also discuss the practicalities of using Correction Matrix versus Conditional Expectation approaches.
Our motivation is to provide appropriate methods for the analysis of IPD in collections of multiple epidemiological studies. By comparison with analysis of single studies, combination of data from several studies should yield greater precision and comprehensiveness. With the increasing number of such IPD collations (e.g. the Prospective Studies Collaboration, the Asia Pacific Cohort Studies Collaboration, the FSC, and the million-participant Emerging Risk Factors Collaboration) and of purpose-designed multi-centre prospective observational studies (e.g. the 22-centre, 520,000-person European EPIC study), it is important to develop appropriate biostatistical methods for correcting for multivariate measurement errors in IPD meta-analyses.
We illustrate our proposals on a subset of individual participant data from prospective long-term studies in the FSC, described in section 2. In section 3 we introduce our model and then discuss various practicalities in section 4. Results from the FSC data are presented in section 5. We give a discussion in section 6 and conclusions in section 7.
The Fibrinogen Studies Collaboration is a meta-analysis of individual data on 154 211 adults from 31 prospective studies with information on plasma fibrinogen and major disease outcomes. The FSC has previously reported moderately strong associations between plasma fibrinogen values and the risks of major vascular and non-vascular chronic disease outcomes, although the causal relevance of these associations remains uncertain. As part of the FSC data collection, information was provided on repeat measures (values recorded after the initial baseline examination) of plasma fibrinogen and other measures from subsets of participants from 15 of the studies at various re-measurement times.
For illustration purposes, we will explore the association of fibrinogen, systolic blood pressure and smoking with coronary heart disease (CHD) in five representative studies in the FSC which had repeat measures available on each of these risk factors. To remain consistent with previous FSC reports, baseline fibrinogen levels above 5.62 g/L (the highest 1% of values) are excluded due to the potential distortions arising from assay imprecision and from acute-phase reactions, although our proposed methods would be able to address such extreme values. Individuals with a known history of previous coronary heart disease or stroke at baseline, and the few participants aged below 20 years at baseline, are also excluded.
The disease models are based on Cox proportional hazards models. If we ignored measurement error, we would model the hazard for the ith individual in the sth study as
$$h_{si}(t) = h_{0s}(t)\exp\left((\beta_s^{un})^{T} W_{si}\right) \qquad (1)$$
where s=1,…,S, i=1,…,Ns, and Wsi = (Wsi1, Wsi2,…, WsiP) is a P-vector of observed risk factors. The superscript in βun indicates that no adjustment for within-person variability has been made.
In order to adjust for measurement error, we might assume that the risk factors have “usual levels” Xsi = (Xsi1, Xsi2,…, XsiP). The Cox model then becomes
$$h_{si}(t) = h_{0s}(t)\exp\left(\beta_s^{T} X_{si}\right) \qquad (2)$$
Alternatively, if the assumption of a constant “usual level” over the life-course is not tenable, we might prefer to define time-dependent “usual levels” Xsi(t) and the Cox model
$$h_{si}(t) = h_{0s}(t)\exp\left(\beta_s^{T} X_{si}(t)\right) \qquad (3)$$
A second-level model then links the parameters βs in different studies. Under a “fixed-effect” model, βs = β, the vector of parameters β is of interest, being the adjusted log hazard ratios per unit increase in “usual levels” of the risk factors. This model can be estimated in a single stage. Alternatively, we may consider a “random-effects” model βs ~ MVN(β, Z). Here β is the vector of average log hazard ratios, and the variance-covariance matrix Z represents heterogeneity between studies. This model is most conveniently estimated by a two-stage procedure in which the model is first estimated in each study, yielding a parameter estimate $\hat{\beta}_s$ with estimated variance-covariance matrix $\hat{V}_s$, and these are then combined using standard methods.
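The second stage of such a two-stage procedure can be sketched for a single coefficient. This is a minimal illustration only: it uses the univariate DerSimonian-Laird moment estimator as one example of a "standard method", whereas the model above is multivariate, and the function name `pool_random_effects` and the input values are hypothetical.

```python
import numpy as np

def pool_random_effects(estimates, variances):
    """Univariate DerSimonian-Laird pooling of study-specific estimates."""
    b = np.asarray(estimates, dtype=float)
    v = np.asarray(variances, dtype=float)
    w = 1.0 / v                                   # fixed-effect weights
    fixed = np.sum(w * b) / np.sum(w)
    q = np.sum(w * (b - fixed) ** 2)              # Cochran's Q
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau2 = max(0.0, (q - (len(b) - 1)) / c)       # between-study variance
    w_star = 1.0 / (v + tau2)                     # random-effects weights
    pooled = np.sum(w_star * b) / np.sum(w_star)
    return pooled, 1.0 / np.sum(w_star), tau2

# Hypothetical log hazard ratios and their variances from S = 3 studies.
pooled, pooled_var, tau2 = pool_random_effects(
    [0.40, 0.55, 0.70], [0.010, 0.020, 0.015]
)
```

Setting the between-study variance to zero recovers the fixed-effect estimate, which is why the fixed-effect model can be regarded as a special case.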
In the initial analysis to determine the relationship between fibrinogen and cardiovascular disease, measurement error in plasma fibrinogen was corrected for using a regression dilution ratio of 0.46 estimated from a linear regression of repeat measurements on baseline values of fibrinogen in each study and for each time interval, and pooled allowing for within and between study heterogeneity. The association between plasma fibrinogen and disease was found to be moderately strong, even after adjusting for all measured confounders. However, residual confounding remains if the confounders are also measured with error. The methodological work developed in this paper was motivated by the need to correct for measurement error in multiple confounders using data from multiple studies. This paper focuses on developing the methodology rather than presenting the full applied results.
In this section we develop multivariate regression calibration models which incorporate data from multiple studies. For ease of exposition, we ignore non-error prone confounders, such as age and sex.
We first justify the use of the regression coefficients of the repeat observations on the baseline observations. If model (2) is correct then approximately unbiased estimators of βs may be obtained from a Cox regression using as covariates the conditional expectations of Xsi given the observed baseline value Wsi0 : this is because
The approximation above derives from a Taylor series, valid for small cumulative incidence. The first exact equality above (expressing non-differential error) depends on the correctness of model (2), since it assumes that the observation Wsi0 would add nothing to the model if Xsi were known. Model misspecification could invalidate the corrections we propose: that is, the corrected estimates could differ systematically from those obtained if we could fit model (2) directly to data on Xsi.
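The argument can be sketched as follows, in the form of the standard regression calibration approximation for the Cox model. The notation is an assumption made for illustration: $T_{si}$ denotes the event time and $h_{0s}(t)$ the baseline hazard in study s.

```latex
\begin{align*}
h_{si}(t \mid W_{si0})
  &= E\left[\, h_{si}(t \mid X_{si}) \,\middle|\, W_{si0},\, T_{si} \ge t \,\right] \\
  &= h_{0s}(t)\, E\left[\, \exp\left(\beta_s^{T} X_{si}\right) \,\middle|\, W_{si0},\, T_{si} \ge t \,\right] \\
  &\approx h_{0s}(t)\, \exp\left( \beta_s^{T} E\left[ X_{si} \mid W_{si0} \right] \right)
\end{align*}
```

The first line uses non-differential error, the second substitutes model (2), and the last step drops the conditioning on survival and replaces the expectation of the exponential by the exponential of the expectation, both of which are reasonable when cumulative incidence is small.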
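Similarly, if model (3) is correct then
$$h_{si}(t \mid W_{si0}) \approx h_{0s}(t)\exp\left(\beta_s^{T} E\left[X_{si}(t) \mid W_{si0}\right]\right) \qquad (5)$$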
so the covariates in the Cox regression must be the conditional expectations of Xsi (t) given Wsi0.
In order to estimate E[Xsi|Wsi0] or E[Xsi(t)|Wsi0], we make the further assumption that any repeat measurements Wsir are unbiased measures of the true underlying value, in the sense that E[Wsir|Wsi0, Xsi] = Xsi, so that E[Xsi|Wsi0] = E[Wsir|Wsi0]. We therefore focus on estimation of the regression calibration model E[Wsir|Wsi0]. We start by assuming model (2) and describe modifications for model (3) in section 4.3.
We begin with a single study with N individuals, with each individual i providing baseline measurements on P continuous risk factors, denoted by Wi0p for i=1,…, N and p=1, …, P. We assume n ≤ N individuals have exactly one repeat measure on all risk factors measured at the same time point, denoted by Wi1p for i=1, …, n and p=1, …, P. A linear multivariate regression calibration (RC) model can be written as:
$$W_{i1p} = a_p + \sum_{q=1}^{P} b_{pq} W_{i0q} + e_{ip}, \qquad p = 1, \ldots, P \qquad (6)$$
where (ei1, …, eiP)T ~ MVN(0, Σ).
Note that repeat measurements are regressed on the baseline measurements rather than vice-versa. Formulating the model in this manner is crucial for its applicability to individuals with only baseline measurements. Non-error prone confounders, such as baseline sex and age, can easily be incorporated in equation (6). For example, if Agei0 denotes baseline age for individual i, the term cpAgei0 can be added to the right-hand side of equation (6).
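As an illustration of why this direction of regression matters, the sketch below fits a single-study RC model on the subsample with repeats and then predicts usual levels for everyone with a baseline measurement. All data are simulated, ordinary least squares is used equation by equation, and the variable names (`W0`, `W1`, `B_hat`, `X_pred`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
N, n, P = 5000, 1500, 2          # N at baseline, n <= N with one repeat

# Simulated usual levels and error-prone measurements of P = 2 risk factors.
X = rng.multivariate_normal([5.0, 3.0], [[1.0, 0.3], [0.3, 1.0]], size=N)
W0 = X + rng.normal(scale=0.7, size=(N, P))        # baseline, everyone
W1 = X[:n] + rng.normal(scale=0.7, size=(n, P))    # repeat, subsample only

# Fit the RC model on the subsample: each repeat measure on all baseline measures.
design = np.column_stack([np.ones(n), W0[:n]])
coef, *_ = np.linalg.lstsq(design, W1, rcond=None)
a_hat = coef[0]                  # intercepts, one per risk factor
B_hat = coef[1:].T               # B_hat[p, q]: coefficient of baseline q in model for p

# Predicted usual levels for all N individuals, including those without repeats.
X_pred = a_hat + W0 @ B_hat.T

# Diagonal entries are the adjusted regression dilution ratios.
adjusted_rdr = np.diag(B_hat)
```

Because the fitted model conditions only on baseline values, the prediction step needs nothing beyond what every participant supplies.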
We now extend the model to allow for S studies, with study s having Ns (s=1, …, S) individuals providing baseline measures and ns individuals providing a single repeat measurement on P risk factors. Let the baseline measure for the pth risk factor (p=1, …, P) from the ith individual in study s be denoted by Wsi0p (i=1, …, Ns) and similarly let the repeat measurement be denoted by Wsi1p (i=1, …, ns). The multivariate RC model is:
$$W_{si1p} = a_{sp} + \sum_{q=1}^{P} b_{spq} W_{si0q} + e_{sip}, \qquad p = 1, \ldots, P \qquad (7)$$
where (esi1, …, esiP)T ~ MVN (0, Σ) and Σ is a (P×P) variance-covariance matrix. Variations on the covariance structure Σ are discussed in section 3.4.
We finally extend to the case when data are available from S studies with study s having Ns individuals providing baseline measurements and ns individuals providing up to Rs repeat measures on P risk factors. For simplicity we begin by assuming the error in the risk factors is time-independent, but this is relaxed in section 4.3. Let the baseline and the rth repeat measurement for the pth risk factor in study s be denoted by Wsi0p (i=1, …,Ns) and Wsirp for (i =1, …, ns, r =1,…, Rs) respectively. A linear multivariate RC model is:
where (esir1, …,esirP)T~MVN (0, Σ) and (zsi1,…,zsiP)T~MVN (0, Φ) for all r =1, …, Rs, and Σ and Φ are (P×P) variance-covariance matrices denoting residual error and individual-specific variation respectively. The latter accounts for the hierarchical structure at the individual level by allowing an individual-specific random intercept (e.g. some individuals may have consistently higher or lower repeat measurement levels of certain risk factors). Note the RC model (7) is a special case of model (8) with Rs=1. As before, non-error prone confounders can be easily incorporated in equations (7) and (8), for example, by adding the term cpAgesi0 or cspAgesi0 to the right-hand side. Variations on model (8) are discussed in section 3.4.
Usually only a subset of studies has data available on repeat measurements, as in the FSC, and it is necessary to find ways to transfer the information to the other studies. When considerable heterogeneity exists, averaged regression coefficients may not provide an appropriate correction (although they are likely to be far better than no correction), and any factors associated with the heterogeneity should be considered. The empirical Bayes regression coefficients reflect the heterogeneity in studies with repeat measures and use the averaged regression coefficients for studies without repeat measures.
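The way empirical Bayes coefficients borrow strength can be sketched for a single coefficient. The shrinkage formula below is the usual normal-normal form and is offered as an assumption about the univariate case, not as the full multivariate estimator; the function name and inputs are hypothetical, with the averaged coefficient `mean` and between-study variance `tau2` taken as given (for example from a fitted random-effects model).

```python
import numpy as np

def empirical_bayes_rdr(estimates, variances, mean, tau2):
    """Shrink study-specific coefficients toward the averaged coefficient.

    Studies without repeat measures are passed as NaN and receive `mean`.
    """
    b = np.asarray(estimates, dtype=float)
    v = np.asarray(variances, dtype=float)
    shrink = tau2 / (tau2 + v)           # weight given to the study's own estimate
    eb = mean + shrink * (b - mean)
    return np.where(np.isnan(b), mean, eb)

# Two studies with repeats and one without (hypothetical values).
eb = empirical_bayes_rdr(
    [0.40, 0.58, np.nan], [0.010, 0.002, np.nan], mean=0.50, tau2=0.010
)
```

Precisely estimated study-specific coefficients are left almost unchanged, imprecise ones are pulled toward the average, and studies with no repeat data fall back on the average entirely.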
Models involving (9) and/or (10) are highly multivariate and computationally complex. It may be appropriate to replace the variance-covariance matrices Σ and Φ with diagonal variance matrices, equivalent to P independent univariate models. Such models will produce correct regression coefficients for balanced data (where all individuals provide a baseline and repeat measure on all P risk factors) but not for unbalanced data (where repeat measures for some risk factors may be missing either by design or not), although differences can often be minimal in practice. Alternatively, if sufficient repeat measures are available in all studies, then Σ and Φ may be replaced by study-specific matrices Σs and Φs, reducing models (7) and (8) to S independent multivariate models. If both simplifications are appropriate, the models can be reduced to P×S independent univariate models.
Usually, the regression coefficient of repeat measure Wsirp on its corresponding baseline measure Wsi0p (represented by bspp in equation (9)) is the strongest and most important, whereas the regression coefficients of Wsirp on other baseline measures Wsi0q (represented by bspq in equation (9) where q≠p) are relatively weak and of less importance. As a result, study-specific random effects uspq or repeat-specific random effects vsrpq for q≠p can have very small variances and may lead to estimation problems, especially for large P. In such circumstances, we recommend only incorporating random effects on the regression coefficients bspp. This simplification leads to a replacement of equations (9) and (10) by the respective models
where uspp~MVN (0, ψ) and vsrpp~MVN (0, Ω), and where ψ and Ω are (P×P) variance-covariance matrices. Under these simplifications, our most general model can be defined by the equation:
Here we describe our proposed procedures for estimating study-specific, averaged or empirical Bayes regression coefficients from the models described in Section 3.2 (and the simplifications in Section 3.4). To obtain study-specific regression coefficients from multiple studies with a single repeat measure, we use maximum likelihood estimation to fit either (i) P×S independent univariate models defined by equation (7) or (ii) a single multivariate model defined by equation (7). To obtain averaged or empirical Bayes regression coefficients, we use restricted maximum likelihood estimation to fit either (i) P independent univariate models defined by equations (7) and (12) or (ii) a single multivariate model defined by equations (7) and (12).
To obtain study-specific, averaged or full empirical Bayes coefficients from multiple studies with multiple repeat measures, we would ideally fit the models defined by equation (13) in a one-stage estimation approach using restricted maximum likelihood. However, this model is a cross-classified multivariate model, inducing computational difficulties. Thus, we propose two further different simplifications of the model. The first is to simplify (13) to P cross-classified univariate models, estimated by restricted maximum likelihood. Study-specific regression coefficients can be obtained from P×S cross-classified independent univariate models defined by equations (8) and (12). Averaged or full empirical Bayes coefficients can be obtained from P cross-classified univariate models defined by equations (8), (11) and (12). The second simplification reduces the cross-classified model (13) to hierarchical univariate models or a hierarchical multivariate model by excluding either the within-study random effect term vsrpp or the within-subject random effect term zsip from equation (13), then estimable by restricted maximum likelihood. To obtain study-specific coefficients we propose fitting either (i) P×S independent univariate models or (ii) S multivariate models. To obtain averaged or empirical Bayes coefficients we propose fitting either (i) P independent univariate models or (ii) a single multivariate model.
We note that (restricted) maximum likelihood estimation provides valid estimates under the missing at random assumption. This is an important assumption, as generally individuals will not have complete information at all re-measurement times for all risk factors (either by design or not).
Hierarchical univariate models were fitted in STATA version 9.2 and checked using MLwiN, and cross-classified univariate models and hierarchical multivariate models were fitted in MLwiN. An example of the STATA code to fit the hierarchical univariate models is shown in the appendix.
In section 3 we built a RC model to incorporate data from multiple studies with multiple repeat measures of multiple error-prone risk factors. Study-specific, averaged or empirical Bayes coefficients could be extracted from the model and used to create correction matrices or conditional expectations. In this section, we define Correction Matrices and Conditional Expectations and discuss their practicalities and appropriateness. We also discuss extensions of the RC model to allow for time trends in measurement error corrections and for binary confounders. Finally, we propose a method to allow for the uncertainty from the RC model in the estimates of the risk associations of interest.
Two mathematically equivalent procedures use the regression coefficients from the RC model to estimate the vector of corrected risk associations and its variance-covariance matrix. The first procedure utilizes a Correction Matrix defined for a single study by:
where $\hat{b}_{pq}$ (p, q=1, …, P) are the estimated coefficients from, say, equation (6). Let $\hat{\beta}^{un}$ represent the vector of estimated uncorrected risk associations (e.g. log hazard ratios) with estimated variance-covariance matrix $\hat{V}^{un} = \mathrm{var}(\hat{\beta}^{un})$. Corrected risk associations and the corresponding variance-covariance matrix are estimated as $\hat{\beta} = CM^{-1}\hat{\beta}^{un}$ and $\hat{V} = CM^{-1}\hat{V}^{un}CM^{-T}$ respectively. The Correction Matrix is an extension of the “regression dilution ratio” (RDR) used for single error-prone risk factors; the diagonal component $\hat{b}_{pp}$ represents the estimated adjusted RDR for the pth risk factor (adjusted for all other risk factors included in the RC model). Values close to one imply small levels of measurement error and values closer to zero imply greater levels of measurement error.
The second procedure replaces observed risk factors in the disease model with the linear predictors from the RC model. These linear predictors are conditional expectations of the usual risk factor, for example using estimated regression coefficients from equation (6):
The associations between E[Xip|Wi0] and risk of disease directly provide the corrected estimates and their variance-covariance matrix (see section 3.1).
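The equivalence of the two procedures can be checked numerically in the simplest setting. The sketch below uses simulated data and a linear outcome model fitted by least squares, where the correction is exact, in place of a Cox model. It assumes the Correction Matrix is oriented so that the uncorrected coefficients equal CM times the corrected ones, i.e. CM is the transpose of the matrix whose (p, q) entry is the RC coefficient of baseline q in the model for repeat p. All names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n, P = 2000, 2

# Simulated usual levels, two error-prone measurements, and a linear outcome.
X = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.4], [0.4, 1.0]], size=n)
W0 = X + rng.normal(scale=0.8, size=(n, P))    # baseline
W1 = X + rng.normal(scale=0.8, size=(n, P))    # repeat
y = 1.0 + X @ np.array([0.5, -0.3]) + rng.normal(scale=0.5, size=n)

def ols(design, response):
    """Least-squares coefficients; the first row is the intercept."""
    D = np.column_stack([np.ones(len(design)), design])
    coef, *_ = np.linalg.lstsq(D, response, rcond=None)
    return coef

# RC model: regress each repeat measure on all baseline measures.
rc = ols(W0, W1)
a_hat, B_hat = rc[0], rc[1:].T                 # B_hat[p, q] = b_pq

# Procedure 1: Correction Matrix applied to the uncorrected coefficients.
beta_un = ols(W0, y)[1:]
CM = B_hat.T                                   # assumed orientation (see lead-in)
beta_cm = np.linalg.solve(CM, beta_un)

# Procedure 2: Conditional Expectations replace W0 in the outcome model.
X_hat = a_hat + W0 @ B_hat.T
beta_ce = ols(X_hat, y)[1:]
```

In this linear case `beta_cm` and `beta_ce` agree to numerical precision, and both are larger in magnitude than the attenuated `beta_un`.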
Correction Matrices and Conditional Expectations can be created using the study-specific, averaged or empirical Bayes coefficients defined in section 3.3. Using study-specific and empirical Bayes Correction Matrices will always require a two-stage estimation approach (e.g. the first stage corrects the observed study-specific risk associations using the study-specific Correction Matrix, and the second stage combines the corrected study-specific risk associations using meta-analysis techniques). Empirical Bayes Correction Matrices which incorporate between-person heterogeneity have no practical use: instead, we propose the use of empirical Bayes Conditional Expectations. Study-specific Conditional Expectations, averaged Correction Matrices/Conditional Expectations and empirical Bayes Conditional Expectations can be used in both single and two-stage estimation approaches (see section 2.2).
It is likely that repeat measures of risk factors are measured at different times for different studies and possibly different individuals and risk factors. Our models are generalisable to such data. However, as the time separation between baseline and repeat measures increases, the strength of the relationship may decrease, resulting in the need to use time-dependent corrections [2]. This can be investigated by including a time interaction with the baseline risk factor Wsi0p in the RC model for Wsirp (for r =1, …,Rs). One may also wish to consider time interactions with the other baseline risk factors, although such relationships may be of less importance.
Let tsir denote the time at which the rth repeat measure was made for individual i in study s. We propose the time-dependent regression calibration model defined by:
Note that this model is incompatible with model (2) and requires the “current usual level” model (3). It may also be appropriate to consider incorporating between-study random effects on the interaction term.
Time-dependent Correction Matrices or Conditional Expectations can simply be extracted as previously described for various blocks of follow-up time (e.g. 0–5 years, 5–10 years, 10–15 years etc.)[2,8]. It is then possible either to apply the time-dependent Correction Matrices to the corresponding risk associations estimated in each of these time blocks, or to estimate the risk associations in each time block directly using the corresponding Conditional Expectations. However, time-dependent Conditional Expectations can be used more flexibly, and time need not be categorised into such blocks. Usual levels of the error-prone risk factors at time t can be estimated by replacing tsir by t in (15) and entered into the Cox model (3).
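A continuous-time version of this idea can be sketched for one risk factor in one study. The model form used here (intercept, baseline value, time, and a time-by-baseline interaction, fitted by least squares) is an assumption for illustration, as are the simulated data and all names; random effects are omitted.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4000

# Simulated, centred baseline values and re-measurement times in years.
w0 = rng.normal(size=n)
t = rng.uniform(1.0, 10.0, size=n)

# Simulation truth (assumed): RDR of 0.60 at time 0, falling by 0.025 per year.
w_rep = (0.60 - 0.025 * t) * w0 + rng.normal(scale=0.8, size=n)

# RC model with a time interaction on the baseline value.
design = np.column_stack([np.ones(n), w0, t, t * w0])
(a_hat, b_hat, c_hat, d_hat), *_ = np.linalg.lstsq(design, w_rep, rcond=None)

def usual_level(w0_value, time):
    """Estimated current usual level at `time` given a baseline value."""
    return a_hat + c_hat * time + (b_hat + d_hat * time) * w0_value

# Time-specific RDRs implied by the fitted interaction.
rdr_at_1 = b_hat + d_hat * 1.0
rdr_at_10 = b_hat + d_hat * 10.0
```

Evaluating `usual_level` at each participant's current follow-up time gives a time-varying covariate that could be entered into a time-dependent disease model without categorising follow-up into blocks.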
Our RC models have been built on a normality assumption for continuous variables. Measurement error can also exist in binary variables (such as smoking status), but the problem is that errors are usually correlated with the true value, invalidating one of the underlying assumptions of regression calibration (e.g. if the true value of a binary variable is 0 then the error is 0 or 1, while if the true value is 1, then the error is 0 or −1). However, correction for measurement error in binary confounders (whose coefficients are not of interest) need not address this problem, under the further assumption that the true exposure of interest (e.g. in our case, usual plasma fibrinogen) is uncorrelated with the error in the binary confounder.
For our illustration, binary smoking status is used in the RC models as if it were continuous and treated as an error-prone binary confounder. In a single study, such corrections can give valid risk associations in the disease model for the exposures, but over-correct the risk associations for the binary confounder (although a further correction factor can be applied). We also investigated using Conditional Expectations estimated from a univariate logistic RC model; this produced similar results and is not reported here.
It is uncommon in practice to allow for the uncertainty in the RC model in the corrected risk associations. Suggested approaches have focused on using bootstrapping, which is fairly computer intensive. We propose that instead of extracting a best estimate of study-specific, averaged, or empirical Bayes regression coefficients, one draws a set of M plausible regression coefficients which add random noise to the estimated regression coefficients and incorporate the residual and appropriate random effects error. The set of regression coefficients can be used to provide a set of Correction Matrices or Conditional Expectations. For each set, corrected risk associations, say Qk, and corresponding variances, say Wk (for k=1, …,M), are obtained using standard procedures. We regard this as a form of multiple imputation, although in our case the “missing data” are the true regression coefficients, not the values of the true exposure variables. The risk associations are then pooled using rules derived by Rubin:
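$$\bar{Q} = \frac{1}{M}\sum_{k=1}^{M} Q_k, \qquad T = \bar{W} + \left(1 + \frac{1}{M}\right) B,$$
where the “within-imputations variance” is $\bar{W} = \frac{1}{M}\sum_{k=1}^{M} W_k$ and the “between-imputations variance” is $B = \frac{1}{M-1}\sum_{k=1}^{M} (Q_k - \bar{Q})(Q_k - \bar{Q})^{T}$.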
The between-imputations variance inflates the variance of the risk associations to account for the uncertainty in the regression calibration model. In our application we use M=5.
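The pooling step can be sketched for a single risk association. This is a scalar illustration of Rubin's rules; the function name `rubin_pool` and the input values are hypothetical, and in the vector case the variances become matrices.

```python
import numpy as np

def rubin_pool(q, w):
    """Pool M estimates q, with within-draw variances w, using Rubin's rules."""
    q = np.asarray(q, dtype=float)
    w = np.asarray(w, dtype=float)
    m = len(q)
    q_bar = q.mean()                      # pooled risk association
    within = w.mean()                     # within-imputations variance
    between = q.var(ddof=1)               # between-imputations variance
    total = within + (1.0 + 1.0 / m) * between
    return q_bar, total

# Hypothetical corrected log hazard ratios from M = 5 draws of RC coefficients.
q_bar, total_var = rubin_pool(
    [0.62, 0.58, 0.65, 0.60, 0.61],
    [0.010, 0.011, 0.009, 0.010, 0.010],
)
```

When the M draws give nearly identical corrected associations, the between-imputations term is close to zero and the total variance is essentially the within-imputations variance.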
Our analyses are restricted to data from 27 779 individuals from five studies with repeat measurements on fibrinogen, systolic blood pressure and smoking (Table 1). In total 16 529 fibrinogen, 35 629 systolic blood pressure and 33 512 smoking status repeat measures were available from a total of 12 926 individuals at various time intervals spanning roughly 15 years in the 5 studies (Figure 1 and Table 1). Three studies provided multiple repeat measures and for such studies we identified each measurement as belonging to repeat 1, repeat 2, etc. according to cut-off times selected by inspection of Figure 1. Repeat measurements were available from surviving individuals who were not lost to follow-up. Individuals with repeat measures were generally younger, and somewhat more likely to be women and non-smokers than individuals without repeat measures [13].
The analysis involved a random-effects Cox proportional hazards model (see section 2.2), stratified by sex. The median follow-up time was 8 years (inter-quartile range 7–10 years). Adjusted hazard ratios and 95% confidence intervals between coronary heart disease (CHD) and the baseline risk factors, fibrinogen, systolic blood pressure, smoking and age are provided in Table 2 (row 1).
Figure 2 presents the estimated study-specific RC coefficients using the first repeat measures of fibrinogen, systolic blood pressure and smoking (except in the Cardiovascular Health Study, when the third repeat measures of systolic blood pressure and smoking were used to correspond to the first repeat measure of fibrinogen). The meta-analysis combined adjusted RDRs for fibrinogen, systolic blood pressure and smoking status are 0.52 (95% CI: 0.46, 0.59), 0.56 (95% CI: 0.53, 0.58) and 0.76 (95% CI: 0.70, 0.81) respectively, shown as diamonds in the main diagonal of Figure 2. Between-study heterogeneity is assessed in terms of I², the percentage of variance in the estimated regression coefficients from each study that is attributable to between-study variation as opposed to sampling variation. Values of I² close to 0% correspond to lack of heterogeneity. Substantial between-study heterogeneity exists in the adjusted RDRs for fibrinogen and smoking status (Figure 2). In particular, the adjusted RDR for smoking differs greatly between the Prospective CV Munster study and other studies; a random-effects model may be inappropriate here but is fitted for illustrative purposes. Regression coefficients of repeat exposures on baseline confounders are comparatively small, although some heterogeneity between studies exists. These findings support our proposed simplifications to the model described in section 3.4. Similar estimates for the averaged RDRs were obtained from a single multivariate model defined by equation (7).
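For reference, the heterogeneity measure can be computed from study-specific estimates and their variances as sketched below. The sketch assumes the usual definition based on Cochran's Q with inverse-variance weights; the function name and the input values are hypothetical and are not the FSC estimates.

```python
import numpy as np

def i_squared(estimates, variances):
    """I-squared (as a percentage) from Cochran's Q."""
    b = np.asarray(estimates, dtype=float)
    w = 1.0 / np.asarray(variances, dtype=float)
    pooled = np.sum(w * b) / np.sum(w)            # inverse-variance weighted mean
    q = np.sum(w * (b - pooled) ** 2)             # Cochran's Q
    df = len(b) - 1
    return 0.0 if q <= df else 100.0 * (q - df) / q

# Hypothetical study-specific adjusted RDRs and their variances.
i2 = i_squared([0.45, 0.52, 0.60, 0.50, 0.75],
               [0.0010, 0.0020, 0.0015, 0.0010, 0.0020])
```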
Table 2 displays the adjusted hazard ratios for CHD using Conditional Expectations constructed from various RC models and a random-effects Cox proportional hazards disease model. The hazard ratios for fibrinogen, systolic blood pressure and smoking status increase after correcting for measurement error, whereas the hazard ratio for age decreases slightly. The results across the different RC models are similar. Slightly greater hazard ratios for fibrinogen are obtained from the study-specific corrections. Accounting for the uncertainty in the RC model had very little effect on the confidence intervals for the hazard ratios. Results from a fixed-effect Cox proportional hazards disease model were similar (not shown).
Table 3 displays adjusted hazard ratios for CHD using Conditional Expectations constructed from various RC models, using all repeat measures from the five studies. The corrected hazard ratios for fibrinogen, systolic blood pressure and smoking are generally higher than those shown in Table 2, which used only a single repeat, measured within approximately seven years of follow-up, to estimate the Conditional Expectations. This is because the estimated RDRs were generally lower when later repeat information was used due to increased variability in the risk factors over time (see Figure 3). The effect of accounting for uncertainty in these models was not considered as we expected an even smaller effect than that shown in Table 2 due to increased sample sizes.
The cross-classified models produced an average RDR for fibrinogen of 0.48 (95% CI: 0.43, 0.53). The corresponding between-study heterogeneity standard deviations were 0.02 (95% CI: 0, 0.05) and the corresponding within-study heterogeneity standard deviations were 0.07 (95% CI: 0.03, 0.09). Ignoring the within-study heterogeneity gave similar findings. For example, the univariate model produced an averaged RDR for fibrinogen of 0.50 (95% CI: 0.44, 0.55) with corresponding between-study heterogeneity standard deviation of 0.06 (95% CI: 0.01, 0.10). The different models also produced similar RDR results for systolic blood pressure and smoking.
Between-study heterogeneity is reflected in the different hazard ratios produced by study-specific, averaged and empirical Bayes measurement error corrections. A similar pattern exists in the corrected hazard ratios across the three general RC models: hazard ratios corrected using empirical Bayes Conditional Expectations tend to be higher for fibrinogen, but lower for systolic blood pressure and smoking status.
Similar results were observed from univariate RC models which included between-study random effects on bspq terms in equation (8) (i.e. replacing equation (12) with equation (10)), and by replacing the within-subject random effects with within-study random effects (i.e. excluding the wsip term in equation (8) but including the vsrpp term in equation (11)).
Figure 3 suggests declines in the RDRs over time for fibrinogen, systolic blood pressure and smoking status, although most RDRs estimated after 5 years of follow-up were dominated by the Prospective Cardiovascular Münster Study. The averaged RDRs for fibrinogen decreased by 0.025 (95% CI: 0.017, 0.032) per year; from these fitted values, the mean averaged RDR for fibrinogen was 0.60 at 1 year, declining to 0.38 at 10 years. Similarly, the averaged RDRs for systolic blood pressure and smoking decreased by 0.025 (95% CI: 0.019, 0.028) and 0.039 (95% CI: 0.037, 0.041) per year respectively.
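The fitted values quoted for fibrinogen follow from a straight-line trend in the averaged RDR. As a minimal sketch, assuming the trend is linear in years since baseline and anchoring it at the reported value of 0.60 at 1 year (the function name and the anchoring are illustrative assumptions), the RDR at later times can be approximated as follows. Because the inputs are the rounded values quoted in the text, results may differ from the published fitted values in the last digit.

```python
def rdr_at(years: float,
           rdr_at_one_year: float = 0.60,
           decline_per_year: float = 0.025) -> float:
    """Averaged RDR at `years` of follow-up under an assumed linear decline.

    Defaults are the rounded fibrinogen values quoted in the text.
    """
    return rdr_at_one_year - decline_per_year * (years - 1.0)


if __name__ == "__main__":
    for t in (1, 5, 10):
        print(f"year {t:>2}: RDR = {rdr_at(t):.3f}")
```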
Allowing for study-specific time trends in the error corrections (see section 4.3) reduced the hazard ratio for current usual fibrinogen to 1.64 (1.09, 2.48) (Table 4). This inappropriate estimate is the result of an estimated study-specific decline of 0.16 (95% CI: −0.14, 0.46) per year in RDRs for fibrinogen within the Cardiovascular Health Study, calculated from a single repeat measured over a relatively short time period (Figure 1). Using averaged or empirical Bayes measurement error corrections appropriately weights the study-specific declines, providing estimates calculated across the full follow-up period and accounting for between-study heterogeneity. These hazard ratios relate to the usual current values of the risk factors, and are similar to those shown in Table 3. Incorporating between-study random effects on the interaction term produced similar results.
The major complications arising in meta-analyses of observational studies are heterogeneity between studies and measurement error in exposures and confounders. We have addressed these complications together. We have described and compared various approaches for correcting for multivariate measurement error in the setting of IPD meta-analyses. The approaches are illustrated on repeat measures from 12 926 participants in 5 prospective studies from the FSC. The main findings from these data are: (i) there can be substantial between-study heterogeneity in regression dilution ratios (RDRs), as seen by comparing regression calibration (RC) model coefficients, (ii) there is likely to be little advantage of using a multivariate RC model over separate univariate models for each error-prone variable, (iii) RC model uncertainty has only a small impact, and (iv) time trends are likely to exist in the RC models, but subsequently allowing for such trends has relatively little effect on the disease associations. We discuss these findings below in turn.
As shown in our example, there can be considerable heterogeneity in the RDRs between different studies, and ignoring such differences can lead to biased estimates. Such biases may remain even after allowing for between-study heterogeneity in the disease model, so it is appropriate to allow for heterogeneity in the RDR estimates. We allowed for between-study heterogeneity in the RC model by using a random-effects model; alternatives include allowing one study to have a completely different RC model. In our example, incorporating further study-specific random effects on other predictors in the RC models had negligible effect on the corrected hazard ratios and was practically difficult to implement. Allowance for within-study heterogeneity was less important, although this may be of greater relevance for a meta-analysis of studies with larger numbers of repeat measures.
We compared corrections from multivariate and univariate RC models to deal with multiple error-prone risk factors. Whilst we were unable to fit the full proposed cross-classified multivariate model (equation (13)), we did explore various simplifications, all of which produced similar results suggesting a robustness to different model assumptions.
Multi-level multivariate RC models offer added flexibility by allowing different risk factor measurement error variances to be correlated within studies, and allowing within-subject variances to be correlated across risk factors. These are plausible assumptions, as some studies may employ more rigorous methods to reduce measurement errors and other sources of within-person variation, and some individuals may have consistently higher (or lower) levels of correlated risk factors (e.g. higher systolic blood pressure and higher body mass index). However, our example has shown that allowing for this can have little effect on corrected hazard ratios. This may not always be true in practice, especially for highly correlated risk factors (e.g. high density lipoprotein and low density lipoprotein cholesterol), and whilst multi-level univariate RC models are easier to implement in standard software (e.g. SAS, STATA), some checking and comparisons with the multivariate approach may be necessary.
We have shown that accounting for the uncertainty in the estimates from the RC models has minimal effect on the confidence intervals for the corrected hazard ratios. Allowing for the uncertainty was expected to make most difference in the study-specific corrections, because the standard errors from the RC models tended to be larger. However, the uncertainty in the RC models is small in comparison to the uncertainty in the estimated coefficients in the disease model.
Our proposed correction methods assume that disease risk depends either on a “usual level” or on “current usual level” of exposure and confounders. In our example, despite there being significant time trends in the RDRs, such trends had relatively little effect on the estimated hazard ratios. This may be because CHD was a relatively rare outcome (typically 10%) and because the distribution of repeat measures across the study period was similar to that of events. Time trends in the RDRs are also important when considering hazard ratios within subgroups which have different durations of follow-up, such as those formed by age at risk.
In this paper we focused primarily on the additive RC model, which has an underlying assumption about homogeneity of variance with respect to the usual values. It is not uncommon for risk factors to have a measurement error variance that increases with level [13,30], which would lead to a non-linear relationship between repeat measures. Taking log-transformations or including suitable interactions or quadratic terms in the RC model may be required. Regression calibration corrections in linear regression are valid without any assumption of linearity between repeat measures, but similar results hold only approximately for non-linear regression.
Similarly, we have assumed linear associations between exposure (and confounders) and disease, but in some cases the appropriate disease models may include interactions or non-linear terms. If such terms are known, or can be assumed, then conditional expectations can be estimated from appropriate RC models. However, assessing the shape of an exposure-disease relationship in the presence of measurement error is not straightforward. The measurement error can make the relationship appear more linear and standard methods do not allow for this. Further, we have not considered the possibility that disease risk may depend on the past history of the exposure or confounders, rather than their current levels. If, for example, the risk of disease depends on the temporal rate of change in the exposure, then regression calibration corrections are known to typically overcorrect; life course methods would be more appropriate, although they have greater data requirements.
Multivariate measurement error correction, as undertaken in this paper, aims to estimate more closely the aetiological association of risk factors with disease. However, such corrections cannot correct for unmeasured confounders, and the potential for residual confounding will usually remain in practice. An entirely different but complementary approach to estimating aetiological associations is Mendelian randomisation, but this may also have limitations in practice [33,34,35].
The methods described in this paper have general applicability to other IPD meta-analyses. Preliminary data checks should assess the quality of the information on repeat measures (e.g. comparing baseline measures between studies with and without repeat measures and between individuals with and without repeat measures) and the assumptions underlying regression calibration methods. We recommend the use of empirical Bayes conditional expectations extracted from an RC model to encompass between-study heterogeneity. Checks should be performed to justify any model simplifications, such as ignoring within-study heterogeneity or ignoring the multivariate structure. Whilst the multivariate RC model is preferable to a set of univariate RC models, the latter is more computationally convenient. Even when repeat measures are available from all studies (as in our illustration), the use of study-specific corrections may result in bias from smaller or outlying studies. Combining information across studies strengthens the reliability and precision of the estimated regression calibration coefficients.
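The idea of borrowing strength across studies can be illustrated with a simple shrinkage calculation. The sketch below is a generic normal-normal empirical Bayes approximation, not the mixed-model fit used in this paper: each study-specific RDR estimate is pulled towards the overall mean by an amount that depends on its standard error and on the between-study standard deviation. The function name and all numerical inputs are hypothetical.

```python
def empirical_bayes_rdr(study_estimate: float,
                        study_se: float,
                        overall_mean: float,
                        between_sd: float) -> float:
    """Shrink a study-specific RDR towards the overall mean (illustrative).

    The weight on the study's own estimate is the between-study variance
    divided by the sum of between-study and sampling variances.
    """
    if study_se <= 0 or between_sd < 0:
        raise ValueError("study_se must be positive and between_sd non-negative")
    weight = between_sd ** 2 / (between_sd ** 2 + study_se ** 2)
    return overall_mean + weight * (study_estimate - overall_mean)


if __name__ == "__main__":
    # Hypothetical inputs: an imprecise small study and a precise large study,
    # both with the same raw RDR estimate.
    for label, se in (("small study", 0.15), ("large study", 0.02)):
        shrunk = empirical_bayes_rdr(0.70, se, overall_mean=0.50, between_sd=0.06)
        print(f"{label}: raw 0.70 -> shrunk {shrunk:.3f}")
```

An imprecise estimate is moved most of the way to the overall mean, whereas a precise estimate is left almost unchanged; this is why smaller or outlying studies have less influence under empirical Bayes corrections than under study-specific corrections.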
The Fibrinogen Studies Collaboration (FSC) is supported by Special Project Grant 002/02 from the British Heart Foundation. A variety of sources have supported recruitment, follow-up, and laboratory measurements in the 31 cohorts contributing to the FSC. Investigators from several of these studies have contributed to a list naming some of these funding sources, which can be found at http://www.phpc.cam.ac.uk/MEU/FSC/Studies.html.
Suppose the data are formatted in the following way:
| study | id | repeat | repeat_fib | base_fib | base_sbp | base_smok | age | sex |
and we wish to fit the following model:
This is a simplified version of equation (13) including non-error prone confounders. The STATA code and results are shown below:
xi: xtmixed repeat_fib i.study*i.repeat base_fib base_sbp base_smok age sex || study: base_fib, noconstant || id:
i.study           _Istudy_1-5         (naturally coded; _Istudy_1 omitted)
i.repeat          _Irepeat_1-13       (naturally coded; _Irepeat_1 omitted)

Mixed-effects REML regression                   Number of obs      =     16529

-----------------------------------------------------------
                |   No. of       Observations per Group
 Group Variable |   Groups    Minimum    Average    Maximum
----------------+------------------------------------------
          study |        5       1693     3305.8       5882
             id |    11787          1        1.4         13
-----------------------------------------------------------

                                                Wald chi2(24)      =   1151.88
Log restricted-likelihood = -17858.492          Prob > chi2        =    0.0000

-------------------------------------------------------------------------------
   repeat_fib |      Coef.  Std. Err.        z    P>|z|    [95% Conf. Interval]
--------------+----------------------------------------------------------------
    _Istudy_2 |   .8041103   .2102337     3.82    0.000    .3920598    1.216161
    _Istudy_3 |   1.271994   .1977429     6.43    0.000    .8844249    1.659563
    _Istudy_4 |    .257616   .2087866     1.23    0.217   -.1515982    .6668303
    _Istudy_5 |   .3751409    .205475     1.83    0.068   -.0275827    .7778645
   _Irepeat_2 |  -.5561327    .404178    -1.38    0.169   -1.348307    .2360417
   _Irepeat_3 |  -.2681136   .4038413    -0.66    0.507   -1.059628    .5234007
   _Irepeat_4 |  -.1557177     .36144    -0.43    0.667   -.8641271    .5526918
   _Irepeat_5 |  -.0534753   .4074761    -0.13    0.896   -.8521138    .7451632
   _Irepeat_6 |  -.1237561   .4066654    -0.30    0.761   -.9208056    .6732934
   _Irepeat_7 |  -.1187755   .4066415    -0.29    0.770   -.9157782    .6782271
   _Irepeat_8 |   -.224244   .4072799    -0.55    0.582   -1.022498    .5740099
   _Irepeat_9 |  -.2488897   .4077776    -0.61    0.542   -1.048119    .5503396
  _Irepeat_10 |  -.3015269   .4078587    -0.74    0.460   -1.100915    .4978615
  _Irepeat_11 |  -.2141387   .4080348    -0.52    0.600   -1.013872    .5855948
  _Irepeat_12 |   -.198489   .4083559    -0.49    0.627   -.9988519    .6018739
  _Irepeat_13 |  -.0237129   .4095358    -0.06    0.954   -.8263883    .7789624
 _IstuXre_4_2 |   .3080618   .0478986     6.43    0.000    .2141823    .4019412
 _IstuXre_4_3 |   -.128406   .4089363    -0.31    0.754   -.9299064    .6730945
     base_fib |    .497483   .0271787    18.30    0.000    .4442137    .5507522
     base_sbp |   .0021802   .0003385     6.44    0.000    .0015167    .0028437
    base_smok |   .1248417   .0142951     8.73    0.000    .0968237    .1528596
          age |   .0056616   .0009317     6.08    0.000    .0038354    .0074878
          sex |   .1040671   .0132093     7.88    0.000    .0781773    .1299569
        _cons |   .7814809   .3736456     2.09    0.036    .0491489    1.513813
-------------------------------------------------------------------------------

-------------------------------------------------------------------------------
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+-------------------------------------------------
study: Identity              |
                sd(base_fib) |   .0553705   .0230456      .0244908    .1251855
-----------------------------+-------------------------------------------------
id: Identity                 |
                   sd(_cons) |   .3813449   .0087969      .3644872    .3989822
-----------------------------+-------------------------------------------------
                sd(Residual) |   .6135454   .0053735      .6031034    .6241682
-------------------------------------------------------------------------------
The overall RDR for fibrinogen from this model, given by the coefficient of base_fib, is 0.50 (95% CI 0.44, 0.55). Between-study heterogeneity in the RDR, given by the standard deviation sd(base_fib), is 0.06 (95% CI 0.02, 0.13), suggesting that the RDRs in different studies differ more than would be expected by chance. The individual-specific variation and the residual error are represented by their standard deviation estimates, 0.38 (95% CI 0.36, 0.40) and 0.61 (95% CI 0.60, 0.62) respectively.
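For orientation, the model implied by the Stata command above can be written out explicitly. The notation is an assumption made for this sketch and is not taken from equation (13): $s$ indexes study, $i$ individual and $r$ repeat occasion; $\alpha_{sr}$ collects the constant and the fixed study, repeat and study-by-repeat terms generated by i.study*i.repeat; $u_s$ is the study-level random slope on base_fib; and $w_{si}$ is the individual-level random intercept.

```latex
\begin{aligned}
\text{repeat\_fib}_{sir} ={}& \alpha_{sr}
  + (\beta_1 + u_s)\,\text{base\_fib}_{si}
  + \beta_2\,\text{base\_sbp}_{si}
  + \beta_3\,\text{base\_smok}_{si} \\
 & + \beta_4\,\text{age}_{si}
  + \beta_5\,\text{sex}_{si}
  + w_{si} + \varepsilon_{sir}, \\
u_s \sim N(0, \sigma_u^2), \quad & w_{si} \sim N(0, \sigma_w^2), \quad
\varepsilon_{sir} \sim N(0, \sigma_\varepsilon^2).
\end{aligned}
```

Under this reading, $\beta_1$ corresponds to the overall RDR reported as the coefficient of base_fib, $\sigma_u$ to sd(base_fib), $\sigma_w$ to sd(_cons) at the id level, and $\sigma_\varepsilon$ to sd(Residual).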
Authors/Writing Committee of the Fibrinogen Studies Collaboration: A. M. Wood, I. R. White, S. G. Thompson.
Authors/Members of the Fibrinogen Studies Collaboration: Aspirin Myocardial Infarction Study: J. B. Kostis, A. C. Wilson; Atherosclerosis Risk in Communities Study: K. Wu; Bezafibrate Infarction Prevention Study: M. Benderly, U. Goldbourt; Bruneck Study: J. Willeit, S. Kiechl; Caerphilly Study: J. W. G. Yarnell, P. M. Sweetnam, P. C. Elwood; Cardiovascular Health Study: M. Cushman, R. P. Tracy (see http://chs-nhlbi.org for acknowledgments); Copenhagen City Heart Study: A. Tybjæg-Hansen; European Concerted Action on Thrombosis and Disabilities (ECAT) Angina Pectoris Study: F. Haverkate, S. G. Thompson; Edinburgh Artery Study and Edinburgh Claudication Study: A. J. Lee, F. B. Smith; Finnish National Risk Factor Survey 1992, Hemostasis Study: V. Salomaa, K. Harald, V. Rasi, P. Jousilahti, J. Pekkanen; Framingham Study: R. D’Agostino, P. W. F. Wilson, G. Tofler, D. Levy; GISSI-Prevenzione Trial: R. Marchioli, F. Valagussa*; Göteborg 1913 and Göteborg 1933 studies: A. Rosengren, G. Lappas, H. Eriksson; Göttingen Risk Incidence and Prevalence Study: P. Cremer, D. Nagel; Honolulu Heart Program: J. D. Curb, B. Rodriguez, K. Yano; Kuopio Ischaemic Heart Disease Study: J. T. Salonen, K. Nyyssönen, T.-P. Tuomainen; Malmö Study: B. Hedblad, G. Engström, G. Berglund; MONICA/KORA Augsburg Study: H. Loewel, H. W. Hense; Northwick Park Heart Study I: T. W. Meade, J. A. Cooper, B. De Stavola, C. Knottenbelt; Northwick Park Heart Study II: G. J. Miller*, J. A. Cooper, K. A. Bauer, R. D. Rosenberg; Osaka Study: S. Sato, A. Kitamura, Y. Naito, H. Iso; Platelet Activation and Inflammation Study: V. Salomaa, K. Harald, V. Rasi, E. Vahtera, P. Jousilahti, T. Palosuo; Prospective Epidemiological Study of Myocardial Infarction: P. Ducimetiere, P. Amouyel, D. Arveiler, A. E. Evans, J. Ferrieres, I. Juhan-Vague, A. Bingham; Prospective Cardiovascular Münster Study: H. Schulte, G. Assmann; Quebec Cardiovascular Study: B. Cantin, B. Lamarche, J.-P. Després, G. R. 
Dagenais; Scottish Heart Health Study: H. Tunstall-Pedoe, G. D. O. Lowe, M. Woodward; Speedwell Study: Y. Ben-Shlomo, G. Davey Smith; Strong Heart Study: V. Palmieri, J. L. Yeh; Thrombosis Prevention Trial: T. W. Meade, P. Brennan, C. Knottenbelt, J. A. Cooper; Physicians’ Health Study: P. Ridker; Vicenza Thrombophilia and Atherosclerosis Project: F. Rodeghiero, A. Tosetto; West of Scotland Coronary Prevention Study: J. Shepherd, G. D. O. Lowe, I. Ford, M. Robertson; Whitehall II Study: E. Brunner, M. Shipley; Zutphen Elderly Study: E. J. M. Feskens, D. Kromhout.
Authors/Coordinating Centre of the Fibrinogen Studies Collaboration: E. Di Angelantonio, S. Kaptoge, S. Lewington, G. D. O. Lowe, N. Sarwar, S. G. Thompson, M. Walker, S. Watson, I. R. White, A. M. Wood, J. Danesh (coordinator).
Conflict of interest: none declared.
The following FSC investigators contributed data to the current study but did not participate as co-authors: A. R. Folsom, L. Chambless, B. M. Psaty, M. P. M. de Maat, F. G. R. Fowkes, E. Vahtera, W. B. Kannel, L. Wilhelmsen, W. Koenig, A. Rudnicka.