BP and HR Data
SBP measurements were excluded if outside the range of 60–220 mmHG. DBP measures had to be within 40–110 mmHG. HR measures during the reclined period had to be between 30–175 bpm. During all other periods, orthostatic, Stroop-baseline, color-word, pain-affect had to be between 40 and 200 bpm.
For the Rest Period (1), Mean BP measures (SBP, DBP, MAP, and HR) were averaged over the last three recordings assessed at minutes 15, 17.5, and 20 during the rest period. For the Orthostatic Period (2), the changes in BP: ΔSBP, ΔDBP, ΔMAP, and ΔHR from the values obtained during the Rest Period were determined by subtracting the first value measured in the orthostatic period from the mean values obtained at minutes 15, 17.5, and 20 during the rest period. For the Stroop Periods (4&5): mean BP measures (SBP, DBP, MAP, and HR) were averaged over the five minute data collection period at minutes 1:00, 2:00, 3:00, 4:00 and 5:00. Changes in BP: ΔSBP, ΔDBP, ΔMAP, and ΔHR from the pre-Stroop values (Period 3) were determined by subtracting the average value measured in the pre-Stroop period at minutes 5:00, 7:30, and 10:00 from the mean of the values obtained during the Stroop tasks, with values at minutes 1:00, 2:00, 3:00, 4:00 and 5:00 to compute the mean value.
The ratio of HR÷MAP was computed as a measure of the relative balance of cardiosympathetic versus cardioparasympathetic (vagal) tone at a given baroreflex set point (MAP). Higher values reflect greater cardiosympathetic versus cardioparasympathetic tone while lower values signify greater cardioparasympathetic versus cardiosympathetic tone at a given baroreceptor set point.2, 35, 90
Imputation for missing values used the EM method in SAS proc MI which finds maximum likelihood estimates for incomplete data under the assumption that the data are multivariate normal.80
Descriptive statistics for each summary score were generated using non-imputed data. Statistical significance of differences in mean scores was evaluated using the t-test derived from a least squares linear model in which study site was a covariate. Study site was used as a covariate because operational requirements during recruitment created different proportions of cases among sites (see Slade for further details). The relationship between each summary score and occurrence of TMD was expressed as the standardized odds ratio (SOR), calculated from an unconditional, binary logistic regression model with study site as a covariate. To achieve this, the summary score was transformed to a unit-normal deviate. The transformation meant that odds ratios could be interpreted as the relative change in odds of TMD for each standard deviation of change in the summary score. A second logistic regression model generated a fully-adjusted estimate of the relationship, using additional covariates of age (in years), gender, and race/ethnicity (dichotomized as white or non-white). A third logistic regression model using this imputed dataset calculated SORs for each summary score, with adjustment for the same covariates as in the prior model. Odds ratios with values less than 1.0 were reverse coded for ease of presentation and interpretation.
All P-values were computed without adjustment for multiple tests, and we therefore refrain from designating P=0.05 as a threshold for statistical significance. In this paper’s case-control analysis, up to 15 autonomic variables were investigated for any experimental condition, Resting epoch assessed 11 variables (Note: blood pressure cuff assessed Mean HR is equivalent to ECG assessed Mean HR (HRV) and were treated as a single entity for this calculation), Orthostatic epoch 11 variables, Stroop Color Word epoch 15 variables and Pain-Affect 15 variables. Therefore, the most conservative Bonferroni correction for the probability of type I error for a given experimental period would yield a critical P-value of 0.05÷15 = 0.003. Using the same rationale, rejection of the null hypothesis concerning odds ratios would occur only if the 99.6% confidence interval excluded the null value of one. In general, though, we avoid drawing conclusions about statistical significance of associations, even with correction for multiple tests, because these papers report only univariate- or demographically-adjusted results. Furthermore, the Bonferroni adjustment is probably overly-conservative in this setting, where several measures are moderately correlated. Instead, we will reserve judgments about statistical significance to subsequent papers that will use multivariable modeling to consider multiple characteristics simultaneously, as proposed in the OPPERA heuristic model (Maixner et al. this volume).
Principal component analysis (PCA) was applied to this data set to identify putative latent variables. The approach began with four steps, widely used in exploratory principal component analysis69
: 1) variable selection; (2) evaluation of the correlation matrix; (3) extraction of initial components; and (4) rotation and interpretation of component loadings. This model is being used for purely exploratory purposes with regards to the structure of the latent variables and not for computation of summary scores for the various components.
Our primary interest focused on PCA loadings in controls for three reasons: (a) there are many more controls than cases, thereby improving statistical power to identify factors and estimate loadings; (b) our underlying conceptual model of TMD (see Maixner et al. in this volume) proposes that relationships between putative risk factors might alter following onset of chronic pain, and we wanted to evaluate that possibility; and (c) the long-term goal of OPPERA is to identify risk factors for TMD among controls, so it is desirable to identify a reduced set of variables for that group. Thus, we fit separate PCA models for the TMD cases and controls.
We elected to keep the 42 variables listed in in the model even though some of the variables were highly correlated with one another. We wanted to determine which measures could be combined into latent variables, and this required that all the variables be retained in the model. Although this has the potential to increase the variance of our estimates of the PCA loadings, this was not a major concern given our large sample size. As shown below, we estimated the variance of our PCA loadings using bootstrapping, and these variances were uniformly low despite the correlations among the input variables. The only variables we did not include in the model were variables derived from two or more of the underived variables (e.g., HR/MAP Index), since the inclusion of such variables would create both computational and interpretative difficulties.
Component Loadings for PCA Model in Controls (n=1626)
After imputing missing values as described above, 348 subjects still had at least one missing value among the variables included in the model. Because missingness may be non-random, we were concerned with bias if we dropped these 348 subjects from the model. To avoid this potential bias, we performed a second round of imputation. Specifically, we imputed any missing BP measures based on the observed BP measurements in the same epoch and we imputed missing HR measurements based on the observed HR measures in the same epoch. We then imputed any remaining missing BP measurements based on observed BP measurements in other epochs and missing heart measures based on observed HR measures in other epochs. We used the EM algorithm to perform this imputation under the assumption that the data are multivariate normal, as described previously. After performing this second round of imputation, there were only nine remaining subjects with missing data. It should be noted that we report the SOR’s for both the imputed and unimputed data sets (see –), and in all cases the results are extremely similar (if not identical). And while we do not report these results, we also fit our PCA model using only unimputed data, and we obtained a model very similar to the model reported in the manuscript.
Baseline Resting Autonomic Measures – Reclined (20 min)
Pain-Affect Stroop Autonomic Measures (5 min)
We examined a scree plot to estimate the number of components to include in the model (). The variance explained by each principal component decreased most conspicuously after the fifth component, suggesting that at least five components are needed. We used parallel analysis to verify this estimate. Parallel analysis estimates the number of components to include in a PCA model by generating random data sets with the same numbers of observations and predictor variables as the original data. The eigenvalues are computed for each random data set and averaged over all the data sets. When the average eigenvalue from these randomly generated data sets is larger than the corresponding eigenvalue of the original data, then the principal component associated with that eigenvalue is likely to be random noise. The parallel analysis also showed strong evidence that components one through five were above the chance line (). Components six, seven, and eight were near the chance line, so it was unclear if they should be included in the model.
Figure 1 Scree plot and parallel analysis from PCA. The vertical lines depict the eigenvalues for each component, showing a significant drop off after component 5. Also, the open triangles and red line show the eigenvalues that would be expected for each component (more ...)
We therefore fit PCA models using five, six, seven, and eight components. The bootstrap confidence intervals for components six, seven, and eight were very wide for many of the loadings, suggesting that the estimated loadings were unstable. (Data not shown) However, the confidence intervals for the loadings of the first five components were very narrow, indicating a stable (and accurate) model. Additionally, the Cronbach’s alpha values for these five components were very high—ranging from 0.92–0.96—further supporting a good reliability of this model. Thus, we report a model based on five components.
PCA models were fit using the R statistical computing platform. All variables were normalized to have mean 0 and standard deviation 1 prior to fitting the models. After calculating the PCA eigenvectors, a promax rotation was applied to increase the interpretability of the resulting PCA loadings. The promax rotation produced loadings that were easier to interpret than the loadings resulting from orthogonal rotations, and other non-orthogonal rotations produced similar results. The rotated loadings are presented unless otherwise noted. The variance in the PCA loadings of each model was estimated by drawing 1000 bootstrap samples for each data set and fitting a PCA model for each replicate. The 95% confidence bounds for the PCA loadings were estimated to be the 2.5% and 97.5% quantiles of the corresponding loading over the 1000 bootstrap replicates.21
We performed a series of permutation tests to test the null hypothesis that the loadings were the same for both the control and TMD models, and we found that these differences were not statistically significant, particularly considering that we are performing multiple hypothesis tests (Supplementary e-Table 20
). It is unclear if the loadings are truly identical for both models or if we simply do not have power to detect the difference given the relatively small number of TMD cases. The confidence intervals for the PCA loadings in the model based on TMD cases are very wide for components 2–5 (Supplementary e-Table 19
). This is not surprising, given that the case sample size was relatively small, but it indicates that the estimates of these loadings have high variance and may not be reliable.