These results show that EFA can recover “latent” dimensions or constructs in collections of variables (here, replicating the “Thurstone Box Problem” with the permuted versions of the neuroimaging outcomes from the ADNI data set), but it does not always do so. Factor analytic methods often rely on principal components (PC) extraction when the number of factors is not pre-specified (11, 18); PC extraction seeks the model that maximizes variance explained (in terms of eigenvalues or variance accounted for) (see 14). When the number of factors is pre-specified and the factor loadings are estimated (as in maximum likelihood factor analysis, MLFA; see 11 and 18), the analysis estimates the correlations between observed and latent variables that maximize the likelihood given the number of factors the investigator pre-specifies (11). MLFA does not have the same data-driven disadvantage that EFA does, but it cannot accommodate chains of factors or higher-order structure in the factors. The default setting for “factor analysis” in nearly all software that offers this function is principal components extraction, with MLFA being more difficult to obtain and interpret.
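As a minimal sketch of what PC extraction does when the number of factors is not pre-specified, the Python fragment below eigen-decomposes a correlation matrix and reports the share of variance attributable to each component. The simulated data, the loading values, and the eigenvalue-greater-than-one retention rule are assumptions made for illustration; they are not taken from the ADNI analyses.

```python
import numpy as np

def pc_extraction(data):
    """Eigen-decompose the correlation matrix of `data` (rows = cases, columns = variables)."""
    corr = np.corrcoef(data, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)          # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]                # reorder to descending
    eigvals = np.clip(eigvals[order], 0.0, None)     # guard against tiny negative round-off
    eigvecs = eigvecs[:, order]
    explained = eigvals / eigvals.sum()              # proportion of total variance per component
    loadings = eigvecs * np.sqrt(eigvals)            # component loadings
    return eigvals, explained, loadings

# Simulated example (assumption): four indicators driven by one latent variable.
rng = np.random.default_rng(0)
n_cases = 500
latent = rng.normal(size=n_cases)
true_loadings = np.array([0.8, 0.7, 0.6, 0.5])       # hypothetical values
noise = rng.normal(size=(n_cases, 4)) * np.sqrt(1.0 - true_loadings**2)
data = latent[:, None] * true_loadings + noise

eigvals, explained, loadings = pc_extraction(data)
n_retained = int((eigvals > 1.0).sum())              # eigenvalue > 1 rule, one common default
print(np.round(explained, 3), n_retained)
```

The number of components retained here is decided entirely by the data, which is the data-driven property discussed above.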

The ADNI variables chosen for the analysis were selected because they should be combinable into a measurement model representing disease burden, atrophy, or ‘AD’. One factor was extracted from the original variables that were manipulated into the nonlinear functions/combinations; this single factor explained nearly 60% of the variance in the four variables. Although it explained so much of the variability in four variables from different imaging modalities and areas of the brain, the hypothesized one-factor model fit the data very poorly. This example was chosen to highlight the weaknesses of exploratory factor analysis in the combination of biomarkers for clinical trials in AD: using EFA is not recommended as a method for obtaining a BMMI for clinical research.
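The gap between variance explained and model fit can be illustrated numerically. The sketch below uses a hypothetical correlation matrix (it is not the ADNI correlation matrix) in which two pairs of variables are strongly related within pairs and weakly related between pairs. The root mean square residual (RMSR) of the off-diagonal correlations is used here as a simple stand-in for misfit; it is an assumption of this example, not the fit index used in the study.

```python
import numpy as np

# Hypothetical correlation matrix (NOT the ADNI values): correlations of 0.8
# within each pair of variables and 0.3 between the pairs.
R = np.array([
    [1.0, 0.8, 0.3, 0.3],
    [0.8, 1.0, 0.3, 0.3],
    [0.3, 0.3, 1.0, 0.8],
    [0.3, 0.3, 0.8, 1.0],
])

eigvals, eigvecs = np.linalg.eigh(R)       # ascending order, so the last entry is the largest
first_val = eigvals[-1]
first_vec = eigvecs[:, -1]
share_explained = first_val / R.shape[0]   # proportion of variance for the first component

# Correlations reproduced by a single component, and what is left over.
loadings = np.sqrt(first_val) * first_vec
reproduced = np.outer(loadings, loadings)
off_diag = ~np.eye(R.shape[0], dtype=bool)
residuals = (R - reproduced)[off_diag]
rmsr = float(np.sqrt(np.mean(residuals**2)))

print(round(share_explained, 2), round(rmsr, 2))
```

In this constructed case the first component accounts for 60% of the variance, yet the residual correlations remain large (RMSR of about 0.27), because a single dimension cannot reproduce the two-cluster pattern.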

This study replicated the “Thurstone box problem”, in which nonlinear and highly correlated variables are analyzed with EFA and the underlying ‘factors’ are recovered. This suggests that EFA can uncover structure, even when the variables are hard endpoints that are correlated, such as the four ADNI variables whose permutations were analyzed. However, the second EFA, of the original four variables, demonstrated that EFA does not always uncover the correct structure.

Rather than EFA (MLFA or PCFA), a theoretically driven measurement model (MM) would efficiently combine the information from across the variables representing AD, burden, or atrophy, drawn from brain activity (PET) or atrophy (volumes), together with theory, to create a single outcome that would not be subject to the same disadvantages as the mathematically optimized, atheoretical ‘model’ that might be obtained otherwise (i.e., from EFA or other multivariate techniques). A combination of theoretical and statistical insight should be applied to build a measurement model for CFA rather than using EFA. Exploratory and confirmatory methods provide different information; the theoretical and statistical fits of any latent variable model must be established, and EFA cannot provide these, but other methods, such as CFA, can. The critical point is, however, that building and testing measurement models is complex. Burnham and Anderson (2002) (8) note that a set of candidate models is important for choosing (or combining) models for inference (p. 2), and by extension, for selecting the ‘best’ measurement model as an indicator for clinical trials. This means that a BMMI cannot simply be obtained by finding a single model that fits the data; the best MMI will be a model that not only fits the data but also fits better than reasonable alternative models (19, 20).
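One way to compare a set of candidate models, in the spirit of the model selection framework cited above, is to convert information criterion values into relative weights. The sketch below computes Akaike weights; the model names and AIC values are invented for illustration, and the choice of AIC rather than another criterion is an assumption of this example.

```python
import numpy as np

def akaike_weights(aic_values):
    """Relative support for each candidate model, given their AIC values."""
    aic = np.asarray(aic_values, dtype=float)
    delta = aic - aic.min()                # difference from the best (lowest-AIC) model
    rel_likelihood = np.exp(-0.5 * delta)  # relative likelihood of each model
    return rel_likelihood / rel_likelihood.sum()

# Hypothetical AIC values for three candidate measurement models.
candidates = {"one_factor": 1012.4, "two_factor": 1003.1, "higher_order": 1004.9}
weights = akaike_weights(list(candidates.values()))

for name, w in zip(candidates, weights):
    print(f"{name}: {w:.3f}")
```

A model is preferred here only relative to the alternatives that were actually fitted, which is why the set of candidates must itself be reasonable.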

This paper described some of the advantages and disadvantages of seeking/using a BESI in regression given the complexity of AD, and of EFA as a mechanism for circumventing the undesirability of one BESI in clinical trials for diseases as complex as AD. A third analytic method is available: explicit latent variable modeling such as CFA. This is a class of multivariate analytic methods that could lead to a single-variable representation of a larger set of observed variables, or ‘indicators’. Unlike in EFA, the investigator builds, and then tests the fit of, a model specifying hypotheses about how the larger set of variables represents a hypothesized underlying, unobserved entity. The model is referred to as a ‘measurement model’ (see 21). In the context of AD and finding biomarkers that are optimized for differentiating patients along the continuum of neuropathology, a set of biomarkers would be identified and hypothesized to represent one or more latent variables in specific ways. For example, within the ADNI data, ‘neurodegeneration’ might be a hypothesized clinical entity that causes volumetric imaging outcomes as well as levels of tau to covary with low levels of glucose metabolism. In this example, variables reflecting amyloid (e.g., PIB uptake or Aβ42) would be expected to covary as a direct function of (i.e., caused by) neurodegeneration. This is not to imply that levels of all of these biomarkers are unrelated; only that the specific clinical entity “neurodegeneration” is hypothesized to cause decreases in volumes, tau, and lower glucose metabolism. Thus, unlike EFA, measurement models combine variables in hypothesis-driven ways, and can therefore be more generalizable across samples. When the best model has been built, tested and validated (i.e., replicated in an independent sample), it would be the “best measurement model indicator” (BMMI (6)), which itself could then be used as a BESI for regression in clinical trials.

Similar to EFA, the associations of the observed variables with the underlying entity can be estimated using maximum likelihood methods (21), and the most straightforward modeling will follow from relationships that are linear (although nonlinear relationships can be modeled with latent variable modeling techniques; see (22) for a variety of complex latent variable methods and techniques). Unlike EFA and MLFA, other LV methods allow causal chains and higher-order latent variable models to be hypothesized and tested.

Latent variable (LV) methods simultaneously model multiple indicators of an entity (e.g., “disease burden”) by regressing observed variables on the hypothesized unobserved, underlying, or ‘latent’ one(s). It is recommended that searches for biomarker measurement models focus on causal models, where the unobserved clinical entity is the hypothesized cause of the covariances among the observed variables (biomarkers) (see (21) for discussion of causal vs. emergent latent variables). Importantly, in a causal model, the extent to which the causal factor (latent clinical entity) does not cause the variability in any observed variable is explicitly modeled and estimated as ‘measurement error’. This is an important aspect of a latent variable causal model, since in standard linear/multiple regression, the residuals in the model represent the error with which the independent variable(s) represent or predict the dependent variable plus the error with which the dependent variable represents whatever it is supposed to represent. Within an LV model, the latter source of error is modeled explicitly, so that the former type of error can be estimated. This is not the case in EFA (or MLFA) models, nor is it a feature of any composite-forming method of multivariate analysis.
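The distinction between the two sources of error can be written compactly. The notation below is an assumption introduced for illustration (a single latent variable $\eta$, loadings $\lambda_i$, measurement errors $\varepsilon_i$ taken to be mutually uncorrelated, and a predictor $z$); it is one conventional way to express a one-factor causal measurement model and is not drawn from the analyses reported here.

```latex
% One-factor causal measurement model for p observed indicators x_1, ..., x_p:
% each indicator is caused by the latent entity \eta plus its own measurement error.
x_i = \lambda_i \eta + \varepsilon_i, \qquad i = 1, \dots, p

% Implied covariance structure, with \psi = \operatorname{Var}(\eta) and
% \Theta = \operatorname{diag}(\theta_1, \dots, \theta_p) the measurement error variances.
\Sigma = \psi \, \lambda \lambda^{\top} + \Theta

% Standard regression with a single observed outcome y standing in for the entity
% \eta^{*} it is meant to represent: the two error terms cannot be separated.
y = \eta^{*} + u, \quad \eta^{*} = \beta z + e
\;\Longrightarrow\; y = \beta z + (e + u)
```

In the regression, only the sum $e + u$ is available as a residual, whereas the measurement model estimates the $\theta_i$ separately from the structural error.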

Once a causal model is hypothesized and the relations between the observed variables and the latent cause are estimated using specialized software (EQS, SAS, SPSS/AMOS, MPlus, R), the fit of the model to the observed data is estimated. This is roughly equivalent to obtaining the R² for a regression model, and in some software the R² is computed for the regression of each observed variable onto the latent variable(s). However, in addition to these indicator-specific summaries, many indices of overall fit of the model to the data are computed – such as those represented in , including areas of particular misfit (e.g., if the hypothesized relationship between the latent cause and one indicator is unsupported by the data).
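As an illustration of how overall fit indices are derived from the model chi-square, the sketch below computes two commonly reported indices, the root mean square error of approximation (RMSEA) and the comparative fit index (CFI). Which indices a given study or software package reports is an assumption here, and the chi-square values, degrees of freedom, and sample size are invented for the example.

```python
import math

def rmsea(chi2, df, n):
    """Root mean square error of approximation from the model chi-square."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

def cfi(chi2_model, df_model, chi2_base, df_base):
    """Comparative fit index: model misfit relative to a baseline (independence) model."""
    d_model = max(chi2_model - df_model, 0.0)
    d_base = max(chi2_base - df_base, d_model, 0.0)
    return 1.0 if d_base == 0.0 else 1.0 - d_model / d_base

# Hypothetical one-factor model with four indicators (df = 2) and n = 400 cases.
print(round(rmsea(85.0, 2, 400), 3))         # a large value signals poor fit
print(round(cfi(85.0, 2, 900.0, 6), 3))
```

Both indices depend on the chi-square, the degrees of freedom, and (for RMSEA) the sample size, so they summarize overall fit rather than the adequacy of any single indicator.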

The LV approach can yield a complex ‘dependent variable’, and given adequate fit of the measurement model to the data, as well as better fit than reasonable alternatives, this new dependent variable can be considered the ‘Best Measurement Model Indicator’ (BMMI). As described above, a causal measurement model represents a latent entity that several observed variables reflect or measure. The BMMI can accommodate all indicators of the underlying entity (e.g., “neurodegeneration”), and so can incorporate multiple ‘dependent variables’ into a single dependent ‘model’ variable, as well as estimating the relative contributions of the latent factor to each indicator. This is in contrast to regression-based techniques (including those underlying model averaging and other combinations of regressions) where independent variables must (by assumption) be independent of (orthogonal to) one another. Thus, it is inappropriate to include correlated variables as independent variables within linear regression, whereas a measurement model approach takes advantage of the correlations among variables.

The selection of a single-variable BESI is an artifact of regression that limits the investigator’s ability to utilize all relevant variables representing the entity of interest. EFA and other data-driven, atheoretical multivariate methods result in sample-specific single-variable (composite) combinations of biomarkers that might not generalize to a new sample; they can sometimes uncover the correct structure, but not always. By contrast, the BMMI approach is a theory- and hypothesis-driven simultaneous analysis of multiple ‘dependent variables’ which are indicators of the underlying clinical entity. It requires extra work, but its accommodation of multiple and correlated variables, and its explicit modeling of error, make this extra modeling effort worthwhile. This is particularly true in cases where, as in AD research, previous research has shown that no single variable can serve as the best biomarker.

As Box said, “all models are wrong, but some models are useful” (23). The assumptions, implications, and penalties for building and testing a BMMI are similar to those common to regression, multivariable systems, and measurement modeling found in any multivariate statistical textbook (more technical: (5); more accessible: (24)). The main disadvantage of using BMMIs is that they must be built, tested and validated (see, e.g., (20, 25)), which is far more time-consuming than selecting a BESI, using EFA, or creating a composite or index. However, combination-of-models methods also require this attention to alternative models (8).

The term “measurement model” derives from the fact that the model conceptualizes a construct, such as “neuropathology”, that can be measured in several different ways, all of which are subject to some type and extent of measurement error, and all of which are important to a complete appreciation of neuropathology in AD, MCI, and normal cognitive aging. The measurement model will not be “true” or “correct”, but it represents the optimal combination of theory and statistics. Thus, the Best Measurement Model Indicator (BMMI (6)) will ideally articulate a unidimensional (latent) construct to be measured, which in the current example could be “neuropathology”. “Neuropathology” can only be estimated; it can never be directly (or completely) quantified or observed. Moreover, its estimation/quantification will be optimized by increasing the number and quality of indicators that are hypothesized to be caused by it (26). By incorporating uncertainty and permitting multiple indicators, a measurement model improves estimation of the ‘truth’ or ‘true level’ of the construct in which we are most interested.
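The effect of the number and quality of indicators on estimation can be sketched under strong simplifying assumptions: standardized, parallel indicators that share one loading, and an unweighted mean as the estimate of the latent variable. This unweighted mean is a stand-in chosen for simplicity, not a BMMI, and the loading values are hypothetical.

```python
import numpy as np

def corr_mean_with_latent(k, loading):
    """Analytic correlation between the mean of k parallel indicators and the latent variable."""
    error_var = 1.0 - loading**2                     # standardized indicators
    return loading / np.sqrt(loading**2 + error_var / k)

def simulate(k, loading, n=20000, seed=0):
    """Monte Carlo check of the analytic value using simulated indicators."""
    rng = np.random.default_rng(seed)
    eta = rng.normal(size=n)                         # latent variable
    errors = rng.normal(size=(n, k)) * np.sqrt(1.0 - loading**2)
    indicators = loading * eta[:, None] + errors     # x_i = loading * eta + error_i
    composite = indicators.mean(axis=1)
    return float(np.corrcoef(composite, eta)[0, 1])

for k in (1, 2, 4, 8):
    print(k, round(float(corr_mean_with_latent(k, 0.6)), 3), round(simulate(k, 0.6), 3))
```

Under these assumptions the correlation between the estimate and the latent variable rises both as indicators are added and as the loading (indicator quality) increases.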

The BESI does not permit the simultaneous analysis of these indicators, but only the combination (through argument) of results from multiple regressions (on BESIs). For this reason, quite apart from the failures of any genetic, imaging, or biologic measure to attain the sensitivity and specificity that a biomarker for AD requires (1, 4), building, testing and validating a BMMI is recommended over choosing a BESI from among the collection of ADNI measures.

Latent variable methods have a natural place in biomedical research. The results presented here show that using data-driven methods such as exploratory factor analysis will not necessarily uncover the ‘true’ relations among a set of biomarkers. Instead, a BMMI represents a combination of theory and statistical support for that theory, taking advantage of all relevant indicators, even if they are correlated; importantly, the fit of the model to the data can be quantified and replicated in new samples. A BMMI will explicitly model measurement error of indicators and, if a causal BMMI is built and validated, then a single target for any intervention would be identified.