By jointly modelling four psychometric tests and their latent common factor, we were able to compare their distributions across the entire range of cognition. In this way, we showed that the MMSE and the BVRT were not sensitive to cognitive changes at high levels of cognition and are thus not appropriate for studying cognitive ageing in prospective studies that include highly educated people. The DSST, on the contrary, was very sensitive to cognitive changes at high levels of cognition. However, because it was less sensitive to changes at low levels of cognition, it may not be suitable either for measuring cognitive change in heterogeneous populations comprising both cognitively normal and severely impaired subjects. In contrast, the IST15 appeared to be a satisfactory cognitive measure over the entire range of cognition, which is of substantial interest when studying cognitive ageing in population-based cohort studies.
The IST15 has several assets compared with the three other tests. Firstly, it does not suffer from a floor or ceiling effect. Indeed, using cognitive measures with border effects can lead to misleading results (especially under-estimated declines) when investigating cognitive changes, since initial scores are often differentially distributed among exposure groups and the sensitivity of the tests to detect cognitive changes therefore differs between these groups (1). Secondly, like the DSST, the Isaacs Set Test shortened to 15 seconds includes a speed component, which may explain its high sensitivity to changes at upper levels of cognition. Indeed, the speed component plays a key role in cognitive ageing, and it has been shown, for example, that most age-related differences in cognition are due to a decrease in processing speed (18). Lastly, the IST15 is a very brief test and its instructions are easily understood. It can therefore be administered in large population-based studies, even with severely impaired subjects.
The methodology we proposed in this paper has several advantages that deserve discussion. Firstly, the estimated link functions between the test scores and the latent process make it possible to compare the properties of the tests and, especially, their sensitivity to detect cognitive changes within the entire range of cognition. This is done by jointly modelling several psychometric tests for which the hypothesis of a common factor is sensible. It is worth noting that the latent common factor in this model is defined by the pool of psychometric tests used in the analysis; fitting the model with other tests involving different cognitive components could change the evolution of the common factor. In this analysis, we used tests that are both frequently used and explore different domains of cognition, because we wanted to select one test for studying general cognitive decline in heterogeneous populations. The methodology could also be used to select sensitive measures within a specific domain of cognition. In that case, based on their knowledge or on other analyses such as principal component analyses, researchers must choose the tests assumed to measure the same latent cognitive ability in this specific domain and then apply the methodology to the selected tests.
A second asset of the methodology is that, thanks to the estimated test transformations, the scores are no longer constrained to follow a Gaussian distribution, as they are in a standard linear mixed model. Thus, although the longitudinal evolutions of the four tests, as presented in , could actually have been estimated using linear mixed models, they would have been obtained under an incorrect Gaussian assumption.
Lastly, as parameters are estimated by maximum likelihood, the results are robust to data missing at random (i.e. when the probability that a value is missing does not depend on unobserved values given the past observed values). Simpler analyses that compare empirical test means across age groups are often biased by the missing-data process, especially when cognitive level and dropout are linked, as was previously shown in the PAQUID cohort (17). That work also showed that the missing at random assumption was probably not strictly true, but that its impact on the estimated evolutions was slight (17). Moreover, even if missing data may blur the comparison of the evolution of the test scores, it is very unlikely that they biased the comparison of test sensitivity, which is the main objective of this paper. We checked this by comparing the transformations estimated on sub-samples defined by the time of dropout (dropout after V3, V5, V8, V10, or complete follow-up), in the spirit of a pattern-mixture analysis (20). Whatever the pattern of dropout, the estimated transformations were very similar (results not shown).
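The bias of naive mean comparisons under dropout can be illustrated with a minimal simulation (purely hypothetical data, not PAQUID; thresholds and trajectory parameters are invented for illustration): when dropout depends on the previous observed score, the empirical mean of the remaining subjects at a later wave over-estimates the population mean, whereas likelihood-based mixed models remain valid under this missing-at-random mechanism.

```python
import numpy as np

# Illustrative sketch: subjects follow a common linear decline around
# subject-specific levels; all numbers below are hypothetical.
rng = np.random.default_rng(0)
n, waves = 2000, 5
intercepts = rng.normal(50.0, 8.0, n)
scores = np.column_stack([intercepts - 3.0 * t + rng.normal(0.0, 2.0, n)
                          for t in range(waves)])

# MAR dropout: a subject drops out (and stays out) once an *observed*
# score falls below an arbitrary threshold of 42.
observed = np.ones((n, waves), dtype=bool)
for t in range(1, waves):
    observed[:, t] = observed[:, t - 1] & (scores[:, t - 1] >= 42.0)

true_mean = scores[:, -1].mean()                  # all subjects (unobservable in practice)
naive_mean = scores[observed[:, -1], -1].mean()   # remaining subjects only
print(f"true mean at last wave : {true_mean:.1f}")
print(f"naive observed mean    : {naive_mean:.1f}")
```

Because the lowest scorers are progressively removed, the naive observed mean sits well above the true population mean, which would translate into an under-estimated decline.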
Some methodological issues of this analysis should, however, be discussed. Firstly, as the results rely on a parametric model, the adequacy of the model to the data was carefully checked using post-fit methods based on the residuals and the predictions developed in Proust et al. (4) (results not shown). An essential part of the model is the link function between the tests and the common factor. The Beta CDF was chosen because this transformation is flexible enough to exhibit very different shapes while depending on only two parameters per test. Complementary analyses in which the link functions were estimated on a basis of splines instead of Beta CDFs led to very similar results, while raising more numerical problems because of the larger number of parameters.
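The flexibility of the two-parameter Beta CDF can be sketched as follows (the (a, b) values are hypothetical, chosen only to illustrate the range of shapes, not the estimates obtained in this analysis):

```python
from scipy.stats import beta

# Illustrative sketch: with only two parameters (a, b), the Beta CDF on a
# score rescaled to (0, 1) can be near-linear, concave, convex, or sigmoid,
# which is what lets one link function per test accommodate floor and
# ceiling effects.  Parameter values below are hypothetical.
shapes = {"near-linear": (1.0, 1.0),
          "concave (stretches low scores)": (0.4, 2.5),
          "convex (stretches high scores)": (2.5, 0.4),
          "sigmoid": (4.0, 4.0)}
for name, (a, b) in shapes.items():
    curve = [round(beta.cdf(u / 10.0, a, b), 2) for u in range(1, 10)]
    print(f"{name:32s}: {curve}")
```

A concave transformation spreads out differences between low scores on the latent scale, while a convex one spreads out differences between high scores, mirroring how a test with a ceiling effect compresses the upper range of cognition.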
Secondly, in the PAQUID study, the MMSE was the first test administered in each testing session. Consequently, it was completed more often than the three other tests, particularly among impaired subjects. To ensure that test-specific parameters were estimated on the same sample and to maintain comparability between the tests, we required that every subject have at least one measure for each test. The 791 subjects excluded from the sample were older (median age 78.6 vs. 73.1 years) and less educated (51.5% had not graduated from primary school vs. 27% in the sample) than those included, but the range of the observed scores was the same. Note also that using longitudinal data and keeping incident cases of dementia in the sample increased the observed range of cognition and allowed us to compare the evolution of each test over time.
In conclusion, our results show that the Isaacs Set Test shortened to 15 seconds could be a good candidate for measuring cognitive change in a general population. More generally, the methodology used in this study offers a principled way to select the appropriate measures of cognition collected in a study, according to the nature of the target population and the objective of the study.