This study identified seven orthogonal factors that reflected a number of putative component ASD traits. These included verbal ability, language acquisition, semantic-pragmatic skills, social understanding, repetitive-stereotyped behaviour, articulation and social inhibition. All were related to ASD outcome.
We identified more factors than previous reports for several reasons. First, the large sample size of this study relative to previous investigations provided extra power to detect more minor factors. Second, this was a population-based cohort in which measures were collected at different points in development. This helped to identify minor factors, partly because the sample encompassed the full range of responses, unlike clinical samples, and partly because the use of repeated measures increased the proportion of variability in the data associated with such factors. Finally, we included a wide range of measures in this study. In contrast, some previous studies analysed only composite scores rather than individual measures, for instance the 12 subscales of the Autism Diagnostic Interview – Revised diagnostic instrument
[13],
[14]. This may have limited their scope to detect multi-factorial solutions. However, it is important to note that some differences are attributable to the method chosen to identify the number of factors. In this study, we found a wide range of possible solutions based upon different criteria but chose the seven-factor solution on the grounds of parsimony and interpretability. Other studies may also have identified a larger number of factors but chosen to retain fewer on the basis of a single criterion, such as variance explained before rotation
[11].
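To illustrate how the choice of retention criterion alone can change the number of factors, the following is a minimal sketch rather than the procedure used in this study. It assumes a hypothetical two-factor loading matrix (the values are illustrative and are not ALSPAC estimates) and compares the eigenvalue-greater-than-one rule with two arbitrary cumulative-variance thresholds applied to the unrotated eigenvalues.

```python
import numpy as np

# Hypothetical loadings for six measures on two orthogonal factors
# (illustrative values only, not estimates from this study).
loadings = np.array([
    [0.8, 0.0],
    [0.8, 0.0],
    [0.8, 0.0],
    [0.0, 0.5],
    [0.0, 0.5],
    [0.0, 0.5],
])

# Correlation matrix implied by the loadings: common part plus unique variances.
common = loadings @ loadings.T
R = common + np.diag(1.0 - np.diag(common))

# Eigenvalues of the unrotated solution, largest first.
eigvals = np.linalg.eigvalsh(R)[::-1]


def kaiser_count(eigvals):
    """Number of eigenvalues greater than 1 (eigenvalue-greater-than-one rule)."""
    return int(np.sum(eigvals > 1.0))


def variance_count(eigvals, threshold):
    """Smallest number of factors whose cumulative variance share reaches threshold."""
    cumulative = np.cumsum(eigvals) / np.sum(eigvals)
    return int(np.searchsorted(cumulative, threshold) + 1)


n_kaiser = kaiser_count(eigvals)
n_var_35 = variance_count(eigvals, 0.35)  # assumed lenient threshold
n_var_80 = variance_count(eigvals, 0.80)  # assumed strict threshold
print(n_kaiser, n_var_35, n_var_80)
```

With the same data, the three rules retain different numbers of factors, which is one reason why published factor counts may not be directly comparable across studies.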
The factors we identified showed some similarities to the factors reported in two previous studies
[15],
[43]. For instance, the identification of language milestones and the role of imaginative play has not been frequently reported but is consistent with Factor 2 in this study. However, both of these studies differentiated between different aspects of repetitive behaviour and restricted interests, a distinction not found in this study. This may reflect the fact that comparatively few measures of this latter type (e.g. insistence on sameness) were included in this study. The most consistent findings across studies concerned the identification of factors pertaining to social-communication and repetitive interests and behaviours
[9],
[12]–
[15],
[43],
[44]. This study also identified factors relating to these major domains of function, although our findings indicated that, within the main domains, there was evidence for further fractionation of the phenotype, with four factors relating to the communication domain, two to the social domain and one to the repetitive domain. Despite these overall consistencies, differences in the detailed factor structures from previous studies were observed
[9]. These differences might be attributed in part to the cross-sectional nature of the previous studies and the possibility that their data reflected transient states. Our longitudinal study was in a stronger methodological position to identify the more enduring traits, which might be expected to produce a more stable and reproducible factor structure.
All seven factors were independently associated with ASD diagnosis, and the combined factor score showed a high sensitivity to diagnostic status, reflecting the cumulative contribution of the individual factors to diagnosis. The individual factor scores did not predict ASD status as well as some of the individual measures. This may reflect the fact that the individual measures that best predicted an ASD diagnosis (e.g. the CCC scores) were often specifically developed to measure ASD traits. Moreover, some of these individual measures were collected after the child had been diagnosed with ASD, so they may have been subject to more reporting bias. The approach we have adopted here, relating factor scores and individual measures to ASD status, has the advantage of helping to identify those measures that may be most informative for future research from amongst the large number of putative traits available. This approach can help to circumvent the problems of multiple testing that arise when investigating aetiological determinants of the richly characterized and complex phenotypes observed in large data sets such as ALSPAC.
Previous research has suggested that different components of the ASD phenotype may have different aetiological origins
[8]. This study has shown that a number of traits, whether individual measures or measures derived from factor analysis, make independent contributions to the diagnosis of ASD, which adds support to this hypothesis. In practice, however, this may not be sufficient. Some have argued that such traits may be more closely associated with obtaining a diagnosis than with the underlying biological processes
[45]. As a further exploration of this issue, the associations of the identified factors and individual measures with four genetic correlates within the cadherin and contactin genes were examined. Different genetic variants were associated with different factors – in particular Factor 2 (Language acquisition), Factor 4 (Semantic-pragmatic skills), Factor 7 (Social inhibition) and the Factor mean score. The results partially replicate previous reports from studies of individuals with ASD, where associations were reported for age at first word and expressive language, but also extend their findings
[17],
[18]. While pleiotropic effects may contribute to some of the heterogeneity in the ASD phenotype
[46], as observed in this study for the contactin variants, the contrast in results with the cadherin variant favoured a broader phenotype with differentiable components and more complex aetiological origins.
A recent study related the same cadherin SNP with 29 measures encompassing language, communication, social interaction and behavioural traits
[47]. Consistent associations were observed, with only one measure showing an effect opposite to the expected direction. In contrast, we found that the best estimate of the effect size was in this unexpected direction for one of four individual measures and five of seven factors. While that study found a significant joint association even amongst those traits with weaker associations, our results, ignoring Factor 4 (semantic-pragmatic skills), are more consistent with a null association overall and may reinforce the conclusion that our identified traits, especially the factors, encompass greater heterogeneity. The strong association for Factor 4 is consistent with that study's report of an association with CCC – stereotyped conversation 9y.
It was notable that the analyses of measures taken at different points in development supported the notion that the phenotypic architecture of the broader autism phenotype unfolds and becomes more differentiated with development. The implication is that aetiological studies need to take these developmental changes into consideration and recognize that genetic and environmental influences may operate developmentally and may differ in importance at different ontogenetic stages.
This study has also shed light on some statistical issues. Some debate has occurred on whether oblique or orthogonal rotation should be used in factor analyses
[48]. While it is true that oblique rotations can produce orthogonal factors if appropriate to the data, it is clear from our study that relatively high correlations between oblique factors may result from relatively marginal changes to the factor structure. Our study also showed that an overall orthogonal association does not necessarily imply orthogonality at the worst extremes of the factor scores where pathology may be most evident. Overall, these findings may detract from the theoretical advantages of oblique rotation methods and favour orthogonal methods especially in population-based samples. It has also been suggested that the variance explained by the retained factors should usually be less than 100%
[49]. While some consider that the presence of negative eigenvalues implies that the positive eigenvalues are overestimated, so that retaining factors explaining even 100% of the variance would be an over-factorisation, others see the negative eigenvalues as a consequence of underestimating the communalities
[50]. It is difficult to generalise from our study, but the presence of a single factor explaining 108% of the variance found in one analysis suggests that underestimation of communalities should not be discounted.
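The link between underestimated communalities and a variance share above 100% can be sketched numerically. The example below is an illustration under stated assumptions, not a reconstruction of our analysis: it assumes a hypothetical one-factor population model, uses squared multiple correlations (SMCs) as the initial communality estimates, and expresses the first eigenvalue of the reduced correlation matrix as a share of the total estimated common variance. The loadings are invented for illustration.

```python
import numpy as np

# Hypothetical one-factor loadings for eight measures (illustrative only).
loadings = np.linspace(0.4, 0.8, 8)

# Population correlation matrix implied by the one-factor model.
R = np.outer(loadings, loadings)
np.fill_diagonal(R, 1.0)

# Assumed initial communality estimates: squared multiple correlations.
# Under this model they fall below the true communalities (loadings ** 2).
smc = 1.0 - 1.0 / np.diag(np.linalg.inv(R))

# Reduced correlation matrix: communality estimates replace the unit diagonal.
reduced = R.copy()
np.fill_diagonal(reduced, smc)

# Eigenvalues of the reduced matrix, largest first.
eigvals = np.linalg.eigvalsh(reduced)[::-1]

# Total estimated common variance is the trace, i.e. the sum of the SMCs.
total_common = eigvals.sum()

# Share of the common variance attributed to the first factor.
share_first = eigvals[0] / total_common

print("smallest eigenvalue:", eigvals[-1])
print("share explained by first factor:", share_first)
```

Because the underestimated diagonal forces some eigenvalues below zero while the trace stays fixed at the sum of the communality estimates, the first factor's share of that total exceeds 100% in this sketch, in line with the second interpretation above.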
This study has some potential limitations. The individual measures accessed from the ALSPAC database were in general not specifically designed to assess ASD. While this strategy of including questions for a range of health and developmental outcomes may have omitted some traits more specific to ASD, our results suggest a significant portion of the variability associated with ASD has been explained. Self-completed questionnaires were the major source of data with 88 of the 93 individual measures being obtained in this way. This contrasts with diagnostic tests, such as the Autism Diagnostic Observation Schedule – Generic or the Autism Diagnostic Interview – Revised, which require trained personnel. Despite this potential limitation, maternal reporting has been shown to have high sensitivity for detecting global developmental deficits
[51]. Finally, many of the standard measures were abbreviated for pragmatic reasons. While this raises concerns over their comparability with the full forms, such short forms have been shown to have acceptable reliability, e.g.
[52].
In summary, this study has identified seven factors reflecting aspects of communication encompassing early language development and later verbal ability, semantic-pragmatic skills, and articulation patterns; difficulties in social understanding and inhibition; and repetitive-stereotyped behaviour. Individual measures were also identified, some of which retained predictive power even in the presence of these factors.
We conclude that the evidence from these analyses lends support to the notion that the main traits associated with ASD both theoretically and empirically (social, communication and repetitive behaviours) need to be considered as potentially distinct components of the ASD phenotype, with their own as well as shared genetic and environmental determinants. Equally, it needs to be borne in mind that some of the traits identified here may not be core components of the ASD phenotype but nevertheless shape elements of the manifestations of the syndrome.