Animal studies elucidating the neurobiology of fragile X syndrome (FXS) have led to multiple controlled trials in humans, with the Aberrant Behavior Checklist-Community (ABC-C) commonly adopted as a primary outcome measure. A multi-site collaboration examined the psychometric properties of the ABC-C in 630 individuals (ages 3–25) with FXS using exploratory and confirmatory factor analysis. Results support a six-factor structure, with one factor unchanged (Inappropriate Speech), four modified (Irritability, Hyperactivity, Lethargy/Withdrawal, and Stereotypy), and a new Social Avoidance factor. A comparison with ABC-C data from individuals with general intellectual disability and a list of commonly endorsed items are also reported. Reformulated ABC-C scores based on this FXS-specific factor structure may provide added outcome measure specificity and sensitivity in FXS clinical trials.
Fragile X syndrome (FXS) is the leading known cause of both inherited intellectual disability (ID) and autism, and recent estimates of the frequency of the full-mutation are as high as one in every 2,500 births (Hagerman 2008). FXS is caused by the expansion of a trinucleotide repeat sequence, cytosine-guanine-guanine (CGG), in the promoter region of the Fragile X Mental Retardation 1 gene (FMR1) on the long arm of the X chromosome at Xq27.3. Expansions greater than 200 repeats lead to hypermethylation of the gene and absence or significant deficit of the fragile X mental retardation protein (FMRP), an mRNA binding protein that regulates (predominantly inhibiting) the translation of many neuronal messages and is necessary for normal brain development including dendritic arborization and synaptic plasticity (Comery et al. 1997; Devys et al. 1993; Galvez and Greenough 2005; Irwin et al. 2000, 2001, 2002; Jin and Warren 2003). Although ID and autism are the most well-known features of FXS, affected individuals experience a wide range of problems including anxiety, social avoidance, impulsivity and distractibility, hyperactivity, mood lability, repetitive behaviors, self-injury and aggression.
Recent groundbreaking advances from work done in animal models of FXS (fmr1 knockout mouse and dfmr Drosophila mutant) have demonstrated abnormalities in class I metabotropic glutamate (mGluR; Dölen and Bear 2008) and γ-aminobutyric acid (GABA; D’Hulst and Kooy 2007) receptor signaling, resulting in elevated levels of many key synaptic proteins including matrix metalloproteinase 9 (MMP9; Bilousova et al. 2009), amyloid precursor protein (APP), PSD95, and Arc in the absence of FMRP expression (for review see Berry-Kravis et al. 2011). In both the mouse and fly models, a number of critical cognitive, behavioral, epileptic, and morphologic (dendritic spine) phenotypes have been ‘‘rescued’’ or normalized with mGluR5 antagonists, GABA agonists, agents that suppress translational signaling (e.g., lithium), cholinergic agents, or minocycline, an MMP9 suppressor. These discoveries have paved the way for treatment aimed at the underlying neurobiology of the disorder in humans, including open-label trials of lithium (Berry-Kravis et al. 2008), minocycline (Paribello et al. 2010), and fenobam (Berry-Kravis et al. 2009), and controlled trials of mGluR5 negative modulators (RO4917523, Hoffman-LaRoche, clinicaltrials.gov; AFQ056, Novartis, Jacquemont et al. 2011), a GABA agonist (STX209, Seaside Therapeutics, Wang and Hagerman 2010) and the cholinesterase inhibitor, donepezil (NIMH, Kesler et al. 2009).
As a result of its track record in numerous clinical trials in idiopathic ID and autism and precedent as an endpoint for regulatory approval of indications for risperdal and aripiprazole for irritability in autistic spectrum disorders, the Aberrant Behavior Checklist-Community Edition (ABC-C; Aman et al. 1985) has been selected as a primary outcome measure to determine drug efficacy for most of the trials highlighted above. The ABC-C is widely known to have excellent psychometric properties, was initially developed for individuals with ID and demonstrates sensitivity in double-blind placebo controlled trials of risperidone to reduce problematic behaviors, mainly irritability and aggression in autism (Arnold et al. 2003; McCracken et al. 2002; Shea et al. 2004). Other applications have included measurement of disruptive behavior disorders in idiopathic ID (Aman et al. 2002; Van Bellinghen and De Troch 2001), and trials of amantadine and methylphenidate in children with autism and ADHD or irritable behavior (King and Wright 2001; Pearson and Santos 2003). Although the ABC-C holds considerable promise for use in FXS clinical trials, its psychometric properties in this population remain unknown, raising questions about its ability to detect clinically-significant improvement in the aforementioned targeted treatment trials. More generally, detailed examination of the psychometric properties of the ABC-C in this population will be invaluable for more accurate measurement of behavioral phenotype severity in clinical research studies and possibly for assessing and following patients clinically.
Following a recommendation from the Outcome Measures for Clinical Trials in Children with Fragile X Syndrome meeting hosted by the National Institute of Child Health and Human Development in November 2009, we initiated a multi-site collaboration to examine the psychometric properties of the ABC-C in a large population of children, adolescents, and young adults with FXS. The goal of the present study was to use exploratory and confirmatory factor analysis to provide evidence that would either support the validity of the currently used ABC-C factor structure or recommend the use of modified subscales based on an FXS-specific factor structure. Additionally, the availability of ABC-C data on a large representative sample of persons with FXS provides normative information that should be useful in clinical assessment or research studies.
The initial collection of archival data included a total of 1,018 completed ABC-C rating forms for individuals with FXS who participated in a variety of research studies at 5 fragile X treatment and research centers in the United States. Following IRB approval at each site, archived ABC-C, IQ score, FMR1 status, medical, and demographic information was de-identified and uploaded to the National Database for Autism Research (NDAR; http://ndar.nih.gov/ndarpublicweb/), a data repository supported by the National Institutes of Health. Use of NDAR allowed the research team to combine de-identified data across centers while preserving the ability to match possible cases seen at multiple centers. Diagnosis of FXS was determined through FMR1 DNA testing as part of the study protocol in which the individual had initially completed the ABC-C. In cases where caregivers had completed more than one ABC-C at a single site (N = 318) or across multiple sites (N = 16) for an individual, only the first occurrence was considered. Age range was restricted to 3–25 years and included forms completed by a parent or guardian only.
The final analytic sample included 630 children and young adults with FXS (459 males, mean age = 11.07 years, SD = 5.33; 171 females, mean age = 11.83 years, SD = 5.00). The ethnic distribution was 71.9% Caucasian, 15.2% Hispanic, 4.4% African American, 1.6% Asian, 1.3% Pacific Islander, 1.3% Multi-Ethnic, and 4.3% unknown/not reported. Five-hundred and twenty-six individuals had FMR1 full mutation alleles with complete methylation (83.5%), and 104 had either repeat size or methylation mosaicism (16.5%). An overview of demographic characteristics for the sample is shown in Table 1.
Close to half the total sample (N = 281) was taking a psychotropic medication at evaluation. Supplemental Table A provides a more detailed description of medication status for the entire sample and by gender. Intelligence testing was available for the majority of participants (N = 511) and included a variety of widely-accepted, validated, and normed instruments, including: McCarthy Scales of Children’s Abilities (.2%; McCarthy 1972), the Wechsler Intelligence Scale for Children (44.4%; Wechsler 2003), the Wechsler Preschool and Primary Scale of Intelligence (3.7%; Wechsler 2006a), the Wechsler Abbreviated Scale of Intelligence (13.7%; Wechsler 1999), the Wechsler Adult Intelligence Scale (8.5%; Wechsler 1997), the Wechsler Nonverbal Scale of Ability (.2%; Wechsler 2006b), the Stanford-Binet Intelligence Scales (19.2%; Roid 2003), the Bayley Scales of Infant Development (3.7%; Bayley 1993), the Mullen Scales of Early Learning (2.7%; Mullen 1995), the Leiter International Performance Scale (.8%; Roid and Miller 1997), the Kaufman Assessment Battery for Children (.4%; Kaufman 1983), and the Differential Abilities Scales (2.3%; Elliott 1997).
Prior to factor analysis of the ABC-C, the sample was divided to produce two stratified random samples: (a) the derivation sample, used to examine the factor structure using exploratory factor analysis (EFA); and (b) a cross-validation sample to replicate or confirm findings from the EFA using confirmatory factor analysis (CFA). Table 1 also contains the results from sub-sample comparisons performed for age, gender, FMR1 status (full mutation or mosaicism), IQ, and initial data collection site.
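The split into derivation and cross-validation samples can be illustrated with a minimal Python sketch. It is not the authors' procedure: the stratification variable (gender), the `stratified_split` helper, the seed, and the rule for assigning the odd case in a stratum are all assumptions made for illustration.

```python
import random
from collections import defaultdict

def stratified_split(records, strata_key, seed=0):
    """Randomly split records into two halves while balancing each stratum.

    strata_key maps a record to its stratum label (hypothetical: gender).
    When a stratum has an odd size, the extra record alternates between
    the two halves so the overall sizes stay as equal as possible.
    """
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for rec in records:
        by_stratum[strata_key(rec)].append(rec)

    derivation, cross_validation = [], []
    extra_to_derivation = True
    for stratum in sorted(by_stratum):
        members = list(by_stratum[stratum])
        rng.shuffle(members)
        cut, remainder = divmod(len(members), 2)
        if remainder:
            if extra_to_derivation:
                cut += 1
            extra_to_derivation = not extra_to_derivation
        derivation.extend(members[:cut])
        cross_validation.extend(members[cut:])
    return derivation, cross_validation

# Synthetic placeholder records with the gender counts reported for the sample.
records = [("M", i) for i in range(459)] + [("F", i) for i in range(171)]
derivation, cross_validation = stratified_split(records, lambda r: r[0])
```

With the reported counts of 459 males and 171 females, this sketch yields two halves of 315 records each, which matches the cross-validation sample size reported in the text.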
The ABC-C is a 58-item rating scale used to assess maladaptive behaviors across five original dimensions or subscales: Irritability, Hyperactivity, Lethargy/Withdrawal, Stereotypy, and Inappropriate Speech. Items are evaluated on a four-point Likert scale ranging from 0 (not at all a problem) to 3 (the problem is severe in degree). The first Aberrant Behavior Checklist (ABC-Residential) was developed to measure treatment efficacy among individuals with developmental disabilities living in residential facilities (Aman et al. 1985). The five subscales were empirically derived using factor analysis in a sample of 927 residents of institutions or homes serving individuals with developmental disabilities (65% male, average age = 25.9 years). This original version was later modified and items specific to an institutional setting were revised to apply to a community setting. The factor structure and strong psychometric properties of the ABC-R are preserved in the ABC-C and are robust for both genders and across various ages (Aman et al. 1995; Brown et al. 2002; Marshburn and Aman 1992; Ono 1996).
Typically, exploratory factor analysis is employed when there is weak empirical and theoretical support for a scale’s underlying factor structure. Although various studies have verified the factor structure of the ABC-C in samples with developmental disabilities, the ABC-C factor structure has not been previously examined in the FXS population. We utilized exploratory factor analytic procedures with the derivation sample to examine the dimensionality of the ABC-C in FXS, guide the creation of item parcels (aggregate scores from item-level responses), and explore the possibility of changes in the factor structure and item loadings. Dependent on the results of the EFA, confirmatory factor analysis using parcels with the cross-validation sample allowed for either verification of the original ABC-C factor structure or validation of any changes that might result from the analysis with the derivation sample.
There are several advantages to using parcels in this analysis. Parcels improve the ratio of sample size to the number of estimated parameters, are a stronger and more stable representation of underlying constructs compared to items alone, and reduce the influence of shared unique variance between items from overlapping wording or method effects (Hau and Marsh 2004; Little et al. 2002). Additionally, as is often the case with rating scales used to measure extreme behaviors, the distributions of item-level responses are unlikely to meet the assumptions of normality; however, parcels are more likely to meet these assumptions (Hau and Marsh 2004; Little et al. 2002).
Based on recommended procedures for parcel formation (Kishton and Widaman 1994; Little et al. 2002), we began by examining the dimensionality of the items on the ABC-C with exploratory factor analysis in the R program package (R Development Core Team 2010) using ordinary least squares (OLS) estimation, an iterative method of estimation that is robust against issues of non-normality and categorical items (Norris and Lecavalier 2010). Previous research has demonstrated the underlying constructs of the ABC-C are correlated (Freund and Reiss 1991); therefore, oblique rotation was utilized.
We chose to use several criteria in determining the optimal number of factors to retain. These included two graphical procedures, the scree test (Cattell 1966) and an augmented scree test called parallel analysis (Humphreys and Montanelli 1975), as well as various measures of model fit, in conjunction with considering the conceptual and clinical interpretability of the resulting factors. The scree test is a frequently used method, in which one examines a plot of the eigenvalues from the reduced correlation matrix. The last number of factors before ‘‘the elbow’’ in this generally L-shaped plot is retained. Parallel analysis compares the scree plot produced by the sample with one created from random data. Factors are retained as long as they explain more variance (higher eigenvalues) in the sample data than those resulting from random data (Humphreys and Montanelli 1975). Although the scree test can be somewhat subjective, it is a commonly applied method and is one of the best methods for determining the number of factors to retain, especially when utilized in conjunction with other methods (Floyd and Widaman 1995), such as the less frequently utilized parallel analysis approach and goodness-of-fit criteria.
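The logic of parallel analysis can be sketched in a few lines of Python with NumPy. This is a simplified illustration rather than the analysis reported here: it uses eigenvalues of the full correlation matrix instead of the reduced correlation matrix, compares against the mean eigenvalues of simulated normal data, and the function name `parallel_analysis` and its parameters are assumptions.

```python
import numpy as np

def parallel_analysis(data, n_sims=100, seed=0):
    """Count factors whose eigenvalues exceed those of random data.

    data: array of shape (n_respondents, n_variables).
    Returns (n_factors, observed_eigenvalues, mean_random_eigenvalues).
    """
    rng = np.random.default_rng(seed)
    n, p = data.shape

    # Eigenvalues of the observed correlation matrix, largest first.
    observed = np.sort(np.linalg.eigvalsh(np.corrcoef(data, rowvar=False)))[::-1]

    # Eigenvalues from uncorrelated random data of the same shape.
    simulated = np.empty((n_sims, p))
    for s in range(n_sims):
        noise = rng.standard_normal((n, p))
        eigs = np.linalg.eigvalsh(np.corrcoef(noise, rowvar=False))
        simulated[s] = np.sort(eigs)[::-1]
    benchmark = simulated.mean(axis=0)

    # Retain leading factors only while they beat the random benchmark.
    exceeds = observed > benchmark
    n_factors = p if exceeds.all() else int(np.argmin(exceeds))
    return n_factors, observed, benchmark
```

Plotting `observed` and `benchmark` against factor number gives the two overlaid scree curves; the retained factors are those before the curves cross. A percentile of the simulated eigenvalues is a common, more conservative alternative to the mean.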
The Standardized Root Mean Square Residual (SRMR) represents the average amount of unexplained covariation remaining when fitting a factor model to the sample correlation matrix, and values around .08 or less indicate good fit (Hu and Bentler 1999). The Root Mean Square Error of Approximation (RMSEA) indicates the discrepancy between the model produced estimates for the population and those observed in the sample data, while also considering the complexity of the proposed model (Steiger and Lind 1980). Suggested guidelines for the RMSEA are: values less than .05 indicate close fit, between .05 and .08 signify reasonable model fit, between .08 and .10 marginal fit, and values greater than .10 are said to indicate unacceptable fit (Browne and Cudeck 1992). The Tucker-Lewis Index (TLI; Tucker and Lewis 1973) reflects the proportion of explainable covariance accounted for by a factor model. A TLI value of .95 or greater signifies good fit (Hu and Bentler 1999). Finally, the Bayesian Information Criterion (BIC; Schwarz 1978) is a measure of predictive fit, and does not fall on a standard scale. The BIC can be used to compare non-nested models, and smaller BIC values suggest better fit. It is also possible to use the BIC as a form of regularization, by testing various models with increasing numbers of factors and establishing the point at which the BIC no longer decreases (or even increases) with additional factors, as an indicator of the number of factors to retain.
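As an illustration of how the RMSEA guidelines are applied, the following Python sketch computes a common point estimate of the RMSEA from a model chi-square and maps it onto the Browne and Cudeck (1992) categories. The formula with N - 1 in the denominator is one widely used convention (some software uses N), the function names are hypothetical, and the input values in the usage lines are invented for illustration only.

```python
import math

def rmsea(chi_square, df, n):
    """Point estimate of the RMSEA: sqrt(max(chi2 - df, 0) / (df * (n - 1)))."""
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))

def describe_rmsea(value):
    """Label an RMSEA value using the Browne and Cudeck (1992) guidelines."""
    if value < 0.05:
        return "close fit"
    if value <= 0.08:
        return "reasonable fit"
    if value <= 0.10:
        return "marginal fit"
    return "unacceptable fit"

# Hypothetical model: chi-square of 150 on 100 degrees of freedom, N = 301.
value = rmsea(150.0, 100, 301)
label = describe_rmsea(value)
```

A chi-square at or below its degrees of freedom yields an RMSEA of zero, which is why the index rewards parsimonious models that fit nearly as well as more complex ones.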
The reliability of the factor structure resulting from the EFA was tested in a series of confirmatory factor analyses with data from the cross-validation sample. As mentioned previously, items on rating scales capturing maladaptive behaviors commonly reveal non-normally distributed data and previous studies utilizing the ABC-C have reported finding skewed distributions (Hassiotis et al. 2009; Rojahn et al. 2003). Although the use of parcels is an improvement over items, the multivariate normality assumption for Maximum Likelihood CFA was not met in this sample. Mardia’s test for multivariate normality (Mardia 1970) revealed statistically significant multivariate skewness (Mardia’s coefficient = 43.58; p < .000) and kurtosis (Mardia’s coefficient = 368.60; p < .000). Maximum likelihood estimation can produce distorted results when the multivariate distribution is significantly different from normal. However, the Satorra-Bentler Chi-square statistic (SBχ2; Satorra and Bentler 1994), a rescaled Chi-square statistic that corrects for bias introduced by the amount of kurtosis in the data, has shown no evidence of bias in Monte Carlo simulation studies of non-normally distributed data with sample sizes of 200 or more (Chou et al. 1991; Curran et al. 1996). Accordingly, the SBχ2 was used as part of the lavaan package (Rosseel 2010) in R to test a series of confirmatory factor analyses with Robust Maximum Likelihood (MLM) estimation.
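Mardia's multivariate skewness and kurtosis coefficients can be computed directly from their definitions, as in the Python sketch below. It returns the raw coefficients only; the source does not state whether the reported values are raw or normalized statistics, so this sketch is not expected to reproduce them. The function name is an assumption.

```python
import numpy as np

def mardia_coefficients(data):
    """Return Mardia's (skewness, kurtosis) coefficients for an n x p array.

    Under multivariate normality, skewness is near 0 and kurtosis is
    near p * (p + 2).
    """
    x = np.asarray(data, dtype=float)
    n, p = x.shape
    centered = x - x.mean(axis=0)
    # Maximum likelihood (divide-by-n) covariance, as in Mardia's definition.
    cov = centered.T @ centered / n
    # d[i, j] is the Mahalanobis-type cross product between cases i and j.
    d = centered @ np.linalg.inv(cov) @ centered.T
    skewness = (d ** 3).sum() / n ** 2
    kurtosis = (np.diag(d) ** 2).sum() / n
    return skewness, kurtosis
```

Significance tests follow from these coefficients: n times the skewness divided by 6 is referred to a chi-square distribution with p(p + 1)(p + 2)/6 degrees of freedom, and the kurtosis is compared with p(p + 2) using a standard error of the square root of 8p(p + 2)/n.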
Multiple methods were implemented for evaluating model fit. In addition to the RMSEA, the SRMR, the TLI, and the BIC we examined the SBχ2 and the test of close fit. The SBχ2 is a goodness-of-fit statistic indicating the ability of the hypothesized model to reproduce the sample correlation matrix. When the SBχ2 is statistically significant, a substantial amount of residual covariation among indicators remains to be explained, perhaps by adding an additional factor. The test of close fit was developed by Browne and Cudeck (1992) to address the overly stringent and sample-sensitive nature of χ2, and is based on the probability of the RMSEA value being less than or equal to .05.
Given the previously demonstrated gender differences in FMR1 protein expression and greater variation in behaviors and intellectual functioning, the best fitting model was tested with males and females separately in the cross-validation sample. We also examined the possible effects of psychotropic medication and age on factor structure. Previous research has shown that changes in the trajectory of cognitive and behavioral functioning begin to emerge between the ages of 11–15 years (Dykens et al. 1989; Hagerman et al. 1994; Hodapp et al. 1990). Considering these changes in addition to significant fluctuations in brain development around this age (Eliez et al. 2001; Giedd et al. 1999; Giedd 2004), the cross-validation sample was divided into two age groups: (a) Children, aged 3–10 years (N = 158), and (b) Adolescents/Young Adults, 11–25 years old (N = 157), and the best fitting model was tested in each group separately. The commonly discussed guideline for sample size is a ratio of 5–10 individuals per estimated parameter (Floyd and Widaman 1995). The sample size to parameter ratio is significantly reduced in a few of these comparisons, specifically the analyses with females only, medication status, and age differences. Accordingly, caution is advised when interpreting these results, though there is support for the scaled Satorra-Bentler Chi-square as robust in small samples (Nevitt and Hancock 2004).
Lastly, in an effort to better understand the behavioral similarities and differences in FXS when compared to the broader population of individuals with intellectual disability, ABC-C scores in the FXS sample were compared with caregiver-reported ABC-C data from a reference group of 601 children and adolescents with ID from a previous study which assessed the factor structure and reported normative data in a community sample of individuals in special education (Brown et al. 2002). Kruskal–Wallis tests stratified by gender and age were used to compare the original 5 subscales from the ABC-C between the sample with FXS and the reference group. The two samples were divided into seven age groups for both genders: 6–7, 8–9, 10–11, 12–13, 14–15, 16–17, and 18 years and older. Follow-up tests were conducted to evaluate pairwise differences between the two samples by gender and age for a total of 14 comparisons using Mann–Whitney tests and controlling for the false discovery rate with the Benjamini-Hochberg method (Benjamini and Hochberg 1995).
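The Benjamini-Hochberg step-up procedure used to control the false discovery rate across the pairwise comparisons can be sketched as follows. This is a generic Python illustration, not the authors' code; the function name and the p-values in the usage lines are made up.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans marking which hypotheses are rejected.

    Sort the m p-values, find the largest rank k such that
    p_(k) <= (k / m) * alpha, and reject the k smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            largest_k = rank
    rejected = [False] * m
    for idx in order[:largest_k]:
        rejected[idx] = True
    return rejected

# Hypothetical p-values from eight pairwise comparisons.
example = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
flags = benjamini_hochberg(example)
```

In this invented example only the two smallest p-values survive correction, even though five fall below .05 before adjustment, which mirrors how several uncorrected group differences in a family of 14 comparisons can fail to remain significant.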
Initial examination of factor structure revealed a factor made solely of three items reflecting self-injurious behaviors (SIB) from the Irritability subscale (items 2, 50, 52). This factor appears to represent what Cattell (1961) called a ‘‘bloated specific,’’ or very narrow dimension, arising from items with extreme item overlap possibly due to similar item phrasing. Although these items resulted in a seemingly reliable single factor in the initial analysis, its specificity was not consistent with the broader focus of the original five. To correct for this, an SIB parcel was created by summing these three items, and the EFA was run a second time with the SIB parcel in their place.
Based on the scree plot and parallel analysis, we examined the dimensionality of the ABC-C using 5-, 6-, and 7-factor solutions. The seven-factor solution contained one factor with only three items (8, 19 and 41) which appeared to be another ‘‘bloated specific’’ related to disproportionate overlap in item wording (these items all contained ‘‘screams/yells inappropriately’’) and was not utilized. The five- and six-factor solutions appeared to be similar to one another with the exception of an additional new factor containing items 5, 16, 30, and 42, which appears to assess social avoidance. Table 2 contains the item-level factor loadings for the six-factor solution. Underlined factor loadings represent items that shifted to a different or new factor compared to the original ABC-C.
From the six-factor solution, Factors I, V and VI appear to correspond, respectively, to the Irritability, Stereotypy, and Inappropriate Speech subscales from the original ABC-C factor structure and together contain 92% of their original items. However, Factor I, ‘‘Irritability,’’ contained, in addition to its original items, items 7, 18, 21, 24 from the original Hyperactivity subscale. The majority of the remaining items from the Hyperactivity subscale loaded on Factor II, with the exception of 3 items that appeared to reflect inattention (28, 51, 56) loading on Factor III. Additionally, Factor III contained most of the items from the Lethargy/Withdrawal subscale and item 25 from the Irritability subscale. In all, the third factor seems to be an indicator of both lethargy and behaviors reflecting a lack of awareness or responsiveness in social situations. Factor IV contained 4 items (5, 16, 30, 42) that were originally on the Lethargy/Withdrawal subscale and appear to reflect social avoidance. Items 3, 26 and 27, had weak loadings of only .32, .29, and .30, respectively, and were endorsed by less than a quarter of the total sample. Consequently, these items were dropped from subsequent analyses.
Results from the item level EFA revealed two general discrepancies in the original ABC-C factor structure: (a) ten items had shifted to a new factor or had high dual factor loadings, and (b) four items appeared to represent a new sixth factor. To address these issues, a second EFA with parcels was implemented. Items that remained stable on their original subscales were used to create two or three parcels for each factor with 2–5 items per parcel. To account for the multidimensional nature of the ABC-C, a domain-representative approach to the formation of parcels was employed (Little et al. 2002). Items representing related dimensions of the same factor were equally distributed across the parcels signifying that construct, consequently allowing each parcel to be representative of both the common and unique facets of the dimensions. Items with dual loadings (items 31 and 34) or items loading on a different factor were included individually in a final EFA using the same methods from the item-level analysis (OLS estimation, promax rotation). This approach allowed the original subscales to have a stronger more stable representation in the analyses and was a more stringent test of the changes observed in the item level analysis. We were able to observe whether these items had shifted because of idiosyncrasies/correlated unique variance associated with the item or if they now truly represented an altered latent construct in our sample.
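The domain-representative approach to parcel formation can be illustrated with a small Python sketch: items belonging to each facet of a factor are dealt out in turn across the parcels, and a parcel score is the sum of its item ratings. The facet labels, item names, and helper functions below are hypothetical and do not correspond to the actual ABC-C parcel assignments.

```python
def build_parcels(items_by_facet, n_parcels):
    """Distribute items round-robin so each parcel samples every facet."""
    parcels = [[] for _ in range(n_parcels)]
    slot = 0
    for facet in items_by_facet:
        for item in items_by_facet[facet]:
            parcels[slot % n_parcels].append(item)
            slot += 1
    return parcels

def parcel_scores(ratings, parcels):
    """Sum the 0-3 item ratings within each parcel for one respondent."""
    return [sum(ratings[item] for item in parcel) for parcel in parcels]

# Hypothetical factor with two facets of three items each.
facets = {
    "facet_a": ["a1", "a2", "a3"],
    "facet_b": ["b1", "b2", "b3"],
}
parcels = build_parcels(facets, n_parcels=3)

# Hypothetical ratings for one respondent on the 0-3 scale.
ratings = {"a1": 0, "a2": 1, "a3": 3, "b1": 2, "b2": 2, "b3": 0}
scores = parcel_scores(ratings, parcels)
```

Each of the three parcels here contains one item from each facet, so every parcel reflects both the common factor and the facet-specific variance, which is the property the domain-representative approach is intended to provide.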
We reviewed the five-, six- and seven-factor solutions. If the new sixth factor, Social Avoidance, was truly part of the original Lethargy/Withdrawal subscale, we would expect these items to load with the Lethargy parcels in the five-factor solution. However, in the five-factor solution, the Inappropriate Speech parcels loaded with the Hyperactivity parcels, and the Social Avoidance items remained their own factor. In the seven-factor solution, only two items (25 and 34) loaded on the seventh factor. Examination of the six-factor solution provided a clear, theoretically meaningful, and clinically interpretable factor solution. Additionally, the scree plot and parallel analysis, as well as all fit statistics, reported in Table 3, empirically support the six-factor solution. Items 28, 31, and 34 subsequently returned to their original subscale. However, the remaining items continued to load on a different factor, or in the case of the socially avoidant items, created their own new factor, illustrated in Supplemental Table B. The parcel-item EFA further supported our previous EFA results. The remaining individual items were distributed amongst the corresponding parcels: items 7, 18, 21, 24, and 34 with the Irritability parcels, items 28 and 31 with the Hyperactivity parcels, and items 25, 51, and 56 with Lethargy parcels. Finally, items 5, 16, 30, and 42 were used to create two Social Avoidance parcels for the CFA with the cross-validation sample.
Table 4 summarizes the indices of fit for the CFA with the updated domain representative parcels for one-, five-, and six-factor solutions. Social Avoidance parcels loaded on the Lethargy/Withdrawal Factor in the five-factor solution and defined their own factor in the six-factor solution, illustrated in Fig. 1. The RMSEA for the five-factor solution was greater than .10, indicating unacceptable fit for this model, and did not pass the test of close fit. Furthermore, the TLI was only .88 (not close to the suggested value for good fit, .95), though a value of .07 for the SRMR falls within the range of acceptable fit. For the six-factor model, the SRMR indicated a large decrease in average standardized residuals and thus a better fit. Although the SBχ2 (89, N = 315) = 145.62, p < .001, which tends to be positively biased, was significant for the six-factor solution, suggesting a lack of fit of the model to the data, the remaining fit indices suggested a close fit between the six-factor model and the sample data. Specifically, both the RMSEA and TLI are well within the range for good fit, and this model passed the test of close fit. Furthermore, the SB-corrected change in Chi-square indicated a significantly better fit (SB scaled difference in χ2 (5) = 199.67, p < .001), from the five-factor to the six-factor model.
Inspection of the modification indices suggested significant improvement of the Chi-square statistic when allowing the unique factors for the Stereotypy 01 and Stereotypy 03 parcels to covary. Examination of the items in these two parcels revealed excessive overlap in items related to repetitive movements (items 11 and 35) and shaking/rocking (items 45 and 49) between these two, compared to the third Stereotypy parcel. As is commonly recommended, this modification was implemented after consideration of this additional theoretical support (Brown 2006). Freeing this pathway provided a significant improvement in model fit (SB scaled difference in χ2 (1) = 15.04, p< .001).
The newly derived subscales demonstrated good internal consistency, with alpha coefficients of .94 for Irritability, .92 for Hyperactivity, .86 for Socially Unresponsive/Lethargic, .92 for Social Avoidance, .87 for Stereotypy, and .80 for Inappropriate Speech. The alpha coefficient is the most frequently reported measure of internal consistency; however, it may be more informative to consider coefficient omega (Zinbarg et al. 2005). In a scale like the ABC-C, where the factors tend to reflect more multidimensional constructs, this coefficient may offer a clearer understanding of the influence a common factor has on a subscale. The six subscale scores had omega coefficients of .96 for Irritability, .94 for Hyperactivity, .87 for Socially Unresponsive/Lethargic, .93 for Social Avoidance, .89 for Stereotypy, .88 for Inappropriate Speech, and .83 for the whole scale.
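For readers less familiar with the alpha coefficient, the following Python sketch shows the standard computation from item-level ratings: alpha = k/(k - 1) × (1 - sum of item variances / variance of the total score), where k is the number of items. The function name and the small data set are invented for illustration; coefficient omega requires a fitted factor model and is not shown.

```python
from statistics import variance

def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of k ratings."""
    k = len(item_scores[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in item_scores]) for i in range(k)]
    # Sample variance of the summed subscale score.
    total_var = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical ratings (0-3) from five respondents on a four-item subscale.
ratings = [
    [0, 1, 0, 1],
    [1, 1, 2, 1],
    [2, 3, 2, 2],
    [3, 3, 3, 2],
    [0, 0, 1, 0],
]
alpha = cronbach_alpha(ratings)
```

Alpha approaches 1 as the items covary more strongly relative to their individual variances, and it equals exactly 1 when every item gives identical ratings.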
After determining the superior fit and strong internal consistency of the six-factor model, we then tested the model separately in males, females, individuals with and without psychotropic medication use, and by age. The indices of fit for these analyses are also listed in Table 4. The 90% CI for the RMSEA for all groups spans a larger range, reflecting the influence of the smaller sample sizes and less precise estimation in these groups, but the RMSEA for all groups was about .05, and all six-factor models passed the test for close fit. Furthermore, both the value of the TLI and SRMR indicate good fit for all groups. In summary, though the small sample sizes in these groups may make these results less reliable, they provide support for the six-factor structure in various homogeneous samples of individuals diagnosed with FXS. Table 5 lists the mean and standard deviation of the original and new subscale scores by age and gender. Finally, Supplemental Table C lists the percentage of participants whose caregivers endorsed each item on the ABC-C for the total sample, by gender, and by gender and age.
Kruskal–Wallis tests, stratified by gender and age, comparing the original 5 subscales from the ABC-C between the sample with FXS and the reference group (differences indicated in Table 5) revealed significant overall group differences on four of the five subscales: Lethargy/Withdrawal, χ2 (1, N = 1118) = 11.27, p < .001; Stereotypy, χ2 (1, N = 1118) = 55.56, p < .001; Hyperactivity, χ2 (1, N = 1118) = 13.38, p < .001; and Inappropriate Speech, χ2 (1, N = 1118) = 111.57, p < .001. Results from pairwise comparisons using Mann–Whitney tests indicated significant differences between the sample with FXS and the reference group on the Irritability subscale for males 12–13 years old; the Lethargy/Withdrawal subscale for males 10–13 and 16–17 years; the Stereotypy subscale for males 8–18 years and females 10–11 years; the Hyperactivity subscale for males 6–9, 12–13, and 18–25 years and females 10–11 years; and the Inappropriate Speech subscale for males 6–25 years and females 10–11 years old. However, after correcting for multiple comparisons with the Benjamini-Hochberg method, only males 8–15 and 18–25 years on the Stereotypy subscale, males 12–13 years on the Hyperactivity subscale, and all age groups for males and females 10–11 years old on the Inappropriate Speech subscale continued to demonstrate a significant difference. The sample with FXS had a greater proportion of individuals with elevated subscale scores compared to the reference group in all comparisons where a significant difference was discovered.
The results of the present study provide support for a modified factor structure of the Aberrant Behavior Checklist-Community Edition among individuals with FXS. These findings appear to have immediate implications for the interpretation of outcome data in several ongoing and planned FXS treatment trials, and may be useful in the clinical characterization of patients in research studies and clinical settings. It is possible that the modified subscales derived from the present study may be more representative of some aspects of the fragile X phenotype and therefore could be more sensitive to interventions aiming to normalize the neurobiology of the disorder.
In general, the Inappropriate Speech and Stereotypy factors derived in this sample are consistent with these subscales from the original ABC-C and were not modified significantly. However, we found substantial changes in the Irritability, Hyperactivity, and Lethargy/Withdrawal subscales, as well as the emergence of a new factor. Specifically, the Hyperactivity subscale was reduced from sixteen to nine items. This multidimensional subscale originally contained items relating to excessive activity, impulsivity, disruptive behaviors, inattentiveness, and distractibility. For the current sample, the revised Hyperactivity subscale is more explicitly related to elevated activity. Items relating to disruptive behaviors shifted to the revised Irritability subscale. Because hyperactive and disinhibited behaviors are considered ‘‘core’’ aspects of the FXS phenotype (Farzin et al. 2006; Lachiewicz and Dawson 2005; Menon et al. 2004; Munir et al. 2000; Sullivan et al. 2006), this subscale alteration appears valid and may provide added measurement specificity for this important dimension.
The modified Irritability subscale contains items from its original derivation (with the exception of item 25, depressed mood) and the disruptive behavioral items from the original Hyperactivity subscale. Previous studies on other groups with neurodevelopmental disorders have reported similar findings, with these two subscales collapsing into one disruptive behavior factor (Brinkley et al. 2007; Marshburn and Aman 1992; Newton and Sturmey 1988). In the initial stages of our analyses, items reflecting self-injurious behaviors (SIB) emerged as their own factor; however, after parceling these items together, they proceeded to load on their original Irritability factor. Examination of the ABC-C factor structure in a sample of individuals with autism spectrum disorder (ASD) revealed similar results, with one factor consisting solely of the three SIB items (Brinkley et al. 2007). When researchers further examined the SIB items, they discovered differing factor structures dependent on the level of SIB reported. Specifically, the original factor structure was not validated in the high self-injurious subgroup. Previous research on the presentation of SIB among individuals with FXS has led researchers to propose that differing motivations underlie this behavior, specifically that these behaviors are distinct from irritable or aggressive behaviors, instead having a strong social component related to difficulties in arousal regulation (Hall et al. 2008; Hessl et al. 2008; Symons et al. 2003). SIB is an important aspect of the FXS phenotype and warrants broader and more diverse assessment than is provided by the three ABC-C items.
Perhaps the most significant modification in the ABC-C factor structure in this sample is the emergence of a sixth factor, referred to as Social Avoidance. This subscale includes items such as “withdrawn, prefers solitary activities” and “seeks isolation from others”, which were originally part of the Lethargy/Withdrawal subscale. Although this scale contains just four items, it appears to capture core aspects of the FXS phenotype related to gaze avoidance, social “escape” behaviors, and social anxiety (Budimirovic et al. 2006; Cordeiro et al. 2010; Farzin et al. 2009, 2011; Garrett et al. 2004; Hall et al. 2006; Hessl et al. 2006; Watson et al. 2008).
The revised Socially Unresponsive/Lethargic subscale comprises the remaining items of the original Lethargy/Withdrawal subscale, including features associated with lack of social awareness and response, along with items relating to inattention from the original Hyperactivity subscale. The distinction between lack of social responsiveness and social avoidance is not entirely unexpected. Marshburn and Aman (1992) found a similar structure when they examined a 6-factor solution of the ABC-C in a community sample of children with ID. The separation of these socially related items into two separate domains may suggest a feature that is uniquely applicable to FXS. Previous research comparing the socially related behavior profiles of boys with FXS and boys with FXS and comorbid ASD reported group distinctions based on scores from the ABC-C (Kau et al. 2004; Kaufmann et al. 2004). Based on previous research and clinical experience, Budimirovic et al. (2006) sorted the original Lethargy/Withdrawal subscale items into groups that reflected active social avoidance and/or social indifference, with item assignment very closely mirroring results in our analyses. Specifically, items 5 and 30 were considered to represent mainly social avoidance, item 40 to represent social indifference, and items 16 and 42 a combination of social avoidance and social indifference. Furthermore, scores on these five items (5, 16, 30, 40, 42) identified clear differences between the two FXS groups (with and without comorbid ASD), and were predictors of ASD diagnosis, whereas the remaining items from the ABC-C Lethargy/Withdrawal subscale did not.
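The item groupings attributed above to Budimirovic et al. (2006) can be written down as a small lookup, sketched here for illustration. The item numbers come from the text, and each ABC-C item is rated from 0 to 3. Combining the items by simple summation, the function names, and the example ratings are assumptions; the text does not specify how the five-item scores were computed.

```python
# Item groupings as described in the text (Budimirovic et al. 2006).
SOCIAL_AVOIDANCE_ITEMS = (5, 30)      # mainly social avoidance
SOCIAL_INDIFFERENCE_ITEMS = (40,)     # social indifference
MIXED_ITEMS = (16, 42)                # combination of both


def sum_items(ratings, items):
    """Sum the ABC-C ratings (each 0-3) for the given item numbers.

    ratings maps item number -> rating. Summation is an assumed scoring
    rule used here only for illustration.
    """
    total = 0
    for item in items:
        value = ratings[item]
        if value not in (0, 1, 2, 3):
            raise ValueError(f"item {item}: rating must be 0, 1, 2, or 3")
        total += value
    return total


def five_item_score(ratings):
    """Sum of items 5, 16, 30, 40, and 42 (hypothetical composite)."""
    all_items = SOCIAL_AVOIDANCE_ITEMS + SOCIAL_INDIFFERENCE_ITEMS + MIXED_ITEMS
    return sum_items(ratings, all_items)


# Hypothetical ratings for one participant.
example_ratings = {5: 3, 16: 1, 30: 2, 40: 0, 42: 1}
example_avoidance = sum_items(example_ratings, SOCIAL_AVOIDANCE_ITEMS)
example_total = five_item_score(example_ratings)
```

Keeping the avoidance and indifference items in separate groups makes it straightforward to examine each dimension on its own as well as the five-item composite.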
Comparisons with the reference sample from Brown et al. (2002) revealed that a significant portion of individuals with FXS demonstrated higher levels of stereotypic behaviors and inappropriate speech. These findings support previous research in both behavioral domains and reflect features commonly found in ASD and FXS. Studies examining language development have reported that individuals with FXS produce more perseverative or self-repetitive speech, especially males, than developmental level-matched individuals with Down syndrome or ASD (Ferrier et al. 1991; Sudhalter et al. 1990) and idiopathic ID (Sudhalter and Belser 2001).
Although we did not explicitly measure test–retest reliability as part of the current study, prior work has reported good to excellent stability of the ABC-C in FXS. Berry-Kravis et al. (2006) used the ABC-C as a secondary outcome measure to assess the efficacy of the ampakine compound CX516 in a 4-week randomized, double-blind, placebo-controlled clinical trial in adults diagnosed with FXS. They also used data from the ABC-C completed by the placebo group (N = 25) to test the stability of the five original scales between baseline and 5 weeks later. All subscales had good test–retest reliability (ICC = .8–.9), with the exception of the Inappropriate Speech subscale, which demonstrated moderate reliability (ICC = .6).
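For readers less familiar with the intraclass correlation coefficient (ICC) as a test–retest statistic, the sketch below computes one common form, ICC(3,1) (two-way mixed effects, single measure, consistency), from subscale scores at two occasions. The choice of ICC(3,1) is an assumption, since the text above does not state which form Berry-Kravis et al. (2006) used, and the function name and example scores are hypothetical.

```python
def icc_consistency(scores):
    """ICC(3,1): two-way mixed effects, single measure, consistency.

    scores is a list of rows, one per participant; each row holds that
    participant's subscale score at each occasion (e.g. baseline, retest).
    Undefined (division by zero) if all scores are identical.
    """
    n = len(scores)        # participants
    k = len(scores[0])     # occasions
    if n < 2 or k < 2:
        raise ValueError("need at least 2 participants and 2 occasions")

    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares without replication.
    ss_rows = k * sum((rm - grand) ** 2 for rm in row_means)
    ss_cols = n * sum((cm - grand) ** 2 for cm in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)


# Hypothetical baseline and retest scores for three participants whose
# scores all rise by one point, preserving their rank order exactly.
example_icc = icc_consistency([[1, 2], [3, 4], [5, 6]])
```

Because the consistency form ignores a uniform shift between occasions, the hypothetical scores above yield an ICC of 1; an absolute-agreement form would penalize that shift and give a lower value.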
Future studies should further explore the validity of the original and newly formed subscales against other established behavioral measurement tools and direct observation of similar behaviors. Critically, the sensitivity of the currently described factors can be examined in the context of ongoing and planned FXS clinical trials, potentially yielding data that more accurately capture domain-specific improvements in behavioral functioning. Indeed, true clinical improvement associated with treatment may not be detected by the original ABC-C subscales, but may be captured more readily with the subscales reported here. Although the current results may lead to some significant advantages in the use of the ABC-C in clinical studies, they do not fully capture all aspects of the FXS phenotype. For example, the items on the ABC-C were selected to measure behaviors of patients with ID and therefore may not assess key symptoms in higher-functioning individuals with FXS. In addition to possible deficits in accurately capturing FXS-specific phenotypic SIB and social behaviors, the ABC-C lacks items to measure most key symptoms of anxiety, which are pervasive and impairing in this syndrome (Cordeiro et al. 2010). As such, the current findings provide critical preliminary data to inform the design of new behavioral outcome measures specifically designed to capture the FXS phenotype and response to intervention.
As a result of the retrospective design, the study had several important limitations. First, a high proportion of participants were taking psychoactive medications at the time of ABC-C assessment, treatments that are likely to have altered ratings of behavior. Similarly, we were not able to control for behavioral or educational interventions, which could also alter ratings on the ABC-C. However, the exclusion of individuals who had previously undergone or were currently undergoing treatment would have significantly limited the sample size and resulted in a biased sample of higher functioning individuals, which would not represent the overall population of affected individuals. Second, the ethnic and racial distribution of the sample was predominantly Caucasian, limiting generalization of findings to other groups. Additionally, behavioral data from other instruments and relevant clinical information, such as autism status, were not available, preventing comprehensive validity studies. For example, we were not able to establish whether higher Stereotypy or Social Avoidance scores on the ABC-C were necessarily reflective of a comorbid diagnosis of autism. Sample sizes for analyses within the more homogeneous groups (those based on gender, medication status, and age) were rather small. As a result, although these analyses supported the validity of the six-factor solution in all groups, the resulting parameter estimates were less reliable and should be viewed as preliminary. Due to the nature of the archival data, we were unable to fully control the characteristics of the study sample and the reference group, such as level of intellectual functioning, ethnicity and socio-economic factors. Finally, performance of the scale was not assessed for individuals with FXS over 25 years of age.
Future research should evaluate whether the factor structure of the ABC-C is similar in this older group, as many individuals over 25 are currently participating in the above-described clinical trials. Future studies with larger samples from a broader age range examining the impact of gender, age, medication, and psychiatric diagnosis are warranted. Finally, examination of the ABC-C factor structure in other specific populations with ID, for example Down syndrome or fetal alcohol syndrome, could be useful for proper behavioral characterization and for accurately capturing response to treatment. We hope this research and the methods outlined in this study will be helpful in these types of investigations.
We are especially grateful to the participants and their families for their contribution to the understanding of fragile X syndrome. We would also like to thank Lisa Cordeiro, Elizabeth Ballinger, Alyssa Chavez, Ava Rezvani, Lia Boyle, Victor Talisa, Nana Asante, Crystal Hervey, and Anna DeSonia for their help with site coordination, data entry and management. This work was supported by NIH K23MH77554 (Hessl), the National Fragile X Foundation (Hessl), FRAXA (Hessl), NIH K08MH081998 (Hall), R01MH050047 (Reiss), and NIH HD24061 (Kaufmann).
A portion of the data was presented at the 44th Annual Gatlinburg Conference, San Antonio, TX, in March 2010.
Electronic supplementary material: The online version of this article (doi: 10.1007/s10803-011-1370-2) contains supplementary material, which is available to authorized users.
Stephanie M. Sansone, Medical Investigation of Neurodevelopmental Disorders (MIND) Institute, University of California Davis Medical Center, 2825 50th Street, Sacramento, CA 95817, USA.
Keith F. Widaman, Department of Psychology, University of California, Davis, One Shields Avenue, Davis, CA, USA.
Scott S. Hall, Department of Psychiatry and Behavioral Sciences, Center for Interdisciplinary Brain Sciences Research, Stanford University, 401 Quarry Road, Stanford, CA, USA.
Allan L. Reiss, Department of Psychiatry and Behavioral Sciences, Center for Interdisciplinary Brain Sciences Research, Stanford University, 401 Quarry Road, Stanford, CA, USA.
Amy Lightbody, Department of Psychiatry and Behavioral Sciences, Center for Interdisciplinary Brain Sciences Research, Stanford University, 401 Quarry Road, Stanford, CA, USA.
Walter E. Kaufmann, Center for Genetic Disorders of Cognition and Behavior, Kennedy Krieger Institute, Johns Hopkins University School of Medicine, Baltimore, MD, USA.
Elizabeth Berry-Kravis, Departments of Pediatrics, Neurological Sciences, Biochemistry, Rush University Medical Center, Chicago, IL, USA.
Ave Lachiewicz, Departments of Pediatrics and Psychiatry and Behavioral Sciences, Duke University Medical Center, Durham, NC, USA. Private Practice, Reno, NV, USA.
Elaine C. Brown, Private Practice, Reno, NV, USA.
David Hessl, Medical Investigation of Neurodevelopmental Disorders (MIND) Institute, University of California Davis Medical Center, 2825 50th Street, Sacramento, CA 95817, USA. Department of Psychiatry and Behavioral Sciences, University of California, Davis, School of Medicine, Sacramento, CA, USA.