The identification of autism susceptibility genes has been hampered by phenotypic heterogeneity of autism, among other factors. However, the use of endophenotypes has shown preliminary success in reducing heterogeneity and identifying potential autism-related susceptibility regions. To further explore the utility of using language related endophenotypes, we performed linkage analysis on multiplex autism families stratified according to delayed expressive speech and also assessed the extent to which parental phenotype information would aid in identifying regions of linkage. A whole genome scan using a multipoint nonparametric linkage approach was performed in 133 families, stratifying the sample by phrase speech delay and word delay. None of the regions reached suggested genome-wide or replication significance thresholds. However, several loci on chromosomes 1, 2, 4, 6, 7, 8, 9, 10, 12, 15, and 19 yielded nominally higher linkage signals in the delayed groups. The results did not support reported linkage findings for loci on chromosomes 7 or 13 that were a result of stratification based on the language delay endophenotype. In addition, inclusion of information on parental history of language delay did not appreciably affect the linkage results. The nominal increase in NPL scores across several regions using language delay endophenotypes for stratification suggests that this strategy may be useful in attenuating heterogeneity. However, the inconsistencies in regions identified across studies highlight the importance of increasing sample sizes to provide adequate power to test replications in independent samples.
Autism and the related autism spectrum disorders (ASD) are complex neurodevelopmental disorders characterized by core deficits in three major domains: social interaction and social relatedness, verbal and non-verbal communication, and restricted interests and/or repetitive or stereotyped behaviors and resistance to change. The expression of the deficits encompasses a wide continuum extending from mild peculiarities to severe developmental disabilities. There is strong evidence from two major lines of investigation that the genetic contribution to ASD is substantial (Cook 2001; Folstein and Rosen-Sheidley 2001). First, indirect evidence comes from the high incidence of neurogenetic disorders and chromosomal anomalies occurring in 5–9% of autism patients (Lewis et al. 1995; Fombonne et al. 1997; Cook 2001; Wassink et al. 2001). Second, twin and family studies provide direct evidence of a genetic etiology in idiopathic autism (Folstein and Rutter 1977; Ritvo et al. 1985; Steffenburg et al. 1989; Ritvo et al. 1991; Bailey et al. 1995; Le Couteur et al. 1996).
Although heritability estimates for ASD range from 60 to 90% (Folstein and Rutter 1977; Ritvo et al. 1985), placing it among the most heritable of complex neuropsychiatric conditions, the identification of candidate loci for the disorder has been complicated by genetic and phenotypic heterogeneity. Results from the nine published whole genome scans using autism as a qualitative phenotype (IMGSAC 1998; Barrett et al. 1999; Philippe et al. 1999; Risch et al. 1999; IMGSAC 2001; Liu et al. 2001; Auranen et al. 2002; Shao et al. 2002b; Yonan et al. 2003) have been variable, with the most consistent findings on chromosome 7 (Badner and Gershon 2002). The other regions of interest that have shown strong linkage signals and/or have support from multiple studies include: 2q (Philippe et al. 1999; IMGSAC 2001; Buxbaum et al. 2002; Shao et al. 2002b), 4 (IMGSAC 1998; Barrett et al. 1999; Yonan et al. 2003), 13 (Barrett et al. 1999), 17p (Risch et al. 1999; IMGSAC 2001; Liu et al. 2001; Yonan et al. 2003; Stone et al. 2004; Cantor et al. 2005) and X (Auranen et al. 2002; Shao et al. 2002b).
Investigators have begun to use endophenotypes related to autism in an attempt to reduce heterogeneity and identify factors that may relate more closely to genetic etiologies than the current broad diagnostic categories. Endophenotypes are components of a more complex phenotype, such as behavioral, cognitive, morphologic or biochemical features that may be more directly related to the underlying genetic etiologies (Gottesman and Gould 2003). Using behavioral endophenotypes such as insistence on sameness, obsessive-compulsive behavior or savant skills to stratify ASD families in linkage analysis has shown promise (Nurmi et al. 2003; Shao et al. 2003; Buxbaum et al. 2004; McCauley et al. 2004). The most significant linkage result reported by our group was based on stratifying families by the sex of the autistic proband; this revealed a locus with genome-wide significance on chromosome 17 for the families with only male autistic probands (Stone et al. 2004) and this locus was recently replicated, with genome-wide significance, in an independent sample (Cantor et al. 2005). Stratification of families by the proband’s language delay has also proved a useful linkage strategy, producing strengthened signals on chromosomes 2q (Buxbaum et al. 2001; Shao et al. 2002a), 7q (Bradford et al. 2001) and 13q (Bradford et al. 2001). In addition to stratifying families based on proband phrase speech delay, Bradford et al. (2001) also considered parents with language delay as affected and reported strengthened linkage signals to autism on chromosomes 7 and 13. In an alternative approach, use of a language endophenotype in a quantitative linkage analysis highlighted a region on chromosome 7q that was not present in a previous qualitative scan of the same data from the Autism Genetic Resource Exchange (AGRE; Alarcón et al. 2002; Liu et al. 2001).
Given these varied findings, replication is critical to assess the validity of reported linkage peaks in complex neuropsychiatric disorders. Based on the success of previous reports (Bradford et al. 2001; Buxbaum et al. 2001; Shao et al. 2002a) we stratified families from the AGRE sample by the presence of expressive speech delay and included parental information regarding a history of language delays or deficits in order to categorize parents as affected in the linkage analysis. Rather than limit our investigation to previously reported chromosomal regions with only modest evidence for linkage, we performed a whole genome scan.
Families were obtained from the Autism Genetic Resource Exchange (AGRE) program. AGRE is a large shared database containing phenotype and genotype information on ASD families, available to approved researchers (Geschwind et al. 2001). Families were ascertained on the criterion that at least two siblings had a reported ASD diagnosis. Diagnoses were confirmed by the Autism Diagnostic Interview-Revised (ADI-R) (Lord et al. 1994). The present analysis used the broad-spectrum diagnosis for probands, which included those with autistic disorder and those with similar but lesser deficits in communication, social relatedness and/or repetitive behaviors/restricted interests. Details on diagnostic algorithms are available on the AGRE website (www.agre.org).
Phenotypic assessment included diagnostic testing using the ADI-R (Lord et al. 1994) and the Autism Diagnostic Observation Schedule (ADOS-G) (Lord et al. 2000), and cognitive testing using the Raven Progressive Matrices – Coloured Version (Raven 1956) and the Peabody Picture Vocabulary Test (PPVT-III) (Dunn and Dunn 1981), which can be used as surrogate non-verbal and verbal IQ measures, respectively.
Families for this analysis were selected from the 345 genotyped families based on the availability of more extensive parental language data. MZ twins and individuals with known chromosomal abnormalities were excluded. This sample included 133 nuclear families with a total of 634 individuals, 131 fathers, 133 mothers, 89 unaffected siblings (40 males, 49 females), and 280 probands (221 males, 59 females).
Delay in spoken language acquisition is frequently seen in autism but not required for the DSM-IV diagnosis of an autism spectrum disorder. We stratified the families based on proband phrase and word speech delay as reported on the ADI-R items: A12 or “age at first word” and A13 or “age at first phrase”. Phrase Delay (PD) families included two or more autistic or broad-spectrum probands with phrase speech acquired after 36 months of age (n=69). Word Delay (WD) families included two or more autistic or broad-spectrum probands with single words acquired after 18 months of age (n=60). We defined word delay as the acquisition of single spoken words past 18 months of age because this is the standard cut-off used in the clinical practice of neurology and is, thus, widely regarded as clinically meaningful. Probands who had not yet acquired the single words or phrases were considered delayed for purposes of this analysis.
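The stratification rule described above can be sketched in a few lines. This is an illustration only: the `Proband` record, the use of `None` for a milestone not yet acquired, and the function names are hypothetical and do not reflect the ADI-R coding scheme or the AGRE database format.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Cut-offs stated in the text (months of age).
PHRASE_CUTOFF_MONTHS = 36
WORD_CUTOFF_MONTHS = 18


@dataclass
class Proband:
    # Hypothetical record: ages in months, None if the milestone
    # has not yet been reached.
    age_first_word: Optional[int]
    age_first_phrase: Optional[int]


def is_delayed(age_months: Optional[int], cutoff: int) -> bool:
    """Delayed if acquired after the cut-off or not yet acquired."""
    return age_months is None or age_months > cutoff


def classify_family(probands: List[Proband]) -> Dict[str, bool]:
    """A family is PD (or WD) when two or more probands are delayed."""
    n_phrase = sum(is_delayed(p.age_first_phrase, PHRASE_CUTOFF_MONTHS) for p in probands)
    n_word = sum(is_delayed(p.age_first_word, WORD_CUTOFF_MONTHS) for p in probands)
    return {"PD": n_phrase >= 2, "WD": n_word >= 2}


# Example with made-up ages: one proband spoke late, one has no speech yet.
example_family = [Proband(24, 40), Proband(None, None)]
example_result = classify_family(example_family)
```

Under these assumptions a family can belong to both strata, to one, or to neither, which is consistent with the PD and WD groups being analysed separately.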
Parental affectation status was based on a history of language development questionnaire adapted from those used for family studies of specific language impairment (Tomblin 1989; Tomblin et al. 1992) and dyslexia (Lefly and Pennington 2000). If the parent had a self-reported history of delayed speech and language or had difficulty learning to read, the individual was considered affected, similar to the approach used by Bradford and colleagues (Bradford et al. 2001).
Laboratory and genotyping procedures have been previously detailed (Liu et al. 2001; Yonan et al. 2003). Briefly, 10–20 microliters of blood were collected from all available family members (parents, affected children, unaffected children) in the family home. Samples were shipped to Rutgers University Repository where immortalized lymphoblast cell lines were created and DNA was extracted for storage and distribution to approved AGRE researchers. DNA was genotyped at the Columbia University Genome Center using 365 DNA microsatellite markers. These included 335 markers used in the original genome scan (Liu et al. 2001) at an average density of 10cM, as well as an additional 30 fine map microsatellites on chromosome 7q with an average density of 2 cM. The markers had an average heterozygosity of 0.77.
The SAS analysis software package (SAS/STAT 1990) was used to calculate descriptive statistics of the sample and to prepare input files for genetic analysis. Mendelian genotype errors were queried with PedCheck (O’Connell and Weeks 1998) and genotype error was detected in <0.01%.
A multipoint nonparametric linkage analysis of the broad-spectrum diagnosis was performed with the Genehunter program (Kruglyak et al. 1996). Four analyses were performed: 1) all families (ALL; n=133); 2) phrase delayed (PD) families (n=69); 3) word delayed (WD) families (n=60); and 4) all families including parental information (ALL with PI; n=133). Given that none of the linkage peaks reached criteria for genome-wide significance (i.e. p = 2 × 10⁻⁵ based on Lander and Kruglyak, 1995), we reported only loci with nominal p values ≤ 0.05, as well as the corresponding results from the ALL group for comparison purposes.
We stratified the families based on proband phrase and word speech delay to attempt to replicate previous findings from other smaller cohorts of multiplex autism pedigrees. We also performed the linkage scan in the entire cohort to use as a comparison to determine whether stratification actually increased the NPL score. None of the linkage results of the non-parametric multipoint scan (Methods) were significant at a genome-wide level for any of the cohorts (ALL, PD or WD). Moreover, the results on chr 2, 7 and 13 did not meet the suggested threshold for replication (Lander and Kruglyak, 1995). However, as shown in Table I, several peaks reached nominal significance (p<.05). Loci on chromosomes 1, 2, 4, 6, 7, 8, 9, 10, 12, 15 and 19 showed a modest increase in linkage scores either when stratified by PD or WD, and several of these (chr 1, 10, 12, 15, 19) had NPL scores greater than 2.2 yielding p values ≤.01 (Table I). Loci on chromosome 1, 15 and 19 had NPL scores with p values ≤.05 using either language-related phenotypes, PD or WD.
Similar to previous studies, stratification by Phrase Delay did appear to strengthen the linkage signals on chromosome 2 (Buxbaum et al. 2001; Shao et al. 2002a). In our sample this was achieved at four loci, one of which is in the region (peak= 188 cM) identified by both the Buxbaum (peak=186 cM) and Shao groups (peak=198 cM). Although our results could be considered in support of Buxbaum’s chromosome 2 peak, after removal of the AGRE families that overlapped both studies (n=36 total, 16 in PD group) the peak at this locus disappeared in the Phrase Delayed group (NPL=1.03, p>.05). In contrast to a previous report (Bradford et al. 2001) stratification by phrase or word delay did not strengthen the signal compared to the entire sample on chromosomes 7 or 13. In fact, our data did not show a peak in the same regions on either of those chromosomes in any group.
Inclusion of parental language development information had a negligible effect on the linkage scores and, thus, these results are not presented. However, we examined the transmission of this trait in the nuclear families and observed that language-deficit information obtained retrospectively from the parents did not co-segregate with proband language delay as measured by the ADI-R in this cohort. The percentage of parents with language problems tended to be higher in the sets of families without proband language delays, although the differences were not significant. Specifically, 13.5% of the 74 PD families had parents affected with language problems, whereas 20.5% of the 214 non-phrase-delayed (NPD) families were affected (χ² = 1.79, ns). A similar pattern was observed when stratifying families by word delay: 17% of the 123 WD families had affected parents versus 20% of the 165 non-word-delayed (NWD) families (χ² = 0.4, ns). Thus, it is not surprising that the inclusion of parental report information did not strengthen linkage signals.
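The two chi-square values reported above can be checked with a short calculation. The sketch below rests on two assumptions that the text does not state: the counts (10 of 74, 44 of 214, 21 of 123 and 33 of 165) are inferred here by rounding the reported percentages, and the test is taken to be an uncorrected Pearson chi-square on a 2×2 table.

```python
def pearson_chi2_2x2(a: int, b: int, c: int, d: int) -> float:
    """Uncorrected Pearson chi-square for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Critical value of chi-square with 1 degree of freedom at alpha = 0.05.
CRITICAL_05_DF1 = 3.841

# Rows: delayed vs non-delayed group; columns: affected vs unaffected.
# Counts are inferred from the reported percentages (assumption).
pd_chi2 = pearson_chi2_2x2(10, 74 - 10, 44, 214 - 44)   # phrase delay
wd_chi2 = pearson_chi2_2x2(21, 123 - 21, 33, 165 - 33)  # word delay

print(f"PD vs NPD: chi2 = {pd_chi2:.2f}, significant: {pd_chi2 > CRITICAL_05_DF1}")
print(f"WD vs NWD: chi2 = {wd_chi2:.2f}, significant: {wd_chi2 > CRITICAL_05_DF1}")
```

With these inferred counts the statistics come out at about 1.79 and 0.40, both well below the 3.841 critical value, which agrees with the non-significant results reported.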
To better characterize the subgroups and determine if there was some other feature co-segregating with the delayed groups that could contribute to the modestly strengthened linkage signals, the delayed and non-delayed groups were compared on mean scores from the cognitive, language and behavioral testing (see Tables IIa and IIb). The multiple statistical comparisons performed required an adjusted significance level of p<.0007; the resulting between-group significance values did not exceed this threshold.
We performed a nonparametric genome scan on a subset of 133 families from the AGRE sample with extended phenotypic information to test the approach used in previous studies to reduce heterogeneity and strengthen linkage signals by stratification based on proband language delay endophenotypes (Buxbaum et al. 2001; Shao et al. 2002a). We also incorporated information from parents of probands with reported language difficulties to determine whether this would similarly strengthen linkage signals as shown by Bradford et al. (2001).
Consistent with results from previous genome scans in independent samples, the present linkage results from the 133 AGRE families with language information identified regions on chromosomes 4 (Barrett et al. 1999), 10 (IMGSAC 1998), 11 (Barrett et al. 1999), 16 (IMGSAC 1998; IMGSAC 2001), and 17 (IMGSAC 2001) yielding NPL scores with p values of ≤.05.
As shown previously, stratification of families based on proband phrase or word delay did appear to strengthen linkage signals in a number of regions. However, none met suggested criteria for genome-wide significance or for replication (Lander and Kruglyak, 1995). A region on chromosome 2q was of particular interest as an autism susceptibility region for two reasons: 1) it had been previously reported by two groups (Buxbaum et al. 2001; Shao et al. 2002a) to have increased evidence for linkage in language-delayed families; and 2) there is an association between autism and a mitochondrial gene in this region (Ramoz et al. 2004). However, when the AGRE families included in the Buxbaum et al. analysis were removed from the current study, the peak in that region was diminished. This suggested that the same group of families contributed to the linkage signal on 2q in both studies. Thus, the present study does not provide additional support for linkage in the 2q region despite having a larger number of families than the Buxbaum report.
Our results from the stratified families also showed a nominal peak on chromosome 15q at 12.3cM which is within a region containing the GABRB3 gene (15q11-13, 9-23cM) that has been variably associated with autism (Cook et al. 1998; Maestrini et al. 1999; Buxbaum et al. 2002), with endophenotypes related to insistence on sameness (Shao et al. 2003) and savant skills (Nurmi et al. 2003), but not with language delay.
We did not observe any strengthening of the signal in the region of chromosome 7q identified by Bradford et al. (2001). Our group has demonstrated suggestive linkage to this region on chromosome 7q in AGRE using a quantitative trait locus (QTL) approach based on age at first word (Alarcón et al. 2002). Further analysis of this region has demonstrated that the linkage peak may not be related to the magnitude of language delay per se, but rather to a more general language-related susceptibility trait (Alarcón et al. 2005). Thus, not finding an effect of stratification based on delay in the AGRE sample would not be surprising. However, we did find a region at the telomere on chromosome 7 in the phrase delayed families as well as very modest evidence (p<.05) for linkage on chromosomes 1, 2, 4, 6, 8, 9, 10, 12, and 19 in areas that have not previously been reported as linked to autism and will need to be further studied in an independent sample.
We also examined the utility of incorporating historical information regarding parental language difficulties into the linkage analysis. Bradford et al. (2001) hypothesized that extending the specific endophenotype of language delay to other family members should increase the signal at any locus related to that phenotype. Expecting that the majority of the increased signal would come from families who also had probands with speech delay, they also stratified on phrase speech delay in a manner similar to our analysis. Their results on chromosomes 7 and 13 showed a modest effect of inclusion of parent information, made stronger by combining it with stratification based on the presence of proband phrase speech delay. Results of the present study did not support those described in Bradford’s report.
The lack of support observed in this study for previous linkage peaks strengthened by stratification and/or inclusion of parental information could be explained in a number of ways. First, none of the linkage results in the previous stratification studies reached genome-wide significance and therefore may have represented spurious results based on sub-setting of the cohort into smaller groups. Similarly, although the results of the Bradford et al. study (2001) that explored the utility of including parental information in the linkage analysis were based on 50 families, they still did not reach significance for genome-wide scans (Lander and Kruglyak 1995) and may represent spurious findings. Second, the present results may reflect a lack of power due to the small sample size that was a consequence of including only the subset of the complete AGRE sample with available parental language information. Third, in this cohort, it is also possible that the language deficits in parents are not related to the specific delay endophenotype in their children, and, thus, inclusion of parental information in the present linkage analysis increased heterogeneity rather than reducing it. This is supported by the lack of parent-child clustering of language delay within families. Finally, although the questions and methods used to obtain these data were similar to those used in other studies, it is possible that the self-report language history information we obtained from parents is not reliable.
This study provides an example of the difficulty phenotypic heterogeneity poses for the identification of autism susceptibility genes. Furthermore, the fact that different studies all using similar language-related endophenotypes yield linkage to different loci implies that even the language delay component of ASD could be genetically heterogeneous. For instance, a child could present with speech delay secondary to a true expressive language deficit (which may also be accompanied by a receptive language deficit), or due to a more specific motor speech disorder (e.g. speech apraxia). These different underlying pathologies could have distinct underlying genetic etiologies leading to the identification of a variety of loci in linkage analyses. Alternatively, this trait could co-segregate with another trait (so far undetected) that truly underlies linkage to the chromosomal regions identified. Better definition of the language endophenotype and perhaps further subgrouping of much larger samples into more homogeneous groups based on related aspects of the phenotype (e.g. receptive and expressive speech delay or speech apraxia) will be necessary to further explore this concept.
This work was supported by grants from the M.I.N.D. Institute to SJS and MA, and from the NARSAD to MA. We gratefully acknowledge the resources provided by the Autism Genetic Resource Exchange Consortium* and the participating AGRE families. The Autism Genetic Resource Exchange is a program of Cure Autism Now and is supported, in part, by grant MH64547 from the National Institute of Mental Health to Daniel H. Geschwind (PI).
*The AGRE Consortium:
Dan Geschwind, M.D., Ph.D., UCLA, Los Angeles, CA; Maja Bucan, Ph.D., University of Pennsylvania, Philadelphia, PA; W. Ted Brown, M.D., Ph.D., F.A.C.M.G., N.Y.S. Institute for Basic Research in Developmental Disabilities, Long Island, NY; Joseph Buxbaum, Ph.D., Mt. Sinai School of Medicine, NY, NY; Rita M. Cantor, Ph.D., UCLA School of Medicine, Los Angeles, CA; John N. Constantino, M.D., Washington University School of Medicine, St. Louis, MO; T. Conrad Gilliam, Ph.D., University of Chicago, Chicago, IL; Clara Lajonchere, Ph.D, Cure Autism Now, Los Angeles, CA; David H. Ledbetter, Ph.D., Emory University, Atlanta, GA; Christa Lese-Martin, Ph.D., Emory University, Atlanta, GA; Janet Miller, J.D., Ph.D., Cure Autism Now, Los Angeles, CA; Stanley F. Nelson, M.D., UCLA School of Medicine, Los Angeles, CA; Gerard D. Schellenberg, Ph.D., University of Washington, Seattle, WA; Carol A. Samango-Sprouse, Ed.D., George Washington University, Washington, D.C.; Sarah J. Spence, M.D., Ph.D., UCLA, Los Angeles, CA; Matthew State, M.D., Ph.D., Yale University, New Haven, CT.; Rudolph E. Tanzi, Ph.D., Massachusetts General Hospital, Boston, MA.