These studies provide a novel approach to determining candidate genes for autism through the use of peripheral cell lines derived from individuals with ASD. The observations represent a model for the development of a diagnostic screen for autism based on biomarker detection in blood, which is an easily accessible tissue.
In this study, DNA microarrays containing ~40 K human cDNA probes were used to examine differences in gene expression profiles in LCL derived from 5 pairs of monozygotic twins with ASD. Three sets of twins were discordant with respect to clinical diagnosis of autism, and 2 sets (with both co-twins diagnosed as autistic) differed with respect to severity of language impairment. We specifically chose this experimental model (direct comparison of identical twins) because differential gene expression in blood leukocytes from monozygotic twins has been reported to be much more restricted than that between unrelated controls and, furthermore, the differentially expressed genes exhibited "random variations", showing no specific preference for any functional class [22]. The most remarkable finding of this study is that global functional analysis of the significantly differentially expressed genes in LCL from these 5 sets of twins identifies "Nervous system development and function" as a top "high level function" that is significantly enriched across the 5 gene expression datasets (Table ). Moreover, in silico mapping of our most differentially expressed genes across as well as within the twin sets demonstrates that many of these genes are located in or close to chromosomal regions previously identified as autism susceptibility loci by genetic analyses (Table and Additional file 4). Quantitative RT-PCR analysis has further confirmed the differential expression of a subset of our novel candidate genes in the majority of twin sets studied.
Several of these candidate genes and their associated gene networks may provide insight into potential mechanisms involved in the autistic phenotype(s). One of the striking results of the pathway analyses is that a relatively large number of the differentially expressed, neurologically relevant genes are linked in networks that are centered on genes involved in inflammation (see Figs. and ). The network genes with reported neurological functions include the proteins ASS, ALOX5AP (FLAP), CD44, CHL1, DAPK1, EGR2, F13A1, FLT1, IL6ST, NAGLU, PTGS2, and ROBO1 (see Table ). The protein ASS regulates the rate-limiting step involved in nitric oxide (NO) production through regeneration of arginine from citrulline, a byproduct of the nitric oxide synthase (NOS) reaction [31]. Since NO is a major signaling molecule in the brain that has been implicated in several psychiatric disorders, including autism [32], the increased expression of ASS may be of potential relevance to the autistic phenotype. ASS has also been shown to be induced in a rat model of brain inflammation [33], which would be consistent with the hypothesis that neural inflammation may play a role in autism [34]. DAPK1, a cell death-associated serine/threonine kinase which is involved in suppression of integrin activity and disruption of matrix survival signals [35], is also induced by inflammation [36]. Interestingly, the expression of FLT1 (VEGF receptor 1) is also regulated by inflammatory cytokines as well as by NO [37]. Furthermore, the fact that IL6ST (gp130) is increased in LCL from the more severely affected twin may complement previous observations that IL-6 is the most elevated inflammatory cytokine in the middle frontal gyrus and anterior cingulate gyrus of brain autopsy tissue from autistic individuals [34]. While upregulation of ASS, DAPK1, FLT1, and IL6ST may be responses to inflammation, ALOX5AP (FLAP) and PTGS2 (COX-2) mediate inflammation through the production of leukotrienes [38] and prostaglandins [39], respectively. Interestingly, 5-lipoxygenase, the target of FLAP activation, has been implicated in aging and neurodegenerative diseases [40], as well as other psychiatric disorders [41], including anxiety and depression, which are frequently co-morbid conditions of autism. In addition, a COX-2 inhibitor, celecoxib, has been shown to have therapeutic effects in major depression [42], further suggesting a role for inflammatory processes in psychiatric disease. Collectively, the potential involvement of these specific genes that are associated with neurological function and disease and their presence in pathways regulated by inflammatory mediators lend further support to the neural inflammation model for autism [34], which may also be manifested by immune dysfunctions commonly observed in autism [43].
In addition to the possible role of genes involved in inflammation, a review of the gene list in Table suggests several additional recurring biological themes among the differentially expressed genes with neurological functions: neuronal survival, neurite extension/guidance, and myelination. In this regard, altered expression of EGR2, the most down-regulated gene across 5 twin sets, may be particularly significant (see Table and Additional file 2). EGR2 (Krox-20) is a transcription factor involved in the development of the brain and peripheral nervous system, routing of axons, and myelination [44, 45]. Some of these functions may be related to EGR2-mediated regulation of ROBO1, which is involved in neuronal differentiation and axon guidance [46, 47], and integrin beta-7 (ITGB7), which has been implicated in chronic demyelinating disease [48]. The expression levels of all three of these genes are relatively reduced with increased severity of autism or language impairment (Table ). The involvement of cell migration in the pathophysiology of autism is also implicated by the altered expression of CHL1, a novel neural cell adhesion molecule that is involved in neurite migration, outgrowth, connectivity, and survival. Deficiency in CHL1 has been shown to be associated with mental and motor impairments as well as with alterations in exploratory and emotional behavior in mice [49, 50], characteristics that are often associated with autism. However, the effect of CHL1 overexpression, which we observe to be associated with the more severe phenotype, has yet to be determined. While the function of such neurologically relevant genes in lymphoblastoid cell lines is unknown, there is growing evidence that gene expression is under genetic control in LCL, as well as in other cells, with one study showing that 31% of the differential expression in LCL among unrelated individuals was heritable [21]. Thus, it is reasonable to postulate that hereditary factors that are responsible for the development of the autistic brain might also be manifested in the LCL as differentially expressed genes. If expression of these genes can be shown to be consistently altered in LCL in case-control studies on a larger sample of unrelated individuals, these cells, and by inference their precursor blood lymphocytes, can potentially be used as reporter cells for diagnosis of ASD. While we have focused on differentially expressed genes of neurological relevance in this study, it should be noted that the biomarkers for autism in LCL or lymphocytes need not have specific neurological functions (as we have also detected and confirmed differential expression of "non-neuronal" genes). Given that ASD is most probably a multigene disorder of varying etiology, a biomarker screen for ASD would likely include a panel of genes consistently associated with ASD phenotypes, in which diagnosis for the disorder will depend upon differential expression of a defined percentage of genes within the consensus set.
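The idea of a panel-based screen, in which a call depends on a defined percentage of consensus genes being differentially expressed, can be illustrated with a minimal sketch. It assumes, hypothetically, that each panel gene is summarized as a log2 expression ratio against a reference sample. The function name `panel_call`, the cutoff, the required fraction, and the example values are all illustrative assumptions and are not taken from this study.

```python
def panel_call(log2_ratios, cutoff=0.58, min_fraction=0.6):
    """Score a hypothetical biomarker panel.

    log2_ratios  : dict mapping gene symbol -> log2(sample / reference)
    cutoff       : absolute log2 ratio at which a gene counts as
                   differentially expressed (assumed value)
    min_fraction : fraction of panel genes that must pass the cutoff
                   for the sample to be flagged (assumed value)
    Returns (fraction_of_genes_passing, flagged).
    """
    if not log2_ratios:
        raise ValueError("panel is empty")
    # Up- and down-regulation both count, so compare absolute values.
    hits = [gene for gene, ratio in log2_ratios.items() if abs(ratio) >= cutoff]
    fraction = len(hits) / len(log2_ratios)
    return fraction, fraction >= min_fraction


# Made-up values for illustration only; not measurements from the paper.
example = {"EGR2": -1.2, "CHL1": 0.9, "ROBO1": -0.7, "IL6ST": 0.1, "ASS": 0.65}
fraction, flagged = panel_call(example)
print(fraction, flagged)
```

In practice both thresholds would have to be calibrated on a large case-control sample, as the text notes.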
The observed relationship between differential gene expression and severity of ASD between monozygotic twins suggests a role for epigenetic factors in ASD. A recent report on normal monozygotic twins indicates that epigenetic differences arise over time, increasing with age and with physical separation from each other after birth [45]. Indeed, epigenetic differences between monozygotic twins have been examined as possible causes for discordance in schizophrenia as well as bipolar disorder [51-53]. Possible epigenetic mechanisms leading to differences in gene expression include differential methylation, differences in histone acetylation, and microRNAs, although there is no available evidence linking any of these to autism at this time. On the other hand, a mutation in a methyl-CpG binding protein, X-linked MeCP2, has been identified as being involved in 80% of all cases of Rett Syndrome [54], a developmental disorder which overlaps ASD, thus underscoring the importance of methylation-dependent gene expression in at least this related disorder. Interestingly, though ubiquitously expressed [55], mutated MeCP2 induces a specific neuronal dysfunction, i.e., Rett Syndrome. One could therefore postulate that differential methylation or differential histone acetylation might give rise to differential expression in LCL from monozygotic twins with ASD, and test for global changes in methylation or histone acetylation as done by Fraga et al. [45], or for specific changes within a given candidate gene. Such epigenetic modifications, which can persist even after the modifying stimulus (e.g., inflammation) is removed [56], could in turn arise in response to environmental factors, stochastic processes, or immortalization procedures. If present, these differences could be further tested by evaluation of the methylation/acetylation patterns of DNA/histones in primary lymphocytes from monozygotic twins discordant in severity of autism or language impairment within autism. Such an analysis, while interesting, is beyond the scope of this study.
Regardless of origin, the gene expression differences between monozygotic twins who present with differential severity along the autism spectrum or within a specific behavioral domain (e.g., language) are potentially useful, not only as biomarkers for ASD, but also as indicators of genes or metabolic/signaling pathways that may contribute to the autistic phenotype. While our short list of candidate genes (Table ) focuses on genes with known neurological functions that are similarly up- or down-regulated across twin sets affected by ASD, the set of differentially expressed, neurologically relevant genes that are unique to a given twin set may also be important to the determination of a specific autistic phenotype. Indeed, comparison of the pathways represented in the respective datasets of individual twin pairs reveals not only overlapping genes but also neurologically relevant genes that are differentially expressed in only one of the twin pairs (see Additional file 2). Inasmuch as our microarray analyses directly compared genetically matched individuals who differ only in degree of expression of autistic symptoms, it is likely that other genes, not identified in our study, also play a role in the pathophysiology of autism. This experimental design possibly explains why the candidate genes identified here are different from those reported by an earlier genomic study [20] which compared autopsy brain tissues from autistic and normal (nonautistic) controls (i.e., a case-control study). On the other hand, it is interesting that many of our novel genes map closely to genetically identified autism susceptibility genes/loci or QTL (Table and Additional file 4).
Aside from identifying novel candidate genes for autism, our study also illustrates the need for phenotype definition or subgrouping according to severity along a specific behavioral domain for biological studies of autism. Specifically, the results show that the differential gene expression profiles of concordantly autistic twins with differential severity of language impairment mirror some of the differences in gene expression which are observed in the twins with discordant diagnosis of autism, who also exhibit differential language deficits. Thus, for case-control studies in which individuals from the general population are compared against unrelated controls, subgrouping the autistic individuals by phenotype or stratifying them according to severity of symptoms may provide more clarity in analyzing biological data. Towards this goal, we have used several different clustering methods commonly used in DNA microarray analyses to divide over 1300 autistic individuals into endophenotypic subgroups (e.g., language, nonverbal communication, and savant skills) based on item scores on the ADIR questionnaire (manuscript in preparation). Based on these methods, the twin siblings analyzed in this study, including those who were both diagnosed as autistic, each fall into different phenotypic clusters (unpublished data), with the exception of one set of twins who were discordant in the diagnosis of autism. These "endophenotypic" differences may therefore account for some of the differences in gene expression profiles between the twin siblings (i.e., co-twins) as well as among the different sets of twins. To test the ability of our clustering algorithm to restrict phenotypic and biological heterogeneity, we evaluated the short list of candidate genes in Table by qPCR in an additional set of "concordant" autistic twins in which both co-twins exhibit the severely language-impaired phenotype. Results showed that, for this twin pair, there are no differences in expression of the candidate genes exceeding a log2 ratio of ±0.58 (unpublished data).
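For readers less familiar with log-ratio units, a log2 ratio of ±0.58 corresponds to roughly a 1.5-fold change in either direction, since 2^0.58 ≈ 1.495. The short sketch below makes that conversion explicit and applies the threshold to a pair of expression values. The function names and the example inputs are hypothetical and serve only to illustrate the arithmetic; they do not reproduce the qPCR analysis used in this study.

```python
import math


def fold_change(log2_ratio):
    """Convert a log2 expression ratio to a linear fold change."""
    return 2.0 ** log2_ratio


def exceeds_threshold(expr_a, expr_b, threshold=0.58):
    """Return True if |log2(expr_a / expr_b)| is above the threshold.

    expr_a, expr_b : positive relative expression values for the two
                     co-twins (hypothetical inputs)
    threshold      : absolute log2 ratio cutoff (0.58, as quoted in the text)
    """
    log2_ratio = math.log2(expr_a / expr_b)
    return abs(log2_ratio) > threshold


print(round(fold_change(0.58), 3))   # linear fold change at the cutoff
print(exceeds_threshold(1.2, 1.0))   # a 1.2-fold difference stays below the cutoff
print(exceeds_threshold(1.0, 1.5))   # direction does not matter, only magnitude
```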