We analyzed single-nucleotide variants (SNVs) in mGluR pathway genes by high-throughput sequencing in a cohort of 290 unrelated AGRE cases and 300 ethnically matched controls (Table S2). Inclusion criteria for AGRE cases were a diagnosis of idiopathic (“non-syndromic”) autism and at least one affected sibling. AGRE cases are screened to exclude non-idiopathic (“syndromic”) autism secondary to known neurogenetic disorders such as fragile-X syndrome [20]. Two distinct sets of pools were prepared from genomic DNA samples isolated from these cohorts for sequencing on the Illumina GAII and the Helicos HeliScope. An orthogonal strategy for sample pooling was used in which samples were arrayed in a matrix with 15 rows and 20 columns. Samples were pooled along rows to generate 15 pools of 20 samples each for GAII sequencing and pooled along columns to generate 20 pools of 15 samples each for HeliScope sequencing. Each sample, representing a single case or control subject, was thus identified by the unique combination of two pools corresponding to its position within this matrix. Each genomic DNA pool was used as a template for PCR amplification of all coding exons from our panel of 18 mGluR pathway genes (240 exons comprising a total of 40,473 bases). PCR amplicons from each genomic DNA pool were concatenated and sheared to construct libraries for high-throughput sequencing. The average coverage per exon per pool was 610× for the GAII and 1,688× for the HeliScope. An average coverage of ≥10× per individual was achieved for 87% (GAII) and 97% (HeliScope) of exons across all pools. Sensitivity of variant detection was therefore generally limited by the lower coverage achieved on the GAII in this analysis.
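The orthogonal pooling design described above can be illustrated with a short sketch. This is not the pipeline used in the study: the function names and the (row, column) sample labels are hypothetical, and the expected allele fraction assumes diploid autosomal loci and equimolar pooling.

```python
ROWS, COLS = 15, 20  # matrix dimensions given in the text


def build_pools(rows=ROWS, cols=COLS):
    """Return (row_pools, col_pools) for samples labeled by (row, col) position."""
    row_pools = {r: [(r, c) for c in range(cols)] for r in range(rows)}
    col_pools = {c: [(r, c) for r in range(rows)] for c in range(cols)}
    return row_pools, col_pools


def decode_singleton(row_pool_hits, col_pool_hits):
    """Map a variant seen in exactly one row pool and one column pool to a sample.

    Returns the (row, col) position, or None when the hit pattern is ambiguous,
    for example when the variant is carried by several samples.
    """
    if len(row_pool_hits) == 1 and len(col_pool_hits) == 1:
        return (next(iter(row_pool_hits)), next(iter(col_pool_hits)))
    return None


def singleton_allele_fraction(samples_per_pool, ploidy=2):
    """Expected read fraction for one heterozygous carrier in a pool (assumption)."""
    return 1 / (samples_per_pool * ploidy)


row_pools, col_pools = build_pools()
print(len(row_pools), len(row_pools[0]))  # 15 row pools of 20 samples (GAII)
print(len(col_pools), len(col_pools[0]))  # 20 column pools of 15 samples (HeliScope)
print(decode_singleton({3}, {7}))         # (3, 7)
print(singleton_allele_fraction(20))      # 0.025
# Mean per-individual coverage implied by the reported per-pool averages
print(610 / 20, 1688 / 15)
```

Under these assumptions a singleton contributes only a few percent of the reads in a pool, which is why per-pool coverage, and the lower GAII coverage in particular, bounds detection sensitivity.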
We used confirmation by Sanger sequencing to evaluate the sensitivity and fidelity of SNV detection on each NGS platform and on both platforms combined. The false discovery rate for single variant occurrences (singletons) fell sharply, without loss of sensitivity, when only SNVs detected concordantly on both platforms were considered (Fig. S1). We therefore limited our subsequent analysis to SNVs that were concordantly detected on both platforms.
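The concordance filter amounts to a set intersection of the calls from the two platforms, and the false discovery rate can be estimated against an orthogonal confirmation set. The sketch below uses hypothetical toy identifiers (`varA` to `varD`) and hypothetical function names; it does not reproduce the study's data or software.

```python
def concordant_calls(gaii_calls, heliscope_calls):
    """Keep only variants called on both platforms (set intersection)."""
    return set(gaii_calls) & set(heliscope_calls)


def false_discovery_rate(calls, confirmed):
    """Fraction of calls not confirmed by an orthogonal method such as Sanger."""
    calls = set(calls)
    if not calls:
        return 0.0
    return len(calls - set(confirmed)) / len(calls)


# Hypothetical toy call sets, for illustration only
gaii = {"varA", "varB", "varC"}
heliscope = {"varB", "varC", "varD"}
sanger_confirmed = {"varB", "varC"}

both = concordant_calls(gaii, heliscope)
print(sorted(both))                                  # ['varB', 'varC']
print(false_discovery_rate(gaii, sanger_confirmed))  # one of three GAII calls unconfirmed
print(false_discovery_rate(both, sanger_confirmed))  # 0.0
```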
Common SNVs were defined as those with allele frequencies ≥1%, and rare SNVs as those with allele frequencies <1%. The number of common SNVs did not differ between the AGRE and control groups. Rare SNVs were modestly enriched in AGRE samples compared to controls (302 and 276, respectively). However, when we excluded rare SNVs that were either silent or present in both populations, and therefore presumed to be benign, a significant enrichment of genetic variation was detected in the autism population, with 80 and 49 SNVs in the AGRE and control groups, respectively (P = 0.001). When we additionally eliminated SNVs located in 5′ UTRs or deep intronic regions, thereby focusing on SNVs with potentially deleterious effects, a two-fold enrichment of variants emerged in the autism population, with 58 and 29 SNVs in the AGRE and control groups, respectively (P = 0.0005), occurring in 49 and 32 individuals, respectively (Table S3). The two-fold enrichment of SNVs in the autism population persisted when we further excluded SNVs characterized as common in dbSNP (build 132), with 57 and 27 SNVs in the AGRE and control populations, respectively (P = 0.0002).
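The successive filters can be written as simple predicates. The following is a minimal sketch under stated assumptions: the `Variant` record and its field names are hypothetical, and only the 1% frequency cutoff and the reported SNV counts come from the text. The fold ratios are raw count ratios and are not normalized for cohort size (290 cases, 300 controls).

```python
from dataclasses import dataclass

RARE_THRESHOLD = 0.01  # allele frequency cutoff stated in the text (1%)


@dataclass(frozen=True)
class Variant:
    """Hypothetical record; field names are illustrative assumptions."""
    name: str
    allele_freq: float
    effect: str       # e.g. "silent", "missense", "splice"
    region: str       # e.g. "coding", "5'UTR", "3'UTR", "deep_intronic"
    in_cases: bool
    in_controls: bool


def is_rare(v):
    return v.allele_freq < RARE_THRESHOLD


def passes_first_filter(v):
    """Rare, not silent, and not shared between cases and controls."""
    shared = v.in_cases and v.in_controls
    return is_rare(v) and v.effect != "silent" and not shared


def passes_second_filter(v):
    """Additionally drop 5' UTR and deep intronic variants."""
    return passes_first_filter(v) and v.region not in {"5'UTR", "deep_intronic"}


# SNV counts (AGRE, control) reported in the text at each filtering stage
reported = {
    "non-silent, group-specific": (80, 49),
    "potentially deleterious": (58, 29),
    "excluding dbSNP common": (57, 27),
}
for label, (cases, controls) in reported.items():
    print(label, round(cases / controls, 2))
```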
Table 1. Enrichment of rare functional variants in mGluR pathway genes in autism cases detected by high-throughput sequencing.
Table 2. Rare, potentially deleterious variants identified in mGluR pathway genes in autism cases.
We applied several commonly used computational tools for pathogenicity prediction to assess the functional impact of the rare, potentially deleterious missense variants identified in the AGRE and control groups. Since the performance and reliability of methods for pathogenicity prediction vary widely, and their results typically correlate poorly [21], we compared the predictions derived from SIFT, PolyPhen2, SNP&GO and MutPred. Although SIFT did not predict a significant difference between groups, PolyPhen2, SNP&GO, and MutPred each predicted a 2- to 3-fold enrichment of damaging missense variants in the AGRE group. Overall, comparably high proportions of the variants classified as potentially deleterious in the AGRE and control groups (68% and 71%, respectively) were predicted to be functionally damaging by at least one of the four prediction tools. Because the absolute number of such variants was approximately two-fold higher in the AGRE group, this supports the notion that disruptive variants within mGluR pathway components occur at a higher rate in the autism population than in the control population.
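The "damaging by at least one of the four tools" criterion can be sketched as follows. The boolean flags are assumed to have been derived beforehand from each tool's own output and thresholds, which are not reproduced here; the toy predictions and function names are hypothetical.

```python
TOOLS = ("SIFT", "PolyPhen2", "SNP&GO", "MutPred")


def damaging_by_any(predictions):
    """True if at least one tool flags the variant as damaging."""
    return any(predictions.get(tool, False) for tool in TOOLS)


def fraction_damaging(variants):
    """Share of variants flagged as damaging by at least one tool."""
    if not variants:
        return 0.0
    return sum(damaging_by_any(p) for p in variants) / len(variants)


# Hypothetical per-variant flags, for illustration only
toy_variants = [
    {"SIFT": False, "PolyPhen2": True, "SNP&GO": False, "MutPred": False},
    {"SIFT": False, "PolyPhen2": False, "SNP&GO": False, "MutPred": False},
    {"SIFT": True, "PolyPhen2": True, "SNP&GO": True, "MutPred": True},
    {"SIFT": False, "PolyPhen2": False, "SNP&GO": False, "MutPred": False},
]
print(fraction_damaging(toy_variants))  # 0.5
```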
At the level of individual pathway genes, we identified a significant excess of SNVs in the autism population for the TSC1, TSC2, SHANK3, and HOMER1 genes (P < 0.05). Causal roles for TSC1 and TSC2 have previously been demonstrated in syndromic autism. TSC1 or TSC2 mutations can cause tuberous sclerosis complex (TSC), a syndromic disorder characterized by tumor growth in multiple organs, including the brain. Although the manifestations of TSC include ASD in up to 50% of cases [22], our findings additionally implicate TSC1 and TSC2 as risk genes for non-syndromic autism independent of their causative role in TSC. Consistent with this view, the majority of the rare, potentially disruptive TSC1/TSC2 SNVs identified in the AGRE population are novel, and none of these SNVs has been identified previously as a cause of TSC (http://chromium.liacs.nl/LOVD2/TSC/home.php). Our identification of increased genetic variation in SHANK3 in autism cases supports the emerging view of SHANK3 as an important autism-risk gene [23]. One missense variant observed in our study (R300C) was previously identified as a potential risk factor for ASD [23]. In addition, we identified a number of novel rare SHANK3 SNVs in the autism population. The over-representation of rare, potentially disruptive variants in genes previously implicated in ASD (TSC1, TSC2, SHANK3) provides validation of this approach for detecting genes that contribute genetic risk in autism.
The fourth gene displaying a significant enrichment of autism-associated rare SNVs, HOMER1, has not previously been implicated in autism. Homer1 is a scaffolding protein localized to the postsynaptic density (PSD) that interacts with a variety of PSD proteins, including mGluRs and Shank proteins [24]. Binding of Homer1 to mGluRs promotes trafficking of mGluRs to the postsynaptic membrane and couples mGluR5 to the mTOR signaling pathway [25]. Homer and Shank proteins interact to form an extended polymeric platform required for recruitment and assembly of synaptic proteins and for the structural integrity of dendritic spines [26]. Consistent with this function, the Homer-Shank interaction has been shown to promote morphological and functional maturation of dendritic spines [27]. We identified multiple rare missense variants in HOMER1 in AGRE cases but not in controls. All of the identified missense variants in HOMER1 alter residues that are invariant among mammalian species, and all but one of these residues are invariant across vertebrate species. Two of these variants (c.195G>T, M65I and c.290C>T, S97L) localize to the EVH1 (Ena/VASP homology 1) domain of Homer1, which binds to Pro-Pro-Ser-Pro-Phe motifs in mGluR1 and mGluR5 and a Pro-Pro-Glu-Glu-Phe motif in Shank3 [24]. A third potentially damaging SNV in HOMER1 (c.425C>T, P142L) affects one of the conserved prolines within the P-motif of the CRH1 (conserved region of Homer 1) domain, which serves as an internal binding site for the EVH1 domain. It has been proposed that the P-motif competes for binding of the Homer1 EVH1 domain to the proline-rich motif in target proteins such as mGluRs, thereby modulating Homer1 homo-multimerization and mGluR interaction. Interestingly, one of the GRM5 variants (c.3503T>C, L1168P) detected in AGRE samples is located in close proximity to the conserved Pro-Pro-Ser-Pro-Phe Homer1-binding motif in mGluR5. In addition, we identified an SNV in the HOMER1 3′ UTR (c.1080C>T) in the autism population only and not in the control population. Growing evidence suggests an important role for 3′ untranslated regions (UTRs) as sites of pathogenic variation, owing to the diversity and density of their cis-acting regulatory elements [28], [29]. In particular, genetic variants that alter microRNA-binding sites have been implicated in the pathogenesis of a variety of human diseases, including the neuropsychiatric disorder Tourette's syndrome [30], [31]. The identified HOMER1 3′ UTR variant, which is located 15 nucleotides distal to the translation termination codon, lies within a cluster of predicted microRNA binding sites and alters predicted seed pairing for several microRNAs, including miR-96, miR-182, miR-203, and miR-513a-3p (miRanda and Microcosm algorithms, www.microrna.org; www.ebi.ac.uk/enright-srv/microcosm [32], [33]). Based on the predicted effects on microRNA binding, this variant may perturb the efficiency and/or tissue specificity of HOMER1 mRNA translation and protein expression.
To further assess the pathogenicity of the rare, potentially disruptive HOMER1 variants uniquely identified in the autism population, we analyzed co-segregation of these variants with autism. Parents and siblings of probands in the families carrying each of the five HOMER1 variants were genotyped for the relevant HOMER1 variant as well as any other rare variants detected in the proband. Four of the variants (c.290C>T, c.425C>T, c.968G>A, and c.1080C>T) co-segregated perfectly with the autism phenotype in affected and unaffected children. Probands from two of these families carried a second rare variant in addition to the HOMER1 variant, but these other variants did not co-segregate with the autism phenotype (HOMER1 c.290C>T and SHANK3 c.898C>T; HOMER1 c.195G>T and PIK3CA c.2294+19C>T). The fifth HOMER1 variant (c.195G>T, the only HOMER1 variant carried by a female proband) was not detected in an affected sibling, suggesting that this variant may modify autism risk. Interestingly, the c.968G>A variant was present in two affected male children but absent in both parents, suggesting that this variant arose de novo in one of the parental germlines. This finding is consistent with increasing evidence that de novo CNVs and SNVs with high penetrance play major roles in autism [6], [34], [35]. The remaining four variants were transmitted to affected children by unaffected carriers, possibly reflecting incomplete penetrance of pathogenic variants among parents in families with multiplex autism [34], [36].
Although significant enrichment of rare, potentially disruptive variants in AGRE samples relative to controls was limited to the TSC1, TSC2, SHANK3, and HOMER1 genes, individual variants in additional genes suggest a role for the Ras/ERK cascade in autism susceptibility. One AGRE sample harbored an SNV in MAP2K2 (c.581-1G>T) that alters a conserved splice-acceptor site; skipping of the adjoining exon would result in a frameshift mutation within the kinase domain and is thus highly likely to be damaging. Familial segregation analysis revealed the presence of this variant in an unaffected as well as an affected sibling, indicating reduced penetrance. A potentially damaging missense variant was also detected in HRAS (c.383G>A, R128Q); this substitution alters a highly conserved basic residue required for interaction of GTP-bound H-Ras with the plasma membrane and Raf [37]. Familial segregation analysis revealed absence of this variant in an affected sibling of the proband, suggesting a modifying rather than causal role for this variant. Mutations in MAP2K2 and HRAS are responsible for cardiofaciocutaneous (CFC) syndrome and Costello syndrome, respectively, related monogenic disorders characterized by mental retardation, facial dysmorphism, cardiac defects and a high prevalence of autistic features [38], [39]. CFC and Costello syndromes are thought to be caused by gain-of-function mutations that activate the Ras/ERK pathway, whereas the MAP2K2 and HRAS variants that we identified in autism cases are most compatible with loss of protein function. These findings raise the possibility that rare genetic variation within the Ras/ERK cascade may contribute to non-syndromic autism risk independent of this pathway's role in CFC and Costello syndromes.