The signaling pathways orchestrating both the evolution and development of language in the human brain remain unknown. To date, the transcription factor FOXP2 (forkhead box P2) is the only gene implicated in Mendelian forms of human speech and language dysfunction1,2,3. It has been proposed that the amino acid composition in the human variant of FOXP2 has undergone accelerated evolution, and that this change occurred around the time of language emergence in humans4,5. However, this remains controversial, and whether the acquisition of these amino acids in human FOXP2 has any functional consequence in human neurons remains untested. Here, we demonstrate that these two amino acids alter FOXP2 function by conferring differential transcriptional regulation in vitro. We extend these observations in vivo to human and chimpanzee brain, and use network analysis to identify novel relationships among the differentially expressed genes. These data provide experimental support for the functional relevance of changes in FOXP2 that occur on the human lineage, highlighting specific pathways with direct consequences for human brain development and disease. Since FOXP2 has an important role in speech and language in humans, the identified targets may have a critical function in the development and evolution of language circuitry in humans.
The amino acid structure of FOXP2 had been highly conserved along the mammalian lineage until the common ancestor of humans and chimpanzees, when the human variant of FOXP2 acquired two different amino acids under positive selection, which has been interpreted as evidence for accelerated evolution4,5. To test whether the amino acids under positive selection in human FOXP2 have a distinct biological function, which would support the role of these changes in evolution, we expressed either human FOXP2 or the same construct mutated at two sites to yield the chimpanzee amino acid content, FOXP2chimp, in human neuronal cells without endogenous FOXP2 (Fig. 1a–f). Exogenous FOXP2 protein expressed from both constructs was localized in the nucleus as determined by immunocytochemistry (Fig. 1c–e) and subcellular fractionation (Fig. 1f), consistent with its endogenous expression. To determine if modifying two amino acids leads to changes in gene expression, we conducted whole genome microarray analysis. We identified 61 genes significantly upregulated and 55 genes downregulated by FOXP2 compared to FOXP2chimp (Supplementary Table 1), as well as genes regulated by both FOXP2 and FOXP2chimp (Supplementary Table 2). Interestingly, FOXP2chimp overexpression resulted in more changes in gene regulation than FOXP2 (Supplementary Table 3). In replicate experiments in a different human neuronal cell line, FOXP2chimp again regulated more genes than FOXP2 even though its expression was higher than FOXP2 in these cells (data not shown). To control for any potential confounding effects of FOXP2 levels, we correlated the level of every gene on the array with either FOXP2 or FOXP2chimp levels and performed random permutation testing, and found no significant differences between other genes’ correlations to either FOXP2 or FOXP2chimp. These data indicate that the differentially expressed genes are not due to different levels of FOXP2 or FOXP2chimp, and are a true indication of differential transcriptional regulation by these two proteins.
To confirm the validity of differentially expressed FOXP2 target genes, we conducted qRT-PCR using independent RNA samples. We confirmed 93% of the FOXP2 upregulated genes and 75% of the downregulated genes examined (Fig. 1g–h and Supplementary Figure 1). Five genes confirmed by qRT-PCR (COL9A1, ROR2, SLIT1, SYK, and TAGLN; Fig. 1g–h and Supplementary Figure 1) were previously identified as direct FOXP2 targets using ChIP-chip6,7. Sixty percent of promoters of the identified differentially expressed genes have at least one canonical FOXP2 binding site, 92% have at least one forkhead domain binding site, and 99% have at least one “core” FOXP2 binding site (Supplementary Table 4). The canonical FOXP2 binding site CAAATT, as well as the core site AAAT, is significantly enriched in the downregulated genes (P=3.3e-04 and P=8.6e-03, respectively) compared to randomly permuting the same number of promoters from the genome. Genes with promoters containing a canonical FOXP2 binding site are likely to be direct FOXP2 or FOXP2chimp targets.
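The promoter motif comparison described above can be illustrated with a minimal sketch. It assumes that promoter sequences are available as plain strings, that both strands are scanned, and that significance is judged with a normal approximation to the counts from random promoter draws, as in the statistical methods. The function names, the number of draws, and the seed are hypothetical choices for illustration and are not taken from the study.

```python
import math
import random

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq.upper()))

def has_motif(promoter, motif):
    """True if the motif occurs on either strand of the promoter."""
    promoter = promoter.upper()
    return motif in promoter or reverse_complement(motif) in promoter

def motif_enrichment(targets, background, motif, n_draws=1000, seed=0):
    """Compare the motif count in `targets` with random promoter draws.

    Each draw selects len(targets) promoters from `background`.
    Returns (observed, mean, sd, z, p), where p is the one-sided
    upper-tail probability under a normal approximation.
    """
    rng = random.Random(seed)
    observed = sum(has_motif(p, motif) for p in targets)
    counts = []
    for _ in range(n_draws):
        sample = rng.sample(background, len(targets))
        counts.append(sum(has_motif(p, motif) for p in sample))
    mean = sum(counts) / n_draws
    sd = math.sqrt(sum((c - mean) ** 2 for c in counts) / (n_draws - 1))
    if sd == 0:
        # No variation among the random draws: the Z-score is undefined.
        return observed, mean, sd, float("nan"), float("nan")
    z = (observed - mean) / sd
    p = 0.5 * math.erfc(z / math.sqrt(2))
    return observed, mean, sd, z, p
```

Scanning both strands means that CAAATT and its reverse complement AATTTG are counted as the same site.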
To confirm that these findings were not an artifact of the cell lines used, we further assessed whether a different primary neural cell, human neural progenitors (NHNPs), would exhibit similar differential regulation by FOXP2 and FOXP2chimp. We confirmed one-third of the genes examined in these human cells using both a different method of gene transduction, and populations of cells with greater levels of FOXP2chimp compared to human FOXP2 over-expression, which complements the SH-SY5Y data to further show that the observed relationships are not due to FOXP2 levels (Supplementary Figure 2). As an additional level of validation and to extend the findings to the level of protein, we confirmed two genes, CACNB2 and ENPP2, by immunoblotting in additional SH-SY5Y cell lines (Supplementary Figure 3).
To explore the potential function of the differential FOXP2 targets, we determined enrichment of gene ontology (GO) categories. GO categories enriched for genes upregulated by FOXP2 compared to FOXP2chimp are involved in transcriptional regulation of gene expression and cell-cell signaling. Those GO categories enriched for genes downregulated by FOXP2 compared to FOXP2chimp are important for protein and cell regulation (Supplementary Table 5). These data support the idea that FOXP2 and FOXP2chimp have distinguishable downstream effects as reflected by their differences in gene regulation.
To determine the potential mechanisms by which FOXP2 or FOXP2chimp might differentially regulate gene expression, we first examined whether either protein preferentially interacts with FOXP1 or FOXP4, two proteins known to heterodimerize with FOXP2 (ref. 8). Both FOXP2 and FOXP2chimp co-localized with FOXP1 in the cell nucleus, co-immunoprecipitated with FOXP1 as evidenced by immunoblotting, and co-immunoprecipitated with both FOXP1 and FOXP4 when assayed by mass spectrometry (Fig. 1c–e, 2a–b, and Supplementary Fig. 4b–g), ruling out a major difference in FOXP1 or FOXP4 interaction causing differential gene expression. Mass spectrometry showed no significant difference in either co-immunoprecipitation experiment, indicating that differences in hetero- or homo-dimerization did not underlie the observed differences in gene expression between the chimpanzee and human FOXP2. We also tested whether changes in cell proliferation could account for gene expression differences, but did not find significant changes in growth with either FOXP2 construct (Fig. 2c).
We next assessed whether FOXP2 and FOXP2chimp expression led to differential promoter transactivation of target genes. We selected eight genes confirmed by qRT-PCR that also contained at least one forkhead binding site (Supplementary Table 6). Six of the promoters tested showed differential regulation by FOXP2 compared to FOXP2chimp in the same direction as the microarrays (Fig. 2d–e), while two did not demonstrate significant transactivation in either direction (data not shown). In contrast, a canonical FOXP2 binding site in triplicate alone, outside of a genomic context, was regulated equally by both FOXP2 and FOXP2chimp (Supplementary Figure 5). Given the complexity of cis-acting gene transactivation elements, these data are particularly compelling considering our use of simplified 5′ promoter regions. These data demonstrate that at least a subset of differentially regulated genes is also differentially transactivated by FOXP2 and FOXP2chimp, indicating they are likely direct FOXP2 targets.
To place these gene expression changes within a more systematic context, we applied weighted gene co-expression network analysis9,10 to the entire SH-SY5Y microarray data set to examine co-regulation of gene expression across all genes. We uncovered two networks where the module eigengene was driven by differences in FOXP2 and FOXP2chimp, and one network driven by similar gene regulation (Figure 3 and Supplementary Figure 6). Using this unsupervised analysis, we found additional genes of interest that do not meet the criteria for differential expression, but that are co-regulated with differences in FOXP2 and FOXP2chimp expression (Supplementary Table 7). Strikingly, two of the genes with the most connections, so-called “hub” genes, in one of the differential networks are DLX5 and SYT4, two genes important for brain development and function11,12.
To extrapolate these findings to true in vivo expression and provide external validation, we compared the differentially expressed genes in SH-SY5Y cells to differentially expressed genes from adult human and chimpanzee brain tissue. We performed microarray analysis on tissue from three brain regions where FOXP2 is expressed in developing brain: caudate nucleus, frontal pole, and hippocampus. We examined gene expression in human compared to chimpanzee for each brain region separately as well as for all brain regions combined, for a total of eight comparisons. There was a significant overlap in seven out of eight of these comparisons, a remarkable convergence with the in vitro data (Table 1). These data are particularly notable, since the tissue was from adult brain. We surmise that a subset of the overlapping differentially expressed genes found in adult brain is the result of differential functions by FOXP2 in the developing brain, and may lead to increased vulnerability to disease. For example, mutations in both FGF14 and PPP2R2B lead to spinocerebellar ataxia (SCA27 and SCA12, respectively), which involves motor-related speech defects13,14. Since both of these genes play a critical role in cerebellar function, it is of note that patients with FOXP2 mutations have decreased gray matter in the cerebellum15, and Foxp2 knockout mice have their most pronounced morphological phenotype in the cerebellum16. Mutations in COL9A1 lead to Stickler syndrome in which patients have craniofacial abnormalities17, and patients with mutations in GJA12 present with ataxia, nystagmus, other motor impairments, and often mental retardation18.
While comparisons of developing brain between human and chimpanzees are challenged by a lack of tissue, a recent study examined gene expression in many regions of human fetal brain19. Comparing the list of 116 differentially expressed genes with those focally expressed during human fetal development, we find 14 genes specifically expressed in one brain region, including FOXP2 (Supplementary Table 8). Two regions of the human fetal brain with high FOXP2 expression19, perisylvian cortex and cerebellum, have a significant number of enriched genes that overlap with the differentially expressed FOXP2 and FOXP2chimp genes (P=1.1e-04 and P=1.3e-04, respectively; Supplementary Table 8). A significant number of the differentially expressed genes are also associated with human-specific accelerated highly conserved noncoding sequences (haCNS), but not with chimpanzee highly conserved noncoding sequences (P=1.2e-06 and P=0.04; Supplementary Table 8)19,20. We confirmed a number of these genes, such as GRM8, MAOB, PPP2R2B, PRICKLE1, and RUNX1T1, by qRT-PCR and/or with the adult in vivo dataset (Figure 1 and Table 1). Together, these data suggest that the FOXP2 differentially expressed genes identified here may have important roles in brain development and patterning, and may also have evolved cis-regulatory elements important for their expression specifically in human brain.
Previously, we identified ChIP-chip targets of FOXP2 that themselves were also under positive selection6. We hypothesized that networks of genes important for language circuitry had been positively selected through selective pressure on human brain evolution. Thus, we also examined whether any differential FOXP2 targets were themselves under positive selection. Five genes (AMT, C6orf48, MAGEA10, PHACTR2, and SH3PXD2B) met the standard criteria of Ka/Ks ≥ 1.0 for positive selection on the human lineage (Supplementary Table 9)21. These data, along with the haCNS and expression data mentioned above, suggest that a subset of differential FOXP2 targets may have co-evolved to regulate pathways involved in higher cognitive functions.
The positive selection of two amino acids in human FOXP2 was previously hypothesized as a mechanism by which human FOXP2 might assume a novel biological function with implications for speech and language evolution4,5. A recent study made an elegant attempt to examine the role of these two amino acids by generating a transgenic mouse with the human version of FOXP222. These mice exhibit a number of interesting phenotypic alterations including increases in dendritic length in striatal neurons and changes in ultrasonic vocalizations, as well as some modest changes in gene expression. Although the mouse is an experimentally tractable model system, from a strictly evolutionary standpoint, the interpretation of data obtained in the mouse specifically for the study of human evolution is challenged by the vast differences in human and mouse brain and the amount of time since the human and mouse common ancestor diverged (70 million years23). Here, we demonstrate that these two amino acid changes have a functional consequence in human cells, validate these differences in vivo in tissue, and elucidate some of the downstream pathways affected by this adaptive evolutionary change.
Using whole genome microarrays, we uncovered genes that are differentially regulated upon mutation of these two amino acids, including some with functions critical to the development of the human CNS. Moreover, this study reveals enrichment of differential FOXP2 targets with known involvement in cerebellar motor function, craniofacial formation, and cartilage and connective tissue formation, suggesting an important role for human FOXP2 in establishing both the neural circuitry and physical structures needed for spoken language. The significant overlap of human FOXP2 targets in cell lines with genes enriched in human compared to chimpanzee brain tissue presents the possibility that human and chimpanzee FOXP2 have differentially regulated targets during brain development. As suggested by King and Wilson over 30 years ago24, and reaffirmed by the sequencing of both the human and chimpanzee genomes, the phenotypic differences exhibited by humans and chimpanzees cannot be explained by differences in DNA sequence alone, and are likely due to differences in gene expression and regulation. Previous microarray studies identified differences in gene expression between human and chimpanzee brains25,26. Here, we link new whole genome expression microarray data from human and chimpanzee brain to direct differences in gene regulation by the human and chimpanzee version of the transcription factor FOXP2. Since normal FOXP2 function is critical for speech in humans, these differentially regulated targets may be relevant to the evolution and establishment or function of pathways necessary for speech and language in humans.
SH-SY5Y cells (ATCC) and human fetal neuronal progenitors (Lonza) were grown according to the manufacturer’s instructions, with some modifications (see Methods).
Total RNA was extracted using Qiagen’s RNeasy kit. Illumina HumanRef-8 v2 (SH-SY5Y samples) or v3 (tissue samples) microarrays were used and analyzed as described27. Sample information is in Methods.
Full Methods accompany this paper.
The following antibodies were used for either immunoblotting (IB) or immunofluorescence (IF): anti-FLAG (mouse monoclonal, Sigma; 1:10,000 (IB), 1:10,000 (IF)); anti-GAPDH (mouse monoclonal, Chemicon; 1:2500 (IB)); anti-beta-tubulin (rabbit polyclonal, Abcam; 1:1000 (IB)); anti-FOXP1 (ref. 6; 1:5000 (IB), 1:1000 (IF)); anti-CACNB2 (mouse monoclonal, Abcam; 1:100 (IB)); anti-ENPP2 (rabbit polyclonal, Cayman Chemical; 1:400 (IB)); goat anti-rabbit horseradish peroxidase (Cell Signaling, 1:2500); goat anti-mouse horseradish peroxidase (Chemicon, 1:5000); goat anti-mouse Alexa Fluor 488 (Invitrogen, 1:1500); goat anti-rabbit Alexa Fluor 594 (Invitrogen, 1:1500).
Stable SH-SY5Y cell lines were generated by transfecting cells with pCMV-Tag4a expression constructs using FuGENE (Roche Applied Science) according to the manufacturer’s instructions. Populations of stable cells were selected using 1 mg/ml Geneticin (Invitrogen). Multiple independent lines were generated from independent transfections. Stable human fetal neuronal progenitor cell lines were generated by transducing cells with lentiviruses as previously described28. FOXP2-producing lentiviral vectors were generated by replacing the eGFP in pLUGIP (ATCC) with FOXP2.
Nuclear extracts were incubated with either 1 μg of FLAG antibody (Sigma) or a polyclonal FOXP1 antibody (ref. 6).
Equal numbers of cells (2.0E+04) were plated at time zero and counted every subsequent day after trypsinization using a hemacytometer.
293T cells (ATCC) were transfected with 50ng of reporter construct expressing Photinus pyralis (firefly) luciferase, 1ng of Renilla luciferase plasmid (pRL-EF), and 50ng of pCMV-Tag4a FOXP2 expression plasmid using FuGENE (Roche Applied Science) according to the manufacturer’s instructions. Forty-eight hours later, cells were lysed and analyzed using the dual luciferase reporter assay system (Promega) according to the manufacturer’s instructions. Co-transfection of Renilla was used for transfection normalization, and values were additionally normalized to cells transfected with a promoter-less luciferase construct. Promoter information is in Supplementary Table 6. The canonical FOXP2 binding site driving luciferase was generated by cloning AATTTG in triplicate into pGL4 (Promega).
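The two normalization steps described above amount to a ratio of ratios. The sketch below assumes that raw firefly and Renilla luminescence readings are available for both the promoter construct and the promoter-less construct under the same expression plasmid. The function and argument names are hypothetical, and the exact pairing of wells used in the study is not specified here.

```python
def relative_luciferase(firefly, renilla, empty_firefly, empty_renilla):
    """Return promoter activity relative to the promoter-less construct.

    firefly, renilla: readings from cells carrying the promoter reporter.
    empty_firefly, empty_renilla: readings from cells carrying the
    promoter-less luciferase construct.
    """
    # Step 1: correct for transfection efficiency with the Renilla signal.
    ratio = firefly / renilla
    empty_ratio = empty_firefly / empty_renilla
    # Step 2: express activity relative to the promoter-less background.
    return ratio / empty_ratio
```

A value above 1 indicates activity above the promoter-less background, and comparing the values obtained with FOXP2 and FOXP2chimp gives the differential transactivation.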
GO analysis was performed as described6 using DAVID (http://david.abcc.ncifcrf.gov). The differentially expressed genes were compared to all of the genes on the microarrays and a P value computed using a Fisher’s Exact Test.
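As an illustration of the test named above, the following sketch computes a one-sided Fisher’s Exact Test P value for over-representation of a category among the differentially expressed genes, using all genes on the microarray as the background. It is a plain hypergeometric tail written with the standard library only, and the function name and argument names are hypothetical. The score reported by DAVID itself may be computed differently.

```python
from math import comb

def fisher_enrichment_p(k, n, K, N):
    """One-sided Fisher's exact P value for over-representation.

    k: differentially expressed genes annotated to the category
    n: all differentially expressed genes
    K: genes on the array annotated to the category
    N: all genes on the array
    Returns P(X >= k) for X drawn from the hypergeometric distribution.
    """
    total = comb(N, n)
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total
```

Because the sum uses exact integer arithmetic before the final division, the result stays accurate for array-sized backgrounds.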
Whole cell protein lysates were generated and immunoblotted as described28.
Cells were grown on glass coverslips, fixed in 2% paraformaldehyde, and permeabilized in 0.2% Triton-X. TBST containing 10% milk and 10% normal goat serum was used as blocking solution at room temperature for one hour. Antibodies were diluted in TBS with 0.25% BSA, 0.25% normal goat serum and 0.1% Triton-X and applied to cells overnight at 4°C. Secondary antibodies were diluted in blocking solution and added at room temperature for one hour. Coverslips were mounted to glass slides and images taken using a Zeiss Axio Imager D1.
FOXP2 immunoprecipitates were precipitated by the addition of trichloroacetic acid and proteolyzed by the sequential addition of Lys-C and trypsin proteases29. Digested peptide samples were then analyzed by mass spectrometry as described29. Proteins were considered to be present in a sample if at least two peptides per protein were identified using a false positive rate of less than 5% per peptide as determined using a decoy database strategy30.
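The two acceptance criteria above can be sketched as a simple filter. This sketch assumes that each peptide identification carries a search score and a flag marking hits to the decoy database, and that the false positive rate at a score threshold is estimated as the number of decoy hits divided by the number of target hits. That estimator is one common form of the decoy strategy and is an assumption here, as are the function names and the record layout.

```python
from collections import defaultdict

def estimated_false_positive_rate(psms, threshold):
    """Estimate the false positive rate at a score threshold.

    psms: list of (peptide, protein, score, is_decoy) tuples.
    The estimate is decoy hits divided by target hits at or above
    the threshold (an assumed form of the decoy strategy).
    """
    targets = sum(1 for _, _, s, d in psms if s >= threshold and not d)
    decoys = sum(1 for _, _, s, d in psms if s >= threshold and d)
    return decoys / targets if targets else 0.0

def proteins_present(psms, threshold, min_peptides=2):
    """Return proteins supported by at least `min_peptides` distinct
    non-decoy peptides scoring at or above the threshold."""
    peptides = defaultdict(set)
    for peptide, protein, score, is_decoy in psms:
        if score >= threshold and not is_decoy:
            peptides[protein].add(peptide)
    return sorted(p for p, found in peptides.items() if len(found) >= min_peptides)
```

In practice the threshold would be raised until the estimated rate falls below 5%, and only then would the two-peptide rule be applied.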
For the SH-SY5Y data, we analyzed four biological replicates of each genotype from three independently generated cell lines for a total of 12 microarrays per genotype. Each of these cell lines was created from populations of cells rather than single clones, and as such, the expression data represents changes from hundreds of independent integrations throughout the cells’ genomes. Further, as the endogenous FOXP2 expression is very low in SH-SY5Y cells, the potential confound of heterodimerization with endogenous human FOXP2 is mitigated in these cells. For the tissue data, we analyzed three to six independent samples for each brain region in each species.
For FOXP2 correlations, we computed the average correlation for each gene on the microarray to either the level of the human or the chimpanzee FOXP2. We then derived the absolute difference in correlation for each gene between the human and chimpanzee FOXP2 arrays. The average of these differences was not statistically different from performing the same test, but randomizing the correlation values for all of the genes on the arrays or using the values from only the differentially expressed genes. For promoter binding site calculations, we calculated the number of promoters from differentially expressed genes with a given motif and compared it to the average number from a random selection of the same number of promoters from the genome. We assumed a normal distribution, and a P value (derived from the Z-score) of less than 0.05 was called significant. Similar analysis was done for comparing genes with a haCNS and expression in human fetal brain. For microarray overlap comparisons, we included the number of differentially expressed genes as well as the total number of probesets on the microarrays for each comparison. We used a hypergeometric distribution test with 10,000 permutations to calculate the mean and standard deviation of the overlap. We assumed a normal distribution, and a P value (derived from the Z-score) of less than 0.05 was called significant.
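The microarray overlap comparison can be sketched as follows, assuming that two gene lists of known size are drawn from a common pool of probesets and that the observed overlap is compared with the overlaps obtained from random draws. The function name, the seed, and the choice to hold one list fixed while resampling the other are illustrative assumptions. Only the list sizes, the probeset total, and the 10,000 permutations come from the description above.

```python
import math
import random

def overlap_z_test(n_a, n_b, n_total, observed, n_perm=10000, seed=0):
    """Assess the overlap between two gene lists by random draws.

    n_a, n_b: sizes of the two differentially expressed gene lists
    n_total: total number of probesets on the microarray
    observed: number of genes shared by the two real lists
    Returns (mean, sd, z, p), with p the one-sided upper-tail
    probability under a normal approximation.
    """
    rng = random.Random(seed)
    universe = range(n_total)
    # Holding list A fixed and resampling list B gives the same overlap
    # distribution as resampling both lists.
    list_a = set(range(n_a))
    overlaps = []
    for _ in range(n_perm):
        draw = rng.sample(universe, n_b)
        overlaps.append(sum(1 for gene in draw if gene in list_a))
    mean = sum(overlaps) / n_perm
    sd = math.sqrt(sum((x - mean) ** 2 for x in overlaps) / (n_perm - 1))
    if sd == 0:
        # No variation among the random draws: the Z-score is undefined.
        return mean, sd, float("nan"), float("nan")
    z = (observed - mean) / sd
    p = 0.5 * math.erfc(z / math.sqrt(2))
    return mean, sd, z, p
```

The mean of the random overlaps should approach n_a × n_b / n_total, which provides a quick check on the simulation.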
Mutagenesis of pCMV-Tag4a/FOXP2 (ref. 6) was carried out using the GeneTailor Site-Directed Mutagenesis System (Invitrogen) according to the manufacturer’s instructions using the following primers: site 1 (asparagine to threonine): F-5′-CCTCCTCGACTACCTCCTCCACAACTTCCAAAGC-3′; R-5′-GGAGGAGGTAGTCGAGGAGGAATTGTTAGTA-3′; site 2 (serine to asparagine): F-5′-ATGGACAGTCTTCAGTTCTAAACGCAAGACGAGA-3′; R-5′-TAGAACTGAAGACTGTCCATTCACTATGGAA-3′. Mutagenesis was confirmed by both sequencing and mass spectrometry.
WGCNA was performed as previously described9,10. Briefly, genes were chosen for inclusion into the network based on their consistent presence on the array and high coefficient of variation, and they were clustered based on their topological overlap. For each module, singular value decomposition (X = UDV′) was performed, and the expression was re-calculated without the first principal component, which corresponded to cell line differences and therefore represented an experimental batch effect. The modules reported in this study were created using these adjusted expression data.
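Removing the first principal component from a decomposition X = UDV′ can be sketched by setting the largest singular value to zero and multiplying the factors back together. The sketch below uses NumPy and assumes an expression matrix that has already been scaled as required. Whether the data were centered beforehand is not stated here, and the function name is hypothetical.

```python
import numpy as np

def remove_first_component(X):
    """Return X with its first principal component removed.

    X is decomposed as U @ diag(d) @ Vt. The largest singular value
    is set to zero before the matrix is reassembled.
    """
    U, d, Vt = np.linalg.svd(np.asarray(X, dtype=float), full_matrices=False)
    d = d.copy()
    d[0] = 0.0  # singular values are returned in descending order
    # U * d scales each column of U by the matching singular value.
    return (U * d) @ Vt
```

The result has the same shape as the input, so downstream clustering can run on it unchanged.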
We thank Michael Oldham for generating the Illumina microarray mask file, Jing Ou and Elizabeth Spiteri for performing site-directed mutagenesis, Leslie Chen for technical assistance, and Lauren Kawaguchi for lab management. Human tissue was obtained from the NICHD Brain and Tissue Bank for Developmental Disorders at the University of Maryland, Baltimore, MD (NICHD Contract No. N01-HD-4-3368 and N01-HD-4-3383). The role of the NICHD Brain and Tissue Bank is to distribute tissue, and it therefore cannot endorse the studies performed or the interpretation of results. This work was supported by grants R21MH075028 and R37MH60233-06A1 (D.H.G.); T32HD007032, an A.P. Giannini Foundation Medical Research Fellowship, and a NARSAD Young Investigator Award (G.K.); T32MH073526 (K.W.); and a James S. McDonnell Foundation grant, JSMF#21002093 (T.M.P.).
Author Contributions G.K. and D.H.G. designed the study, analyzed the data, and wrote the paper; G.K. performed all of the experiments; J.M.B. made contributions to an earlier phase of the project including generating cell lines, immunoblotting and qRT-PCR; K.W. performed statistical analysis and WGCNA analysis; G.C. conducted promoter analysis, and G.C. and F.G. analyzed the microarray data; Z.O.J. and J.A.W. performed mass spectrometry; S.P. performed some of the qRT-PCR; T.M.P. performed tissue dissections and provided non-human primate samples; all authors discussed the results and commented on the manuscript.
Author Information Gene expression data have been deposited into NCBI Gene Expression Omnibus (GEO; www.ncbi.nlm.nih.gov/geo), and are accessible using GEO series accession number GSE18142.