Intelligence is a highly heritable trait for which it has proven difficult to identify the actual genes. In the past decade, five whole-genome linkage scans have suggested genomic regions important to human intelligence; however, so far none of the responsible genes or variants in those regions have been identified. Apart from these regions, a handful of candidate genes have been identified, although most of these are in need of replication. The recent growth in publicly available data sets that contain both whole genome association data and a wealth of phenotypic data serves as an excellent resource for fine mapping and candidate gene replication. We used the publicly available data of 947 families participating in the International Multi-Centre ADHD Genetics (IMAGE) study to conduct an in silico fine mapping study of previously associated genomic locations, and to attempt replication of previously reported candidate genes for intelligence. Although this sample was ascertained for attention deficit/hyperactivity disorder (ADHD), intelligence quotient (IQ) scores were distributed normally. We tested 667 single nucleotide polymorphisms (SNPs) within 15 previously reported candidate genes for intelligence and 29,451 SNPs in five genomic loci previously identified through whole genome linkage and association analyses. Significant SNPs were tested in four independent samples (4,357 subjects), one ascertained for ADHD, and three population-based samples. Associations between intelligence and SNPs in the ATXN1 and TRIM31 genes and in three genomic locations showed replicated association, but only in the samples ascertained for ADHD, suggesting that these genetic variants become particularly relevant to IQ on the background of a psychiatric disorder. © 2010 Wiley-Liss, Inc.
Intelligence is a highly heritable complex trait, for which it is hypothesized that many genes of small effect size contribute to its variability [McClearn et al., 1997; Plomin, 1999]. Almost a decade after the completion of a rough draft of the human genome sequence, major efforts have been undertaken to identify common variations related to inter-individual differences in intelligence. Plomin and coworkers [Plomin, 1999; Plomin et al., 2001, 2004; Butcher et al., 2005, 2008] conducted several genome-wide association (GWA) studies and showed significant association of a functional polymorphism in ALDH5A1 (aldehyde dehydrogenase 5 family) (MIM: 610045) on chromosome 6p with intelligence. Whole genome linkage scans for intelligence [Posthuma et al., 2005; Buyske et al., 2006; Dick et al., 2006; Luciano et al., 2006] reported two areas of genome-wide significant linkage for general intelligence on the long arm of chromosome 2 (2q24.1-31.1) and the short arm of chromosome 6 (6p25-21.2), and several areas of suggestive linkage (4p, 7q, 14q, 20p, 21p), following the Lander and Kruglyak guidelines. The region on chromosome 6 (6p25-21.2) overlaps with the locus (6p24.1) identified in the genome-wide association study performed by Butcher et al. Converging evidence from these whole genome studies provides support for the involvement of six different chromosomal regions, 2q24.1-31.1, 2q31.3, 6p25-21.2, 7q32.1, 14q11.2-12, and 16p13.3, in human intelligence (see Table I).
Apart from whole genome searches, several candidate gene-based association analyses have also reported significant associations with human intelligence [for a review see Posthuma and de Geus, 2006]. Based on a literature search, we identified 16 genes that have been associated at least once (P-value ≤0.05) with intelligence as measured with an intelligence quotient (IQ) test: DTNBP1 (dystrobrevin-binding protein 1) (MIM: 607145), ALDH5A1 (aldehyde dehydrogenase 5 family, member A1) (MIM: 610045), IGF2R (insulin-like growth factor 2 receptor) (MIM: 147280), CHRM2 (cholinergic muscarinic receptor 2) (MIM: 118493), BDNF (brain-derived neurotrophic factor) (MIM: 113505), CTSD (cathepsin D) (MIM: 116840), DRD2 (dopamine receptor D2) (MIM: 126450), KL (klotho) (MIM: 604824), APOE (apolipoprotein E) (MIM: 107741), SNAP25 (synaptosomal-associated protein, 25 kDa) (MIM: 600322), PRNP (prion protein (p27-30)) (MIM: 176640), CBS (cystathionine-beta-synthase) (MIM: 236200), COMT (catechol-O-methyltransferase) (MIM: 116790), DNAJC13 (DnaJ (Hsp40)) (GeneID: 23317), FADS3 (fatty acid desaturase 3) (MIM: 606150), and TBC1D7 (TBC1 domain family, member 7) (GeneID: 51256) (see Table II).
One of the major hurdles in identifying genes for complex traits is the need for replication to distinguish false positives from genuine associations. Of all reported genetic association studies in the literature, only 4% have shown replicable association according to a 2002 search [Hirschhorn et al., 2002]. At present, searching for “genetic” and “association” in PubMed gives 69,950 hits (June 2010), while adding the keywords “replicated” or “validated” results in 1,318 studies. In other words, in this rough scan around 1.9% (1,318 of 69,950) of the total reported genetic associations are reports of validated genetic association. The field of intelligence is no exception. Of the 16 genes mentioned above, only three (CHRM2 [Comings et al., 2003; Gosso et al., 2006b, 2007; Dick et al., 2007], SNAP25 [Gosso et al., 2006a, 2008b], and BDNF [Tsai et al., 2004; Harris et al., 2006]) have shown replicated association with intelligence across independent samples. Several other genes (e.g., COMT, DTNBP1) have repeatedly shown association to a range of cognitive traits, but have not been replicated for association with intelligence as measured with an IQ test [Small et al., 2004; Savitz et al., 2006]. The reasons for lack of replication are many and include different ethnicity, insufficient sample size, different phenotype, opposite effect direction, or the fact that no replication was attempted at all.
The recent growth in publicly available data sets that contain whole genome association data as well as a wealth of phenotypic data serves as an excellent resource for rapid replication efforts. In the public domain, the Genetic Association Information Network (GAIN)—International Multi-Centre ADHD Genetics (IMAGE) sample is the sole GWA sample with information on IQ scores. In the current article, we use data from the IMAGE project to (a) attempt replication of previous association findings for the 16 genes associated with normal intelligence at least once, and (b) explore the six chromosome regions previously implicated in human intelligence. Replication of associations found in the IMAGE sample (discovery sample) is subsequently attempted in four independent samples. Of these four samples, one is ascertained for attention deficit/hyperactivity disorder (ADHD), as is the IMAGE sample, and three are population-based samples. This allows us to investigate whether associated single nucleotide polymorphisms (SNPs) found in the IMAGE sample reflect an association with intelligence specific to an ADHD population, or are more generally associated with intelligence.
Subjects of the IMAGE project have been described in detail elsewhere [Brookes et al., 2006; Kuntsi et al., 2006; Neale et al., 2008]. Briefly, 947 European Caucasian nuclear families (2,844 individuals) from eight countries (Belgium, England, Germany, Holland, Ireland, Israel, Spain, and Switzerland) were included in the analysis. Families had been recruited based on having one child with ADHD and another who would provide DNA and quantitative trait data. In addition, both parents had to be available for DNA sampling.
IQ scores were available for 606 unrelated probands (for which we also had genotyping data, see below), of which 554 were males, with a mean age of 10.99 (SD 2.74). IQ was measured with the WISC-III-R (Wechsler Intelligence Scales for children) [Wechsler, 1991] or the WAIS-III-R (Wechsler Adult Intelligence Scale) [Wechsler, 1997] when appropriate (for children aged 17 and older).
The Verbal subtests Vocabulary and Similarities, and the Performance subtests Picture Completion and Block Design from the WISC were used to obtain an estimate of a child's IQ (prorated following procedures described by Sattler). Age-appropriate national population norms were available for each participating site included in the IMAGE sample and these were used to derive standardized estimates of intelligence [Sonuga-Barke et al., 2008]. Standardized Full-Scale IQ (FSIQ) scores had a median of 101.6 and a mean of 100.7 (SD 15.7). Skewness of the distribution of IQ scores was 0.063 while kurtosis was −0.075. The Shapiro–Wilk test was non-significant (P = 0.517), suggesting that the distribution of IQ in the IMAGE sample did not deviate from a normal distribution (see Fig. 1).
The parents of the probands filled out the Conners' questionnaire, which provides a quantitative measure of ADHD symptoms. Correlations between the symptom scores on the Conners' questionnaire and IQ were −0.066 (P = 0.074) for the total score, −0.029 (P = 0.442) for the inattention score, and −0.084 (P = 0.024) for the hyperactivity/impulsivity score. Although this sample was originally ascertained for ADHD, and ADHD and IQ have been reported to be associated [Frazier et al., 2004], these findings suggest that in this sample IQ scores are normally distributed (as would be expected in a population-based sample) and are at most very weakly related to ADHD symptom scores. As there were mean fluctuations across collection sites, we calculated Z-scores within each site/country. The use of Z-scores ensures that no mean IQ differences remain across subpopulations in the IMAGE sample and therefore guards against spurious associations due to the known subpopulation structure.
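The within-site standardization described above can be illustrated with a minimal sketch. The site labels and scores below are hypothetical toy values, and the use of the population standard deviation is an assumption; the text does not state which estimator was used.

```python
from collections import defaultdict
from statistics import mean, pstdev

def zscore_within_site(records):
    """Standardize IQ within each site.

    records: list of (site, iq) pairs.
    Returns the Z-scores in the same order as the input.
    """
    by_site = defaultdict(list)
    for site, iq in records:
        by_site[site].append(iq)
    # Per-site mean and SD (population SD is an assumption of this sketch).
    stats = {site: (mean(v), pstdev(v)) for site, v in by_site.items()}
    return [(iq - stats[site][0]) / stats[site][1] for site, iq in records]

# Hypothetical toy data: two sites with different mean IQ.
records = [("site_a", 95.0), ("site_a", 100.0), ("site_a", 105.0),
           ("site_b", 108.0), ("site_b", 112.0), ("site_b", 116.0)]
z = zscore_within_site(records)
# After standardization both sites have mean 0, so a SNP whose allele
# frequency differs between sites cannot pick up the site mean difference.
```

In this toy example the 12-point difference in site means disappears after standardization, which is the property the analysis relies on.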
The IMAGE study was genotyped as part of the GAIN initiative, a public–private partnership of the FNIH (Foundation for the National Institutes of Health, Inc.) that currently involves NIH, Pfizer, Affymetrix, Perlegen Sciences, Abbott, and the Eli and Edythe Broad Institute of MIT and Harvard (http://www.fnih.org). Genotyping was conducted at Perlegen Sciences using their genotyping platform, which comprises approximately 600,000 tagging SNPs designed to be in high linkage disequilibrium with untyped SNPs for the HapMap populations. Genotype data were cleaned by NCBI (The National Center for Biotechnology Information). Quality control analyses were processed using the GAIN QA/QC Software Package (version 0.7.4) developed by Gonçalo Abecasis and Shyam Gopalakrishnan at the University of Michigan. Details of the genotyping and data cleaning process for the IMAGE study (study accession phs000016.v1.p1) have been reported elsewhere [Neale et al., 2008].
Briefly, we selected only SNPs with a minor allele frequency (MAF) ≥0.05 that did not deviate from Hardy–Weinberg equilibrium (HWE) (P ≥ 1 × 10⁻⁶). Genotypes causing Mendelian inconsistencies were identified by PLINK and removed (http://pngu.mgh.harvard.edu/purcell/plink/) [Purcell et al., 2007]. We additionally removed SNPs that failed the quality control metrics for the other two GAIN Perlegen studies [i.e., Major Depression Disorder (dbGAP study accession, phs000020.v1.p1) and Psoriasis (dbGAP study accession, phs000019.v1.p1), see Neale et al., 2008]. With this filtering, 384,401 SNPs were retained in the final data set. One genomic intelligence locus (7q32.1) could not be included in the analysis because all 10 SNPs inside this relatively small area failed the quality control. APOE was also not included, as no SNPs were genotyped in or near this gene. Fifteen genes (ALDH5A1, BDNF, CBS, CHRM2, COMT, CTSD, DNAJC13, DRD2, DTNBP1, FADS3, IGF2R, KL, PRNP, SNAP25, and TBC1D7) and five genomic areas (2q24.1-31.1, 2q31.3, 6p25-21.2, 14q11.2-12, and 16p13.3) were thus included in the association analysis. From the cleaned data set, we selected all genotyped SNPs that lie in these candidate genes and genomic loci, including 10 kb both upstream and downstream of each gene or genomic locus.
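As an illustration of the MAF and HWE filters described above, the following sketch applies both thresholds to genotype counts for a single SNP. It is a simplification under stated assumptions: the HWE P-value here comes from a 1-df chi-square goodness-of-fit test, whereas PLINK uses an exact test, and the function names and genotype counts are hypothetical.

```python
from math import erfc, sqrt

def maf(n_aa, n_ab, n_bb):
    """Minor allele frequency from genotype counts."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    return min(p, 1 - p)

def hwe_chisq_p(n_aa, n_ab, n_bb):
    """P-value of a 1-df chi-square test for Hardy-Weinberg equilibrium.

    This is a simplification; an exact test is preferable for rare alleles.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 degree of freedom.
    return erfc(sqrt(chi2 / 2))

def passes_qc(n_aa, n_ab, n_bb, min_maf=0.05, min_hwe_p=1e-6):
    """Apply the thresholds quoted in the text (MAF >= 0.05, HWE P >= 1e-6)."""
    return (maf(n_aa, n_ab, n_bb) >= min_maf
            and hwe_chisq_p(n_aa, n_ab, n_bb) >= min_hwe_p)

# Hypothetical genotype counts for three SNPs.
common_in_hwe = passes_qc(360, 480, 160)   # common allele, HWE proportions
no_heterozygotes = passes_qc(500, 0, 500)  # extreme HWE deviation
rare_variant = passes_qc(980, 20, 0)       # MAF of 0.01
```

Only the first SNP would be retained; the second fails the HWE threshold and the third fails the MAF threshold.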
To increase coverage in the targeted genomic areas, we used the imputation approach implemented in MACH [Li et al., 2006], which imputes genotypes of SNPs that are not directly genotyped in the data set, but that are present on a reference panel. MACH is a Markov Chain-based haplotyper, which obtains an imputation of each unknown genotype using short stretches of DNA that are shared among unrelated individuals. The reference panel used was HapMap III phased data in MACH input format, which is publicly available for download from the MACH website (http://www.sph.umich.edu/csg/abecasis/MaCH/download/).
Genomic coverage of the candidate regions was extended to ~1.5 million common SNPs by imputation using the HapMap phase III CEU data (NCBI build 36 (UCSC hg18)) as the reference sample. Imputed SNPs were selected if r2 was above 0.3 with the reference allele. Additionally, an imputation quality score of at least 0.90 was required for a SNP to be included in further association testing.
Gene coverage was determined by the sum of the typed and imputed SNPs as well as the tagged SNPs (based on HapMap information) divided by the total known common SNPs (again based on HapMap information) within a gene, using WGAviewer [Ge et al., 2008]. On average, after imputation, gene coverage was 85% in the candidate genes, with 100% coverage for DNAJC13, TBC1D7, DTNBP1, ALDH5A1, BDNF, and CTSD. In total, we analyzed 667 SNPs in the candidate genes and 29,451 SNPs in the genomic loci.
We carried out association testing using an additive linear regression model implemented in PLINK for genotyped markers, and in MACH2QTL [Li et al., 2009], for imputed SNPs, taking into account dosage information. All IQ scores were precorrected for sex and age and no other covariates were included in the model. As mentioned above, Z-scores were calculated within each of the different sites included in IMAGE, such that there were no mean differences in IQ between sites. Analyses included only SNPs with a minimum 80% genotyping rate and individuals with <20% of missing genotype data. SNPs in candidate genes that had a nominal P-value <0.05, and the top five SNPs from the genomic regions, were selected for testing in the four replication samples.
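To make the additive model concrete, the sketch below regresses a standardized phenotype on allele dosage (0, 1, or 2 copies of the reference allele; fractional values for imputed SNPs are handled the same way). It is not the PLINK or MACH2QTL implementation: the function name and data are hypothetical, and the P-value uses a normal approximation to the t distribution, which is only reasonable for large samples.

```python
from math import sqrt
from statistics import NormalDist, mean

def additive_test(dosages, phenotypes):
    """Ordinary least squares of phenotype on allele dosage.

    Returns (beta, standard_error, two_sided_p). The P-value uses a
    normal approximation, an assumption of this sketch.
    """
    n = len(dosages)
    mx, my = mean(dosages), mean(phenotypes)
    sxx = sum((x - mx) ** 2 for x in dosages)
    sxy = sum((x - mx) * (y - my) for x, y in zip(dosages, phenotypes))
    beta = sxy / sxx                      # change in phenotype per allele
    intercept = my - beta * mx
    rss = sum((y - intercept - beta * x) ** 2
              for x, y in zip(dosages, phenotypes))
    se = sqrt(rss / (n - 2) / sxx)        # standard error of the slope
    z = beta / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return beta, se, p

# Hypothetical toy data: six subjects, Z-scored IQ.
dosages = [0, 0, 1, 1, 2, 2]
phenotypes = [-1.0, -0.5, 0.1, -0.1, 0.6, 0.9]
beta, se, p = additive_test(dosages, phenotypes)
```

Here `beta` is the estimated change in Z-scored IQ per copy of the reference allele, which is the quantity compared across samples in the replication step.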
Four replication samples totaling 4,357 independent subjects were available for replication of top findings of the IMAGE sample. One sample was ascertained for ADHD, and three samples were population-based samples.
The DUKE cohort consisted of 216 Americans from 108 families with a DSM-IV diagnosed ADHD-affected proband [Kollins et al., 2008]. Families were enrolled from two collection sites: Duke University Medical Center, Durham, NC, and University of North Carolina, Greensboro, NC. All participating family members provided written informed consent that had been approved by the institutional review board at the ascertaining institution. The WAIS-III was administered to individuals 17 years of age or older, and the WISC-IV was given to children ages 6–16. The Wechsler Preschool and Primary Scale of Intelligence—3rd edition (WPPSI-III) was used for children under the age of 6 [Wechsler, 2002]. FSIQ was estimated for both adults and children from the vocabulary and block design subtests (M = 109.5 and SD = 12.9). Parents and children were genotyped using the Illumina Infinium HumanHap300 duo chip (Illumina, Inc., San Diego, CA). Quality of the Illumina data was assessed using PLINK (http://pngu.mgh.harvard.edu/purcell/plink/) [Purcell et al., 2007]. A total of 315,980 SNPs were submitted for quality checks. Call rates exceeded 98% for all individuals, one individual was excluded due to a gender discrepancy, and two individuals were excluded due to per-family Mendelian errors in excess of 1%. Out of the 315,980 SNPs submitted, 6,109 SNPs were excluded based on a MAF <0.05, 13 SNPs were excluded due to Mendelian errors in >4 families, and 629 SNPs were excluded due to deviations from HWE (P < 0.000001). In total, 3 individuals and 6,751 SNPs did not pass our quality control checks. Two Centre d'Etude du Polymorphisme Humain (CEPH) controls and blinded duplicates were used for every 94 samples and required to match 100%. Data were genome-wide imputed with the use of the phased data from the HapMap samples (CEU; build 36, release 22) and MACH. Association analysis was carried out using QTDT (http://www.sph.umich.edu/csg/abecasis/QTDT/).
QTDT adopts the between/within model as used by Fulker et al. and Purcell et al. as implemented in the QFAM package. We tested for population stratification by comparing the between and within family components of association, using a variant of the orthogonal model [Abecasis et al., 2000]. None of the tested SNPs showed signs of stratification in this population.
The Avon Longitudinal Study of Parents and Children (ALSPAC) is a large population-based, prospective birth cohort consisting initially of over 13,000 women and their children recruited from the Bristol area, UK in the early 1990s [Golding et al., 2001]. ALSPAC has extensive data collections on health and development of children and their parents from the 8th gestational week onwards. Ethical approval for the study was obtained from the ALSPAC Law and Ethics Committee and the local research ethics committees. FSIQ within ALSPAC was measured at the age of 8 with the WISC-III [Wechsler et al., 1992]. A short version of the test consisting of alternate items only (except the coding task) was applied by trained psychologists [Joinson et al., 2007]. Verbal (Information, Similarities, Arithmetic, Vocabulary, and Comprehension) and Performance (Picture Completion, Coding, Picture Arrangement, Block Design, and Object Assembly) subtests were administered; the subtests were scaled and scores for FSIQ derived. ALSPAC children (1,543) were initially genotyped at 317,504 SNPs on the Illumina HumanHap317K SNP chip. Individuals exhibiting cryptic relatedness, non-European ancestry, high genome-wide heterozygosity, and/or high missing rates were excluded as described in Timpson et al., leaving 1,518 individuals in the analysis, of whom 1,495 had information on FSIQ within a range of ±4 SD (M = 106.8, SD = 15.6). Markers with MAF <1%, SNPs with >5% missing genotypes, and markers that failed an exact test of HWE (P < 5 × 10⁻⁶) were excluded from further analyses, leaving 310,505 SNPs that passed quality control. GWAS analysis was performed on sex and population stratification-adjusted (first five principal components from Eigenstrat analysis) [Price et al., 2006] Z-standardized IQ scores. Genome-wide imputation was done using the HapMap phase I-II CEU data (release 22, NCBI build 36) as the reference sample and MACH software.
The QIMR adolescent cohort is a population-based cohort, consisting of 1,670 Australians (793 male, 877 female) from 741 families with mean age of 16.4 (SD = 4). FSIQ was assessed with the Multidimensional Aptitude Battery [MAB; Jackson, 1984]. Five subtests were administered (three Verbal: Information, Arithmetic, Vocabulary; two Performance: Spatial, Object Assembly) and from these a standardized FSIQ measure was obtained. FSIQ had a mean of 112.6 (SD = 12.8). Genotyping was done using the Illumina 610K SNP platform and Illumina BeadStudio software, with 529,721 SNPs passing QC. Data were imputed to ~2.3 million SNPs with the use of the phased data from the HapMap samples (CEU; build 36, release 22) and MACH.9, described in detail in Medland et al. (see Project 5: ADOL deCODE). Individual SNPs were tested for association with the family-based score test implemented in Merlin. This study was approved by the QIMR human research ethics committee and informed written consent was obtained from all participants.
The LBC1936 consisted of 1,091 individuals who, at the age of ~11 years, participated in the Scottish Mental Survey of 1947, when they took a validated mental ability test, the Moray House Test No. 12 (MHT). Briefly, at a mean age of 69.6 years (SD = 0.8) participants of LBC1936 were recruited to a study to investigate the causes of cognitive ageing. They underwent a series of cognitive, physical, and biochemical tests at the Wellcome Trust Clinical Research Facility (WTCRF) at the Western General Hospital, Edinburgh. For this study, a general cognitive ability factor was derived from principal components analysis of six Wechsler Adult Intelligence Scale-III UK (WAIS-III) subtests (matrix reasoning, letter number sequencing, block design, symbol search, digit span backwards, and digit symbol), as described previously [Luciano et al., 2009]. The general cognitive ability factor scores were corrected for age in days and sex, and converted to IQ scores (mean = 100; SD = 15). DNA was isolated by standard procedure at the WTCRF Genetics Core, Western General Hospital, Edinburgh from 1,071 individuals. Twenty-nine samples failed quality control preceding the genotyping procedure. The remaining 1,042 samples (all blood-extracted) were genotyped at the WTCRF Genetics Core using the Illumina610-Quadv1 chip. These samples were then subjected to the following quality control procedures, after which 1,005 samples remained. All individuals were checked for disagreement between genetic and reported gender (n = 12 removed). Relatedness between subjects was investigated and, for any related pair of individuals, one was removed (n = 8). Samples with a call rate ≤0.95 (n = 16), and those showing evidence of non-Caucasian descent by multidimensional scaling (n = 1), were also removed. SNPs were included in the analyses if they met the following conditions: call rate ≥0.98, MAF ≥0.01, and HWE test with P ≥ 0.001. The final number of SNPs included in the genome-wide association study was 549,091.
IQ scores and genotype were available for 976 individuals. Genomic coverage was extended to ~2.5 million common SNPs by imputation using the HapMap phase II CEU data (NCBI build 36 (UCSC hg18)) as the reference sample and MACH software. SNPs with low imputation (r2 < 0.30), low MAF (<0.01), and divergence from HWE (P < 0.001) were excluded so that respective SNP and sample call rates were 0.98 and 0.95.
The primary (IMAGE) sample of 606 subjects had sufficient (80%) statistical power to detect SNPs that explained at least 1.3% of the variance for direct replication (significance level 0.05) (Genetic Power Calculator) [Purcell et al., 2003], which is in the order of effect sizes of SNPs reported previously. The sample size of the meta-analysis including the two ADHD samples (606 + 216 = 822) was sufficient to detect genetic effects explaining 2% of the variance, given a Bonferroni corrected significance level of 0.001. The sample size including all samples (N = 4,963) was sufficient to detect SNPs explaining 0.35% (i.e., <1%) of the variance (significance level of 0.001).
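The power figures above can be approximated with a short calculation. The sketch below is not the Genetic Power Calculator: it assumes a two-sided 1-df test with a normal approximation, takes the required non-centrality parameter as λ = (z₁₋α/₂ + z_power)², and inverts λ = N·R²/(1 − R²) to obtain the smallest detectable proportion of variance. The function name is hypothetical.

```python
from statistics import NormalDist

def min_detectable_r2(n, alpha, power=0.80):
    """Smallest variance explained (R^2) detectable with the given power.

    Normal approximation for a two-sided 1-df association test;
    an illustrative assumption, not the Genetic Power Calculator.
    """
    nd = NormalDist()
    ncp = (nd.inv_cdf(1 - alpha / 2) + nd.inv_cdf(power)) ** 2
    # Invert ncp = n * r2 / (1 - r2) for r2.
    return ncp / (n + ncp)

r2_image = min_detectable_r2(606, 0.05)    # discovery sample
r2_adhd = min_detectable_r2(822, 0.001)    # IMAGE + DUKE
r2_all = min_detectable_r2(4963, 0.001)    # all five samples
```

Under these assumptions the three values come out at roughly 1.3%, 2.0%, and 0.34% of the variance, close to the figures quoted in the text.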
All populations were imputed using MACH and imputed SNPs were included in our analysis if quality score > 0.9 and r2 > 0.3 and MAF > 0.05. IQ scores were all corrected for effects of age and sex and transformed to Z-scores and standardized such that the mean was 100 and SD = 15, within each sample, for comparison of effect sizes across samples.
Although replication across different samples provides information on the genuineness of an initial association, meta-analysis appropriately weights the effect and sample sizes across the different replication samples. We thus conducted a meta-analysis, in which the primary sample was included to increase statistical power [Skol et al., 2006]. We used a stepwise approach, in which we first ran a combined analysis based on the two samples ascertained for ADHD, and then conducted a meta-analysis on all 4,963 subjects. The meta-analysis was conducted using the METAL program (http://www.sph.umich.edu/csg/abecasis/metal/). METAL creates a single summary P-value for each SNP from all samples together. For each marker, an arbitrary reference allele is selected and a Z-statistic, characterizing the evidence for association, is used as input. The Z-statistic summarizes the magnitude and the direction of an effect relative to the reference allele. An overall Z-statistic and P-value are then calculated from the weighted average of the individual statistics. Weights are proportional to the square root of the number of individuals examined in each sample, and selected such that the squared weights sum to 1.0. Outcomes of the meta-analyses were tested against a Bonferroni corrected threshold of significance (P < 0.001).
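The weighting scheme just described can be written out in a few lines. This is a minimal sketch of a sample-size weighted Z-score combination, not the METAL source code; the function name and the per-sample Z-statistics are hypothetical, while the sample sizes are those of the two ADHD-ascertained samples.

```python
from math import sqrt
from statistics import NormalDist

def meta_z(z_scores, sample_sizes):
    """Combine per-sample Z-statistics with square-root-of-N weights.

    All Z-statistics must be signed relative to the same reference allele.
    Returns (overall_z, two_sided_p).
    """
    total = sum(sample_sizes)
    # Weights proportional to sqrt(N); their squares sum to 1.
    weights = [sqrt(n / total) for n in sample_sizes]
    z = sum(w * zi for w, zi in zip(weights, z_scores))
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p

sizes = [606, 216]  # IMAGE and DUKE sample sizes from the text

# Hypothetical Z-statistics with the same direction of effect.
z_same, p_same = meta_z([3.0, 2.0], sizes)

# Hypothetical Z-statistics with opposite directions of effect.
z_opp, p_opp = meta_z([3.0, -2.0], sizes)
```

With consistent directions the combined statistic is stronger than either input (Z of about 3.6), whereas opposite directions largely cancel, which is why a shared reference allele across samples matters.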
Most previously reported associations of genes with intelligence included intronic SNPs with no clear function. This suggests that they might be controlling RNA signaling networks or that other SNPs in LD might be the actual causal variant. We used imputation to increase coverage. We do note, however, that even after imputation, not all of the originally reported SNPs were available in the current sample. Of the 15 candidate genes, six genes showed at least one SNP with a P-value <0.05 (see Table III).
Of the five previously reported genomic loci (2q24.1-31.1, 2q31.3, 6p25-21.2, 14q11.2-12, and 16p13.3) investigated here, we observed P-values <0.0025 in three regions (6p25-21.2, 2q24.1-31.1, and 14q11.2-12) (see Table IV). Genomic areas 2q31.3 and 16p13.3 showed no association with IQ (all P-values >0.15). On a SNP level, there were three independent SNPs in intergenic and non-coding regions with P-values ≤2.0 × 10⁻⁴ inside the 2q24.1-31.1 and 14q11.2-12 areas (Table IV). The lowest P-values were observed for rs2807822, P = 1 × 10⁻⁴; rs4972741, P = 1.7 × 10⁻⁴; and rs6721348, P = 1.8 × 10⁻⁴.
To determine whether the associations with IQ of the nominally significant SNPs (P < 0.05) from the candidate genes and of the top SNPs (P < 0.0025) in each of the genomic regions were simply due to chance, we tested these SNPs in the replication samples.
We attempted replication in four independent cohorts. We first performed an association analysis of the 17 nominally associated SNPs (P-value <0.05) in the candidate genes, and the 22 most strongly associated SNPs in the genomic areas in each population (total of 39 SNPs), using the same reference allele for each SNP across different populations. The MAFs of the tested SNPs were comparable across the five samples (see Supplementary Table S1).
We first conducted a combined analysis on only the two samples ascertained for ADHD. We then combined all five samples to test whether the significant SNPs were associated with intelligence in a general context, or merely in an ADHD background. Although IQ was normally distributed in both samples ascertained for ADHD, association of a SNP with IQ in an ADHD background may differ from association of that SNP with intelligence in a non-ADHD background.
When combining the two samples ascertained for ADHD we found that of all tested SNPs, 12 had a P-value <0.05 (same direction of effect), of which 6 showed evidence for association after Bonferroni correction (P < 0.001) for multiple testing. For one of these SNPs (rs2807822, intergenic, 14q11.2-12), however, the effect was in opposite directions in the two samples ascertained for ADHD, as also indicated by a significant heterogeneity effect (P = 0.04; see Supplementary Table S2). Three other SNPs were in intergenic areas 6p25-21.2 (one SNP) and 14q11.2-12 (two SNPs), while two SNPs were in genic areas: rs17606174 (P = 0.00018), located in the second intron of ATXN1 (ataxin 1) (MIM: 601556), and rs2023472 (P = 0.0003), located in exon 5 of TRIM31 (tripartite motif-containing 31) (MIM: 609316). Allelic effect sizes were in the order of 3–4 IQ points in the combined DUKE and IMAGE samples. When we combined all five samples, none of these associations were significant, even though some of the SNPs showed similar direction of effects in some of the replication samples. We provide results in Table V.
This study aimed to replicate association of previously reported candidate genes for IQ as well as to fine-map previously linked genomic areas. As available samples differed in ascertainment method (i.e., ascertained for ADHD or population based) we tested for SNP associations with IQ in an ADHD background and in a non-ADHD, general population, background.
In the primary analysis, we found weak evidence for the association of some of the previously reported genes with IQ: IGF2R (five SNPs with P-value <0.05), DTNBP1 (five SNPs with P-value <0.05), ALDH5A1 (one SNP with P-value <0.05), BDNF (two SNPs with P-value <0.05), DRD2 (two SNPs with P-value <0.05), and CHRM2 (two SNPs with P-value ≤0.03). None of the SNPs previously associated with IQ showed association in the current study (P-value >0.05). The lack of replication can either indicate a false positive finding in previous studies, or might be explained by the ascertainment for ADHD in our primary sample. Although association between IQ and ADHD in the current sample was not significant, and IQ was distributed normally in the IMAGE sample, previous reports [e.g., Kuntsi et al., 2004] do indicate a (genetic) association between ADHD and IQ.
Results from the primary association analysis in the genomic loci implicated three intergenic regions (2q24.1-31.1, 6p25-21.2, and 14q11.2-12). The nominally significant SNPs from the candidate genes, and the top SNPs from the genomic regions, were included in a stepwise combined analysis. When we combined the two samples ascertained for ADHD (totaling 822 subjects), we found that five SNPs were associated with IQ. None of these SNPs were inside candidate genes previously implicated, but instead were located in two genomic areas: 6p25-21.2 and 14q11.2-12. Two of these SNPs were inside two genes: rs17606174 was in the second intron of the ATXN1 gene, and rs2023472 in exon 5 of TRIM31. However, when we combined all samples, none of these SNPs showed a significant association with intelligence. In addition, we cannot exclude the possibility of type I error, given the total number of tests performed within the discovery sample alone. These results provide suggestive evidence that the ATXN1 and TRIM31 genes, and several other SNPs in areas 6p25-21.2 and 14q11.2-12, are related to IQ, but only on the background of ADHD.
In the primary IMAGE association results, ATXN1 has 25 SNPs with P-value <0.05, and most of them are located in the second intron of ATXN1, near an alternative splicing region. ATXN1 is present in the nucleus of the neurons of the basal ganglia, pons and cortex, and in both cytoplasm and nucleus of Purkinje cells of the cerebellum [Servadio et al., 1995]. Expansion of a (CAG)n repeat in ATXN1 (previously called the SCA1 gene) causes spinocerebellar ataxia-1 (SCA1) in humans (MIM: 164400) [Orr et al., 1993; Banfi et al., 1994]. It was also reported that mice lacking ATXN1 are characterized by decreased exploratory behavior, pronounced deficits in the spatial version of the Morris water maze test, and impaired performance on the rotating rod apparatus [Matilla et al., 1998], pointing to a possible role of ATXN1 in learning and memory.
In the primary IMAGE association results, TRIM31 has 23 SNPs with P-value <0.05 and most of them are located in the 5′ region and in intron 1 of TRIM31. The protein encoded by this gene is a member of the tripartite motif (TRIM) family. The TRIM motif includes three zinc-binding domains, a RING, a B-box type 1 and a B-box type 2, and a coiled-coil region [Meroni and Diez-Roux, 2005]. Another member of the TRIM family (TRIM3, MIM: 605493) was reported to modulate NGF-induced neurite outgrowth in PC12 cells [El-Husseini and Vincent, 1999].
In summary, we found very little support for genetic variants in genes that have previously been associated with intelligence. However, this study did provide tentative support for a role of the ATXN1 and TRIM31 genes in previously associated linkage areas for intelligence in the context of a psychiatric disorder, that is, ADHD. This suggests that genetic variants important for IQ in a non-psychiatric population may not necessarily overlap with genetic variants important for IQ in a psychiatric population.
We thank all the persons who kindly participated in this research. The IMAGE project was supported by National Institutes of Health (NIH) grants R01MH081803 and R01MH62873 to S.V. Faraone. Site Principal Investigators are Philip Asherson, Tobias Banaschewski, Jan Buitelaar, Richard P. Ebstein, Stephen V. Faraone, Michael Gill, Ana Miranda, Fernando Mulas, Robert D. Oades, Herbert Roeyers, Aribert Rothenberger, Joseph Sergeant, Edmund Sonuga-Barke, and Hans-Christoph Steinhausen. Senior coinvestigators are Margaret Thompson, Pak Sham, Peter McGuffin, Robert Plomin, Ian Craig, and Eric Taylor. Chief Investigators at each site are Rafaela Marco, Nanda Rommelse, Wai Chen, Henrik Uebel, Hanna Christiansen, Ueli Mueller, Marieke Altink, Barbara Franke, and Lamprini Psychogiou. The data set(s) used for the analyses described in this manuscript were obtained from the dbGaP Database through dbGaP accession number phs000016.v2.p2. Samples and associated phenotype data for Whole Genome Association Study of Attention Deficit Hyperactivity Disorder were provided by S. Faraone. The cognitive assessments in London were funded by UK Medical Research Council grant G03001896 to Jonna Kuntsi. We are extremely grateful to all the families who took part in this study, the midwives for their help in recruiting them, and the whole ALSPAC team, which includes interviewers, computer and laboratory technicians, clerical workers, research scientists, volunteers, managers, receptionists, and nurses. The UK Medical Research Council, the Wellcome Trust and the University of Bristol provide core support for ALSPAC. LBC1936 research was supported by a programme grant from Research Into Ageing. The research continues with programme grants from Help the Aged/Research Into Ageing (Disconnected Mind). 
The study was conducted within the University of Edinburgh Centre for Cognitive Ageing and Cognitive Epidemiology, supported by the Biotechnology and Biological Sciences Research Council (BBSRC), Engineering and Physical Sciences Research Council (EPSRC), Economic and Social Research Council (ESRC), and Medical Research Council (MRC), as part of the cross-council Lifelong Health and Wellbeing Initiative (G0700704/84698). QIMR group would like to thank Marlene Grace, Ann Eldridge, and Kerrie McAloney for sample collection; Anjali Henders, Megan Campbell, Lisa Bowdler, Steven Crooks, and staff of the Molecular Epidemiology Laboratory for sample processing and preparation; Harry Beeby, David Smyth, and Daniel Park for IT support. They acknowledge support from the Australian Research Council (A7960034, A79906588, A79801419, DP0212016, and DP0343921). Genotyping was funded by the National Health and Medical Research Council (Medical Bioinformatics Genomics Proteomics Program, 389891). Further, they gratefully acknowledge Dale R. Nyholt and especially Scott Gordon for their substantial efforts involving the QC and preparation of the GWA data sets, and Sarah Medland for undertaking the imputation of the GWAS data and preparation of these data for analysis. Statistical analyses from IMAGE, ALSPAC, and QIMR group were carried out on the Genetic Cluster Computer (http://www.geneticcluster.org), which is financially supported by the Netherlands Scientific Organization (NWO 480-05-003). Dina Ruano is supported by the Portuguese Foundation for Science and Technology under grant number SFRH/BPD/28725/2006. Sophie van der Sluis (VENI-451-08-025) is financially supported by the Netherlands Scientific Organization (Nederlandse Organisatie voor Wetenschappelijk Onderzoek, gebied Maatschappij-en Gedragswetenschappen: NWO/MaGW).
We further wish to acknowledge the financial support of NWO-VI-016-065-318, and the Center for Neurogenomics and Cognitive research (CNCR), as well as the National Institutes of Health (NS049067) for the DUKE project.