Researchers have previously shown that individual differences in measures of receptive language ability at age 12 are highly heritable. In the current study, the authors attempted to identify some of the genes responsible for the heritability of receptive language ability using a genome-wide association approach.
The authors administered 4 Internet-based measures of receptive language (vocabulary, semantics, syntax, and pragmatics) to a sample of 2,329 twelve-year-olds for whom DNA and genome-wide genotyping were available. Nearly 700,000 single-nucleotide polymorphisms (SNPs) and 1 million imputed SNPs were included in a genome-wide association analysis of receptive language composite scores.
No SNP associations met the demanding criterion of genome-wide significance that corrects for multiple testing across the genome (p < 5 × 10⁻⁸). The strongest SNP association did not replicate in an additional sample of 2,639 twelve-year-olds.
These results indicate that individual differences in receptive language ability in the general population do not reflect common genetic variants that account for more than 3% of the phenotypic variance. The search for genetic variants associated with language skill will require larger samples and additional methods to identify and functionally characterize the full spectrum of risk variants.
receptive language; adolescents; genome-wide association study; genetics
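The genome-wide significance criterion quoted above can be read as a Bonferroni-style correction. The following is a minimal sketch, assuming a family-wise error rate of 0.05 and roughly one million independent common-variant tests (a conventional approximation, not a figure reported in this abstract); the function names are hypothetical.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test p-value threshold that holds the family-wise error rate at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be a positive integer")
    return alpha / n_tests


def is_genome_wide_significant(p_value: float,
                               alpha: float = 0.05,
                               n_tests: int = 1_000_000) -> bool:
    """True if p_value falls below the Bonferroni-corrected threshold."""
    return p_value < bonferroni_threshold(alpha, n_tests)


# Assumed inputs: alpha = 0.05 and one million independent tests.
threshold = bonferroni_threshold(0.05, 1_000_000)
print(f"genome-wide threshold: {threshold:.0e}")
```

Under these assumptions the threshold works out to the conventional 5 × 10⁻⁸, which is why a nominally small p-value for a single SNP can still fall far short of genome-wide significance.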
Studying the causes and correlates of natural variation in gene expression in healthy populations assumes that individual differences in gene expression can be reliably and stably assessed across time. However, this is yet to be established. We examined 4-hour test–retest reliability and 10-month test–retest stability of individual differences in gene expression in ten 12-year-old children. Blood was collected on four occasions: 10 a.m. and 2 p.m. on Day 1 and 10 months later at 10 a.m. and 2 p.m. Total RNA was hybridized to Affymetrix U133 Plus 2.0 arrays. For each probeset, the correlation across individuals between 10 a.m. and 2 p.m. on Day 1 estimates test–retest reliability. We identified 3,414 variable and abundantly expressed probesets whose 4-hour test–retest reliability exceeded .70, a conventionally accepted level of reliability, which we had 80% power to detect. Of the 3,414 reliable probesets, 1,752 were also significantly reliable 10 months later. We assessed the long-term stability of individual differences in gene expression by correlating the average expression level for each probeset across the two assessments taken 4 hours apart on Day 1 with the average level of each probeset across the two assessments taken 4 hours apart 10 months later. Of the 1,752 probesets that reliably detected individual differences across 4 hours on two occasions 10 months apart, 1,291 (73.7%) also stably detected individual differences across 10 months. Heritability, as estimated from the MZ twin intraclass correlations, is twice as high for the 1,752 reliable probesets as for all present probesets on the array (0.68 vs. 0.34), and is even higher (0.76) for the 1,291 reliable probesets that are also stable across 10 months.
The 1,291 probesets that reliably detect individual differences from a single peripheral blood collection and stably detect individual differences over 10 months are promising targets for research on the causes (e.g., eQTLs) and correlates (e.g., psychopathology) of individual differences in gene expression.
blood; human; individual differences; gene expression; reliability; genomewide
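The per-probeset reliability screen described in this abstract can be sketched as follows. This is an illustration only: the probeset names and expression values are invented, and a plain Pearson correlation stands in for whatever estimator and filtering pipeline the study actually used.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences of numbers."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def reliable_probesets(first, second, cutoff=0.70):
    """Return {probeset: r} for probesets whose test-retest r exceeds cutoff.

    first and second map a probeset ID to expression values, one value per
    individual, with individuals listed in the same order at both time points.
    """
    result = {}
    for probeset, values in first.items():
        r = pearson_r(values, second[probeset])
        if r > cutoff:
            result[probeset] = r
    return result


# Hypothetical expression values for five individuals at 10 a.m. and 2 p.m.
morning = {"probeset_A": [1.0, 2.0, 3.0, 4.0, 5.0],
           "probeset_B": [1.0, 2.0, 3.0, 4.0, 5.0]}
afternoon = {"probeset_A": [1.1, 2.0, 3.2, 3.9, 5.1],
             "probeset_B": [3.0, 1.0, 4.0, 2.0, 3.0]}

reliable = reliable_probesets(morning, afternoon)
print(sorted(reliable))
```

In this toy data only the first probeset preserves the rank order of individuals across the two collections, so it is the only one that passes the .70 cutoff. The same function applied to occasion averages would give the 10-month stability correlation.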
Twin studies suggest that expressive vocabulary at ~24 months is modestly heritable. However, the genes influencing this early linguistic phenotype are unknown. Here we conduct a genome-wide screen and follow-up study of expressive vocabulary in toddlers of European descent from up to four studies of the EArly Genetics and Lifecourse Epidemiology (EAGLE) consortium, analysing an early (15–18 months, ‘one-word stage’, total N = 8,889) and a later (24–30 months, ‘two-word stage’, total N = 10,819) phase of language acquisition. For the early phase, one SNP (rs7642482) at 3p12.3 near ROBO2, encoding a conserved axon-binding receptor, reaches the genome-wide significance level (p = 1.3 × 10⁻⁸) in the combined sample. This association links language-related common genetic variation in the general population to a potential autism susceptibility locus and a linkage region for dyslexia, speech-sound disorder and reading. The contribution of common genetic influences is, although modest, supported by Genome-wide Complex Trait Analysis (meta-GCTA h² = 0.13 at 15–18 months and 0.14 at 24–30 months) and in concordance with additional twin analysis (5,733 pairs of European descent, h² = 0.20 at 24 months).
The genetic basis of expressive vocabulary in children around 2 years old is poorly understood. Here, the authors show that a genetic variant near the ROBO2 gene is associated with early language acquisition in the general population and highlight a potential genetic link between language-related common genetic variation and a linkage region for dyslexia, speech-sound disorder and reading.
Dissecting how genetic and environmental influences impact on learning is helpful for maximizing numeracy and literacy. Here we show, using twin and genome-wide analysis, that there is a substantial genetic component to children’s ability in reading and mathematics, and estimate that around one half of the observed correlation in these traits is due to shared genetic effects (so-called Generalist Genes). Thus, our results highlight the potential role of the learning environment in contributing to differences in a child’s cognitive abilities at age twelve.
Understanding the genetic basis of cognitive traits could aid the development of numeracy and literacy skills in children. Here the authors show that reading and mathematics have a large overlapping genetic component and suggest that a child's learning environment has a key role in creating differences between them.
Psychosis has been hypothesised to be a continuously distributed quantitative phenotype, with disorders such as schizophrenia and bipolar disorder representing its extreme manifestations. Evidence suggests that common genetic variants play an important role in liability to both schizophrenia and bipolar disorder. Here we tested the hypothesis that these common variants would also influence psychotic experiences measured dimensionally in adolescents in the general population. Our aim was to test whether schizophrenia and bipolar disorder polygenic risk scores (PRS), as well as specific single nucleotide polymorphisms (SNPs) previously identified as risk variants for schizophrenia, were associated with adolescent dimension-specific psychotic experiences. Self-reported Paranoia, Hallucinations, Cognitive Disorganisation, Grandiosity, Anhedonia, and Parent-rated Negative Symptoms, as measured by the Specific Psychotic Experiences Questionnaire (SPEQ), were assessed in a community sample of 2,152 16-year-olds. Polygenic risk scores were calculated using estimates of the log of odds ratios from the Psychiatric Genomics Consortium GWAS stage-1 mega-analysis of schizophrenia and bipolar disorder. The polygenic risk analyses yielded no significant associations between schizophrenia and bipolar disorder PRS and the SPEQ measures. The analyses on the 28 individual SNPs previously associated with schizophrenia found that two SNPs in TCF4 returned a significant association with the SPEQ Paranoia dimension, rs17512836 (p-value = 2.57 × 10⁻⁴) and rs9960767 (p-value = 6.23 × 10⁻⁴). Replication in an independent sample of 16-year-olds (N = 3,427) assessed using the Psychotic-Like Symptoms Questionnaire (PLIKS-Q), a composite measure of multiple positive psychotic experiences, failed to yield significant results. Future research with PRS derived from larger samples, as well as larger adolescent validation samples, would improve the predictive power to test these hypotheses further.
The challenges of relating adult clinical diagnostic constructs such as schizophrenia to adolescent psychotic experiences at a genetic level are discussed.
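The abstract states that polygenic risk scores were built from log odds ratios. A minimal sketch of that weighting follows, assuming the simplest additive form (the sum of ln(odds ratio) multiplied by the risk-allele count); it omits p-value thresholding, linkage-disequilibrium pruning and any standardisation the study may have applied, and the SNP identifiers, odds ratios and genotypes are invented for illustration.

```python
from math import log


def polygenic_risk_score(odds_ratios, allele_counts):
    """Sum of ln(odds ratio) x risk-allele count over the SNPs in odds_ratios.

    odds_ratios maps SNP ID -> odds ratio from a discovery GWAS.
    allele_counts maps SNP ID -> number of risk alleles carried (0, 1 or 2).
    """
    score = 0.0
    for snp, odds_ratio in odds_ratios.items():
        if odds_ratio <= 0:
            raise ValueError(f"odds ratio for {snp} must be positive")
        count = allele_counts[snp]  # raises KeyError if the genotype is missing
        if count not in (0, 1, 2):
            raise ValueError(f"allele count for {snp} must be 0, 1 or 2")
        score += log(odds_ratio) * count
    return score


# Hypothetical SNP IDs, odds ratios and genotypes, for illustration only.
example_odds_ratios = {"snp_1": 1.20, "snp_2": 0.90, "snp_3": 1.05}
example_genotype = {"snp_1": 2, "snp_2": 1, "snp_3": 0}

score = polygenic_risk_score(example_odds_ratios, example_genotype)
print(f"polygenic risk score: {score:.4f}")
```

Because the weights are logarithms, an odds ratio of 1 contributes nothing to the score and protective alleles (odds ratio below 1) lower it.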
Callous-unemotional behavior (CU) is currently under consideration as a subtyping index for conduct disorder diagnosis. Twin studies routinely estimate the heritability of CU as greater than 50%. It is now possible to estimate genetic influence using DNA alone from samples of unrelated individuals, not relying on the assumptions of the twin method. Here we use this new DNA method (implemented in a software package called Genome-wide Complex Trait Analysis, GCTA) for the first time to estimate genetic influence on CU. We also report the first genome-wide association (GWA) study of CU as a quantitative trait. We compare these DNA results to those from twin analyses using the same measure and the same community sample of 2,930 children rated by their teachers at ages 7, 9 and 12. GCTA estimates of heritability were near zero, even though twin analysis of CU in this sample confirmed the high heritability of CU reported in the literature, and even though GCTA estimates of heritability were substantial for cognitive and anthropological traits in this sample. No significant associations were found in GWA analysis, which, like GCTA, only detects additive effects of common DNA variants. The phrase ‘missing heritability’ was coined to refer to the gap between variance associated with DNA variants identified in GWA studies versus twin study heritability. However, GCTA heritability, not twin study heritability, is the ceiling for GWA studies because both GCTA and GWA are limited to the overall additive effects of common DNA variants, whereas twin studies are not. This GCTA ceiling is very low for CU in our study, despite its high twin study heritability estimate. The gap between GCTA and twin study heritabilities will make it challenging to identify genes responsible for the heritability of CU.
Twin studies have shown that anxiety in a general population sample of children involves both domain-general and trait-specific genetic effects. For this reason, in an attempt to identify genes responsible for these effects, we investigated domain-general and trait-specific genetic associations in the first genome-wide association (GWA) study on anxiety-related behaviours (ARBs) in childhood.
The sample included 2810 7-year-olds drawn from the Twins Early Development Study (TEDS) with data available for parent-rated anxiety and genome-wide DNA markers. The measure was the Anxiety-Related Behaviours Questionnaire (ARBQ), which assesses four anxiety traits and also yields a general anxiety composite. Affymetrix GeneChip 6.0 DNA arrays were used to genotype nearly 700,000 single-nucleotide polymorphisms (SNPs), and IMPUTE v2 was used to impute more than 1 million SNPs. Several GWA associations from this discovery sample were followed up in another TEDS sample of 4804 children. In addition, Genome-wide Complex Trait Analysis (GCTA) was used on the discovery sample, to estimate the total amount of variance in ARBs that can be accounted for by SNPs on the array.
No SNP associations met the demanding criterion of genome-wide significance that corrects for multiple testing across the genome (p < 5 × 10⁻⁸). Attempts to replicate the top associations did not yield significant results. In contrast to the substantial twin study estimates of heritability, which ranged from 0.50 (0.03) to 0.61 (0.01), the GCTA estimates of phenotypic variance accounted for by the SNPs were much lower, ranging from 0.01 (0.11) to 0.19 (0.12).
Taken together, these GWAS and GCTA results suggest that anxiety – similar to height, weight and intelligence – is affected by many genetic variants of small effect, but unlike these other prototypical polygenic traits, genetic influence on anxiety is not well tagged by common SNPs.
Gene Set Enrichment (GSE) is a computational technique that determines whether an a priori defined set of genes shows statistically significant differential expression between two phenotypes. Currently, the gene sets used for GSE are derived from annotation or pathway databases, which often contain computationally based and unrepresentative data. Here, we propose a novel approach for the generation of comprehensive and biologically derived gene sets, deriving sets through the application of machine learning techniques to gene expression data. These gene sets can be produced for specific tissues, developmental stages or environments. They provide a powerful and functionally meaningful way in which to mine genomewide association and next generation sequencing data in order to identify disease-associated variants and pathways.
gene set enrichment; annotation database; gene expression data; machine learning; next generation sequencing
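To make concrete the idea of testing whether a predefined gene set is enriched among differentially expressed genes, here is a minimal sketch. It uses a simple hypergeometric over-representation test, which is a swapped-in simplification: it is not the rank-based running-sum statistic of GSEA-style methods, and it does not implement the machine-learning gene-set construction proposed in the abstract. All counts in the example are hypothetical.

```python
from math import comb


def overrepresentation_p(n_genes, n_in_set, n_selected, n_overlap):
    """Upper-tail hypergeometric p-value, P(overlap >= n_overlap).

    n_genes:    genes measured in total
    n_in_set:   genes belonging to the predefined gene set
    n_selected: genes called differentially expressed
    n_overlap:  differentially expressed genes that are also in the set
    """
    if not (0 <= n_in_set <= n_genes and 0 <= n_selected <= n_genes):
        raise ValueError("set sizes must lie between 0 and n_genes")
    if n_overlap < 0:
        raise ValueError("n_overlap must be non-negative")
    total = comb(n_genes, n_selected)
    upper = min(n_in_set, n_selected)
    tail = sum(
        comb(n_in_set, k) * comb(n_genes - n_in_set, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20,000 genes, a 100-gene set, 500 genes differentially
# expressed, 10 of which fall in the set (about 2.5 expected by chance).
p_example = overrepresentation_p(20_000, 100, 500, 10)
print(f"over-representation p-value: {p_example:.2e}")
```

The test depends only on set membership, so the quality of the gene set definition directly limits what it can detect, which is the motivation the abstract gives for deriving sets from expression data.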
Across the genome, outside of a small number of known imprinted genes and regions subject to X-inactivation in females, DNA methylation at CpG dinucleotides is often assumed to be symmetrical across both alleles in a diploid cell. However, recent findings suggest the reality is more complex, with the discovery that allele-specific methylation (ASM) is a common feature across the human genome. A key observation is that the majority of ASM is associated with genetic variation in cis, although a noticeable proportion is also non-cis in nature and mediated, for example, by parental origin. ASM appears to be both quantitative, characterized by subtle skewing of DNA methylation between alleles, and heterogeneous, varying across tissues and between individuals. These findings have important implications for complex disease genetics; while cis-mediated ASM provides a functional consequence for non-coding genetic variation, heterogeneous and quantitative ASM complicates the identification of disease-associated loci. We propose that non-cis ASM could contribute toward the “missing heritability” of complex diseases, rendering certain loci hemizygous and masking the direct association between genotype and phenotype. We suggest that the interpretation of results from genome-wide association studies can be improved by the incorporation of epi-allelic information and that in order to fully understand the extent and consequence of ASM in the human genome, a comprehensive sequencing-based analysis of allelic methylation patterns across tissues and individuals is required.
DNA methylation; allele-specific methylation; allele-specific expression; tissue-specific methylation; epigenetics; imprinting; genome-wide association study (GWAS); genetics; complex disease; missing heritability
Childhood general cognitive ability (g) is important for a wide range of outcomes in later life, from school achievement to occupational success and life expectancy. Large-scale association studies will be essential in the quest to identify variants that make up the substantial genetic component implicated by quantitative genetic studies. We conducted a three-stage genome-wide association study for general cognitive ability using over 350,000 single nucleotide polymorphisms (SNPs) in the quantitative extremes of a population sample of 7,900 7-year-old children from the UK Twins Early Development Study. Using two DNA pooling stages to enrich true positives, each of around 1,000 children selected from the extremes of the distribution, and a third individual genotyping stage of over 3,000 children to test for quantitative associations across the normal range, we aimed to home in on genes of small effect. Genome-wide results suggested that our approach was successful in enriching true associations and 28 SNPs were taken forward to individual genotyping in an unselected population sample. However, although we found an enrichment of low P values and identified nine SNPs nominally associated with g (P < 0.05) that show interesting characteristics for follow-up, further replication will be necessary to meet rigorous standards of association. These replications may take advantage of SNP sets to overcome limitations of statistical power. Despite our large sample size and three-stage design, the genes associated with childhood g remain tantalizingly beyond our current reach, providing further evidence for the small effect sizes of individual loci. Larger samples, denser arrays and multiple replications will be necessary in the hunt for the genetic variants that influence human cognitive ability.
Electronic supplementary material
The online version of this article (doi:10.1007/s10519-010-9350-4) contains supplementary material, which is available to authorized users.
Genetics; Genome-wide association; General cognitive ability; Intelligence; Population sample; Middle childhood
For nearly a century, twin and adoption studies have yielded substantial estimates of heritability for cognitive abilities, although it has proved difficult for genomewide-association studies to identify the genetic variants that account for this heritability (i.e., the missing-heritability problem). However, a new approach, genomewide complex-trait analysis (GCTA), forgoes the identification of individual variants to estimate the total heritability captured by common DNA markers on genotyping arrays. In the same sample of 3,154 pairs of 12-year-old twins, we directly compared twin-study heritability estimates for cognitive abilities (language, verbal, nonverbal, and general) with GCTA estimates captured by 1.7 million DNA markers. We found that DNA markers tagged by the array accounted for .66 of the estimated heritability, reaffirming that cognitive abilities are heritable. Larger sample sizes alone will be sufficient to identify many of the genetic variants that influence cognitive abilities.
cognitive ability; behavioral genetics; cognitive development; genetics
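The comparison between twin-study heritability and the heritability captured by DNA markers can be illustrated with a minimal sketch. It assumes Falconer's classical formula, h² = 2(r_MZ − r_DZ), as a stand-in for twin model fitting (the abstract does not say which twin model was used), and every number below is invented to show the arithmetic rather than taken from the study.

```python
def falconer_heritability(r_mz, r_dz):
    """Falconer's estimate of heritability: h^2 = 2 * (r_MZ - r_DZ)."""
    for name, r in (("r_mz", r_mz), ("r_dz", r_dz)):
        if not -1.0 <= r <= 1.0:
            raise ValueError(f"{name} must be a correlation in [-1, 1]")
    return 2.0 * (r_mz - r_dz)


def fraction_captured(h2_snp, h2_twin):
    """Share of the twin-study heritability accounted for by SNP heritability."""
    if h2_twin <= 0:
        raise ValueError("twin heritability must be positive")
    return h2_snp / h2_twin


# Hypothetical twin correlations and SNP-based (GCTA-style) heritability.
h2_twin = falconer_heritability(r_mz=0.75, r_dz=0.50)
h2_snp = 0.33
fraction = fraction_captured(h2_snp, h2_twin)
print(f"twin h2 = {h2_twin:.2f}, SNP h2 = {h2_snp:.2f}, fraction = {fraction:.2f}")
```

The ratio is the quantity the abstract reports as a proportion of the estimated heritability; a ratio well below 1 is what the other abstracts in this collection describe as the gap between GCTA and twin estimates.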
Twin-study research suggests that many (but not all) of the same genes contribute to genetic influence on diverse learning abilities and disabilities, a hypothesis called generalist genes. This generalist genes hypothesis was tested using a set of 10 DNA markers (single nucleotide polymorphisms [SNPs]) found to be associated with early reading ability in a study of 4,258 7-year-old children that screened 100,000 SNPs. Using the same sample, we show that this early reading SNP set also correlates with other aspects of literacy, components of mathematics, and more general cognitive abilities. These results provide support for the generalist genes hypothesis. Although the effect size of the current SNP set is small, such SNP sets could eventually be used to predict genetic risk for learning disabilities as well as to prescribe genetically tailored intervention and prevention programs.