Although genome-wide association studies (GWAS) have identified many common variants associated with complex traits, low-frequency and rare variants have not been interrogated in a comprehensive manner. Imputation from dense reference panels, such as the 1000 Genomes Project (1000G), enables testing of ungenotyped variants for association. Here we present the results of imputation using a large, new population-specific panel: the Genome of The Netherlands (GoNL). We benchmarked the performance of the 1000G and GoNL reference sets by comparing imputed genotypes with 'true' genotypes typed on ImmunoChip in three European populations (Dutch, British, and Italian). GoNL showed significant improvement in the imputation quality for rare variants (MAF 0.05–0.5%) compared with 1000G. In Dutch samples, the mean observed Pearson correlation, r2, increased from 0.61 to 0.71. We also saw improved imputation accuracy for other European populations (in the British samples, r2 improved from 0.58 to 0.65, and in the Italians from 0.43 to 0.47). A combined reference set comprising 1000G and GoNL improved the imputation of rare variants even further. The Italian samples benefitted the most from this combined reference (the mean r2 increased from 0.47 to 0.50). We conclude that the creation of a large population-specific reference is advantageous for imputing rare variants and that a combined reference panel across multiple populations yields the best imputation results.
genotype imputation; GWAS; GoNL; rare variants; reference sets; reference panel
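The accuracy measure used in the imputation benchmark above, the squared Pearson correlation between imputed and directly typed genotypes, can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the function name, the 0/1/2 allele-count coding, and the example values are assumptions made for this sketch.

```python
def imputation_r2(true_genotypes, imputed_dosages):
    """Squared Pearson correlation between typed genotypes and imputed dosages.

    Assumes (for illustration) that genotypes are coded as allele counts
    0/1/2 and dosages as expected allele counts in [0, 2] for one variant.
    """
    n = len(true_genotypes)
    if n != len(imputed_dosages) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 samples")
    mean_t = sum(true_genotypes) / n
    mean_d = sum(imputed_dosages) / n
    cov = sum((t - mean_t) * (d - mean_d)
              for t, d in zip(true_genotypes, imputed_dosages))
    var_t = sum((t - mean_t) ** 2 for t in true_genotypes)
    var_d = sum((d - mean_d) ** 2 for d in imputed_dosages)
    if var_t == 0 or var_d == 0:
        # Monomorphic in this sample: the correlation is undefined.
        return float("nan")
    return cov * cov / (var_t * var_d)


# Hypothetical data for a single variant in five samples.
typed = [0, 0, 1, 0, 2]
dosage = [0.1, 0.0, 0.8, 0.2, 1.9]
print(round(imputation_r2(typed, dosage), 3))
```

A per-variant value like this would then be averaged within a minor allele frequency bin to obtain a mean r2 of the kind reported in the abstract.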
Plasma fibrinogen is an acute phase protein that plays an important role in the blood coagulation cascade and is strongly associated with smoking, alcohol consumption and body mass index (BMI). Genome-wide association studies (GWAS) have identified a variety of gene regions associated with elevated plasma fibrinogen concentrations. However, little is yet known about how associations between environmental factors and fibrinogen might be modified by genetic variation. Therefore, we conducted large-scale meta-analyses of genome-wide interaction studies to identify possible interactions of genetic variants and smoking status, alcohol consumption or BMI on fibrinogen concentration. The present study included 80,607 subjects of European ancestry from 22 studies. Genome-wide interaction analyses were performed separately in each study for about 2.6 million single nucleotide polymorphisms (SNPs) across the 22 autosomal chromosomes. For each SNP and risk factor, we performed a linear regression under an additive genetic model including an interaction term between SNP and risk factor. Interaction estimates were meta-analysed using a fixed-effects model. No genome-wide significant interaction with smoking status, alcohol consumption or BMI was observed in the meta-analyses. The most suggestive interaction was found for smoking and rs10519203, located in the LOC123688 region on chromosome 15, with a p value of 6.2×10−8. This large genome-wide interaction study including 80,607 participants found no strong evidence of interaction between genetic variants and smoking status, alcohol consumption or BMI on fibrinogen concentrations. Further studies are needed to yield deeper insight into the interplay between environmental factors and gene variants in the regulation of fibrinogen concentrations.
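The fixed-effects meta-analysis of per-study interaction estimates described above can be illustrated with a short sketch. It assumes the common inverse-variance weighting scheme, which the abstract does not spell out; the function name and the example estimates are hypothetical.

```python
from math import erfc, sqrt


def fixed_effects_meta(betas, standard_errors):
    """Inverse-variance weighted fixed-effects pooling of per-study estimates.

    Returns the pooled estimate, its standard error, the z statistic and a
    two-sided p value from the normal approximation.
    """
    if len(betas) != len(standard_errors) or not betas:
        raise ValueError("need one standard error per estimate")
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = sqrt(1.0 / total_weight)
    z = pooled / pooled_se
    p_two_sided = erfc(abs(z) / sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return pooled, pooled_se, z, p_two_sided


# Hypothetical SNP x smoking interaction estimates from three studies.
beta, se, z, p = fixed_effects_meta([0.04, 0.01, 0.03], [0.02, 0.03, 0.025])
print(f"pooled beta={beta:.4f}, se={se:.4f}, z={z:.2f}, p={p:.3g}")
```

Under this scheme, studies with smaller standard errors (typically the larger cohorts) contribute more to the pooled interaction estimate.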
Heart rate variability is an important risk factor for cardiovascular disease and all-cause mortality. The acetylcholine pathway plays a key role in explaining heart rate variability in humans. We assessed whether 443 genotyped and imputed common genetic variants in eight key genes (CHAT, SLC18A3, SLC5A7, CHRNB4, CHRNA3, CHRNA, CHRM2 and ACHE) of the acetylcholine pathway were associated with variation in an established measure of heart rate variability reflecting parasympathetic control of the heart rhythm, the root mean square of successive differences (RMSSD) of normal RR intervals. The association was studied in a two stage design in individuals of European descent. First, analyses were performed in a discovery sample of four cohorts (n = 3429, discovery stage). Second, findings were replicated in three independent cohorts (n = 3311, replication stage), and finally the two stages were combined in a meta-analysis (n = 6740). RMSSD data were obtained under resting conditions. After correction for multiple testing, none of the SNPs showed an association with RMSSD. In conclusion, no common genetic variants for heart rate variability were identified in the largest and most comprehensive candidate gene study on the acetylcholine pathway to date. Future gene finding efforts for RMSSD may want to focus on hypothesis free approaches such as the genome-wide association study.
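The phenotype in the study above, the root mean square of successive differences (RMSSD) of normal RR intervals, has a simple definition that can be sketched in a few lines. The function name and the example interval series are illustrative assumptions; artifact and ectopic-beat filtering, which real recordings require, is omitted.

```python
from math import sqrt


def rmssd(rr_intervals_ms):
    """Root mean square of successive differences of RR intervals (in ms)."""
    if len(rr_intervals_ms) < 2:
        raise ValueError("need at least two RR intervals")
    # Differences between each interval and the one that follows it.
    diffs = [b - a for a, b in zip(rr_intervals_ms, rr_intervals_ms[1:])]
    return sqrt(sum(d * d for d in diffs) / len(diffs))


# Hypothetical resting recording: four consecutive normal RR intervals.
print(round(rmssd([800, 810, 790, 800]), 2))
```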
We assessed gene expression profiles in 2,752 twins, using a classic twin design to quantify expression heritability and quantitative trait loci (eQTL) in peripheral blood. The most highly heritable genes (~777) were grouped into distinct expression clusters, enriched in gene-poor regions, associated with specific gene function/ontology classes, and strongly associated with disease designation. The design enabled a comparison of twin-based heritability to estimates based on dizygotic IBD sharing and distant genetic relatedness. Consideration of sampling variation suggests that previous heritability estimates have been upwardly biased. Genotyping of 2,494 twins enabled powerful identification of eQTLs, which were further examined in a replication set of 1,895 unrelated subjects. A large number of local eQTLs (6,988) met replication criteria, while a relatively small number of distant eQTLs (165) met quality control and replication standards. Our results provide an important new resource toward understanding the genetic control of transcription.
gene expression; peripheral blood; twin study; heritability; expression quantitative trait loci; eQTL
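As a rough illustration of how a classic twin design yields a heritability estimate, the sketch below applies Falconer's formula to monozygotic (MZ) and dizygotic (DZ) twin correlations. This is a textbook approximation and not necessarily the model fitted in the study above; the function name and the example correlations are hypothetical.

```python
def falconer_estimates(r_mz, r_dz):
    """Falconer's approximations from MZ and DZ twin correlations.

    h2: additive genetic variance share, 2 * (r_mz - r_dz)
    c2: shared environment share, 2 * r_dz - r_mz
    e2: unique environment share (including measurement error), 1 - r_mz
    """
    return {
        "h2": 2.0 * (r_mz - r_dz),
        "c2": 2.0 * r_dz - r_mz,
        "e2": 1.0 - r_mz,
    }


# Hypothetical twin correlations for the expression level of one gene.
print(falconer_estimates(r_mz=0.6, r_dz=0.4))
```

The three components sum to one by construction, which is a quick sanity check on any pair of input correlations.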
Genetic variation in a population can be summarized through principal component analysis (PCA) on genome-wide data. PCs derived from such analyses are valuable for genetic association studies, where they can correct for population stratification. We investigated how to capture the genetic population structure in a well-characterized sample from the Netherlands and in a worldwide data set and examined whether (1) removing long-range linkage disequilibrium (LD) regions and LD-based SNP pruning significantly improves correlations between PCs and geography and (2) whether genetic differentiation may have been influenced by migration and/or selection. In the Netherlands, three PCs showed significant correlations with geography, distinguishing between: (1) North and South; (2) East and West; and (3) the middle-band and the rest of the country. The third PC only emerged with minimized LD, which also significantly increased correlations with geography for the other two PCs. In addition to geography, the Dutch North–South PC showed correlations with genome-wide homozygosity (r=0.245), which may reflect a serial-founder effect due to northwards migration, and also with height (♂: r=0.142, ♀: r=0.153). The divergence between subpopulations identified by PCs is partly driven by selection pressures. The first three PCs showed significant signals for diversifying selection (545 SNPs - the majority within 184 genes). The strongest signal was observed between North and South for the functional SNP in HERC2 that determines human blue/brown eye color. Thus, this study demonstrates how to increase ancestry signals in a relatively homogeneous population and how those signals can reveal evolutionary history.
PCA; linkage disequilibrium; population structure; migration; diversifying selection; Netherlands
The effects of inbreeding on the health of offspring can be studied by measuring genome-wide autozygosity as the proportion of the genome in runs of homozygosity (Froh) and relating Froh to outcomes such as psychiatric phenotypes. To successfully conduct these studies, the main patterns of variation for genome-wide autozygosity between and within populations should be well understood and accounted for. Within-population variation was investigated in the Dutch population by comparing autozygosity between religious and non-religious groups. The Netherlands has a history of societal segregation and assortment based on religious affiliation, which may have increased parental relatedness within religious groups. Religion has been associated with several psychiatric phenotypes, such as major depressive disorder (MDD). We investigated whether there is an association between autozygosity and MDD, and the extent to which this association can be explained by religious affiliation. All Froh analyses included adjustment for ancestry-informative principal components (PCs) and geographic factors.
Religious affiliation was significantly associated with autozygosity, showing that Froh has the ability to capture within population differences that are not captured by ancestry-informative PCs or geographic factors. The non-religious group had significantly lower Froh values and significantly more MDD cases, leading to a nominally significant negative association between autozygosity and depression. After accounting for religious affiliation, MDD was not associated with Froh, indicating that the relation between MDD and inbreeding was due to stratification.
This study shows how past religious assortment and recent secularization can have genetic consequences in a relatively small country. This warrants accounting for the historical social context and its effects on genetic variation in association studies on psychiatric and other related traits.
autozygosity; runs of homozygosity; major depressive disorder; religion; population stratification; assortative mating
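Froh, defined above as the proportion of the genome in runs of homozygosity, can be computed from a list of detected runs as in the following sketch. The segment representation, the function name, and the genome length in the example are assumptions for illustration; detecting the runs themselves (for example with a genotype-scanning tool) is outside the scope of this sketch.

```python
def froh(roh_segments, autosomal_length_bp):
    """Proportion of the autosomal genome covered by runs of homozygosity.

    roh_segments: iterable of (start_bp, end_bp) pairs for one individual,
    assumed here to be non-overlapping.
    autosomal_length_bp: total autosomal length covered by the SNP array.
    """
    if autosomal_length_bp <= 0:
        raise ValueError("autosomal length must be positive")
    total_roh = 0
    for start, end in roh_segments:
        if end < start:
            raise ValueError("segment end precedes start")
        total_roh += end - start
    return total_roh / autosomal_length_bp


# Hypothetical individual with two runs on a 100 Mb toy genome.
print(froh([(0, 1_000_000), (5_000_000, 7_000_000)], 100_000_000))
```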
Several linkage studies on anxiety have been carried out in samples ascertained through probands with panic disorder. The results indicated that using a broad anxiety phenotype instead of a DSM-IV anxiety disorder diagnosis might enhance the chance of finding a linkage signal. In the current study, a genome-wide linkage analysis was performed on anxiety measured with a self-report questionnaire whose scores are highly correlated with DSM-IV anxiety disorders. The self-report questionnaire was included in five surveys of a longitudinal study of the Netherlands Twin Register. Genotype and phenotype data were available for 1,602 twins and siblings. To estimate Identity By Descent (IBD), additional genotype data for 564 parents and 22 siblings were used. Linkage analyses were carried out using MERLIN-Regress on the average anxiety scores across time. A linkage signal (LOD-score 3.4, empirical p-value 0.07) was obtained at chromosome 14 for marker D14S65 at 105 cM (90% confidence interval 99 cM - 115 cM bounded by markers D14S1434 and D14S985). This finding replicates a linkage finding for a broad anxiety phenotype in a clinically based sample, indicating that the region might harbor a QTL associated with the whole spectrum of general anxiety, i.e. from the normal to the clinical range. Moreover, genome-wide linkage and association studies on emotionality in mice obtained significant results in a syntenic region on mouse chromosome 12. Two homolog genes lie in this region –Dlk1 (delta-like 1 homolog, Drosophila) and Rtl1 (retrotransposon-like 1). Future association studies of these genes are warranted.
anxiety; genomewide linkage; family study; stai; genetics
Personality traits are complex phenotypes related to psychosomatic health. Individually, various gene finding methods have not achieved much success in finding genetic variants associated with personality traits. We performed a meta-analysis of four genome-wide linkage scans (N=6149 subjects) of five basic personality traits assessed with the NEO Five-Factor Inventory. We compared the significant regions from the meta-analysis of linkage scans with the results of a meta-analysis of genome-wide association studies (GWAS) (N∼17 000). We found significant evidence of linkage of neuroticism to chromosome 3p14 (rs1490265, LOD=4.67) and to chromosome 19q13 (rs628604, LOD=3.55); of extraversion to 14q32 (ATGG002, LOD=3.3); and of agreeableness to 3p25 (rs709160, LOD=3.67) and to two adjacent regions on chromosome 15, including 15q13 (rs970408, LOD=4.07) and 15q14 (rs1055356, LOD=3.52) in the individual scans. In the meta-analysis, we found strong evidence of linkage of extraversion to 4q34, 9q34, 10q24 and 11q22, openness to 2p25, 3q26, 9p21, 11q24, 15q26 and 19q13 and agreeableness to 4q34 and 19p13. Significant evidence of association in the GWAS was detected between openness and rs677035 at 11q24 (P-value=2.6 × 10−06, KCNJ1). The findings of our linkage meta-analysis and those of the GWAS suggest that 11q24 is a susceptible locus for openness, with KCNJ1 as the possible candidate gene.
personality; KCNJ1; NEO; linkage; GSMA
Obesity is of global health concern. There are well-described inverse relationships between female pubertal timing and obesity. Recent genome-wide association studies of age at menarche identified several obesity-related variants. Using data from the ReproGen Consortium, we employed meta-analytical techniques to estimate the associations of 95 a priori and recently identified obesity-related (body mass index (weight (kg)/height (m)2), waist circumference, and waist:hip ratio) single-nucleotide polymorphisms (SNPs) with age at menarche in 92,116 women of European descent from 38 studies (1970–2010), in order to estimate associations between genetic variants associated with central or overall adiposity and pubertal timing in girls. Investigators in each study performed a separate analysis of associations between the selected SNPs and age at menarche (ages 9–17 years) using linear regression models and adjusting for birth year, site (as appropriate), and population stratification. Heterogeneity of effect-measure estimates was investigated using meta-regression. Six novel associations of body mass index loci with age at menarche were identified, and 11 adiposity loci previously reported to be associated with age at menarche were confirmed, but none of the central adiposity variants individually showed significant associations. These findings suggest complex genetic relationships between menarche and overall obesity, and to a lesser extent central obesity, in normal processes of growth and development.
adiposity; body mass index; genetic association studies; menarche; obesity; waist circumference; waist:hip ratio; women's health
Elevated resting heart rate is associated with greater risk of cardiovascular disease and mortality. In a 2-stage meta-analysis of genome-wide association studies in up to 181,171 individuals, we identified 14 new loci associated with heart rate and confirmed associations with all 7 previously established loci. Experimental downregulation of gene expression in Drosophila melanogaster and Danio rerio identified 20 genes at 11 loci that are relevant for heart rate regulation and highlight a role for genes involved in signal transmission, embryonic cardiac development and the pathophysiology of dilated cardiomyopathy, congenital heart failure and/or sudden cardiac death. In addition, genetic susceptibility to increased heart rate is associated with altered cardiac conduction and reduced risk of sick sinus syndrome, and both heart rate–increasing and heart rate–decreasing variants associate with risk of atrial fibrillation. Our findings provide fresh insights into the mechanisms regulating heart rate and identify new therapeutic targets.
Approaches exploiting extremes of the trait distribution may reveal novel loci for common traits, but it is unknown whether such loci are generalizable to the general population. In a genome-wide search for loci associated with upper vs. lower 5th percentiles of body mass index, height and waist-hip ratio, as well as clinical classes of obesity including up to 263,407 European individuals, we identified four new loci (IGFBP4, H6PD, RSRC1, PPP2R2A) influencing height detected in the tails and seven new loci (HNF4G, RPTOR, GNAT2, MRPS33P4, ADCY9, HS6ST3, ZZZ3) for clinical classes of obesity. Further, we show that there is large overlap in terms of genetic structure and distribution of variants between traits based on extremes and the general population and little etiologic heterogeneity between obesity subgroups.
The genetic contribution to the variation in human lifespan is ∼25%. Despite the large number of identified disease-susceptibility loci, it is not known which loci influence population mortality. We performed a genome-wide association meta-analysis of 7729 long-lived individuals of European descent (≥85 years) and 16 121 younger controls (<65 years) followed by replication in an additional set of 13 060 long-lived individuals and 61 156 controls. In addition, we performed a subset analysis in cases aged ≥90 years. We observed genome-wide significant association with longevity, as reflected by survival to ages beyond 90 years, at a novel locus, rs2149954, on chromosome 5q33.3 (OR = 1.10, P = 1.74 × 10−8). We also confirmed association of rs4420638 on chromosome 19q13.32 (OR = 0.72, P = 3.40 × 10−36), representing the TOMM40/APOE/APOC1 locus. In a prospective meta-analysis (n = 34 103), the minor allele of rs2149954 (T) on chromosome 5q33.3 associates with increased survival (HR = 0.95, P = 0.003). This allele has previously been reported to associate with low blood pressure in middle age. Interestingly, the minor allele (T) associates with decreased cardiovascular mortality risk, independent of blood pressure. We report on the first GWAS-identified longevity locus on chromosome 5q33.3 influencing survival in the general European population. The minor allele of this locus associates with low blood pressure in middle age, although the contribution of this allele to survival may be less dependent on blood pressure. Hence, the pleiotropic mechanisms by which this intragenic variation contributes to lifespan regulation have to be elucidated.
Neuronal nicotinic acetylcholine receptor (nAChR) genes (CHRNA5/CHRNA3/CHRNB4) have been reproducibly associated with nicotine dependence, smoking behaviors, and lung cancer risk. Of the few reports that have focused on early smoking behaviors, association results have been mixed. This meta-analysis examines early smoking phenotypes and SNPs in the gene cluster to determine: (1) whether the most robust association signal in this region (rs16969968) for other smoking behaviors is also associated with early behaviors, and/or (2) if additional statistically independent signals are important in early smoking. We focused on two phenotypes: age of tobacco initiation (AOI) and age of first regular tobacco use (AOS). This study included 56,034 subjects (41 groups) spanning nine countries and evaluated five SNPs including rs1948, rs16969968, rs578776, rs588765, and rs684513. Each dataset was analyzed using a centrally generated script. Meta-analyses were conducted from summary statistics. AOS yielded significant associations with SNPs rs578776 (beta = 0.02, P = 0.004), rs1948 (beta = 0.023, P = 0.018), and rs684513 (beta = 0.032, P = 0.017), indicating protective effects. There were no significant associations for the AOI phenotype. Importantly, rs16969968, the most replicated signal in this region for nicotine dependence, cigarettes per day, and cotinine levels, was not associated with AOI (P = 0.59) or AOS (P = 0.92). These results provide important insight into the complexity of smoking behavior phenotypes, and suggest that association signals in the CHRNA5/A3/B4 gene cluster affecting early smoking behaviors may be different from those affecting the mature nicotine dependence phenotype.
CHRNA5; CHRNA3; CHRNB4; meta-analysis; nicotine; smoke
Purpose. Twin studies provide evidence that genetic influences contribute strongly to individual differences in exercise behavior. We hypothesize that part of this heritability is explained by genetic variation in the dopaminergic reward system. Eight single nucleotide polymorphisms (SNPs in DRD1: rs265981, DRD2: rs6275, rs1800497, DRD3: rs6280, DRD4: rs1800955, DBH: rs1611115, rs2519152, and in COMT: rs4680) and three variable number of tandem repeats (VNTRs in DRD4, upstream of DRD5, and in DAT1) were investigated for an association with regular leisure time exercise behavior. Materials and Methods. Data on exercise activities and at least one SNP/VNTR were available for 8,768 individuals aged 7 to 50 years old that were part of the Netherlands Twin Register. Exercise behavior was quantified as weekly metabolic equivalents of task (MET) spent on exercise activities. Mixed models were fitted in SPSS with genetic relatedness as a random effect. Results. None of the genetic variants were associated with exercise behavior (P > .02), despite sufficient power to detect small effects. Discussion and Conclusions. We did not confirm that allelic variants involved in dopaminergic function play a role in creating individual differences in exercise behavior. A plea is made for large genome-wide association studies to unravel the genetic pathways that affect this health-enhancing behavior.
Within the Netherlands a national network of biobanks has been established (Biobanking and Biomolecular Research Infrastructure-Netherlands (BBMRI-NL)) as a national node of the European BBMRI. One of the aims of BBMRI-NL is to enrich biobanks with different types of molecular and phenotype data. Here, we describe the Genome of the Netherlands (GoNL), one of the projects within BBMRI-NL. GoNL is a whole-genome-sequencing project in a representative sample consisting of 250 trio-families from all provinces in the Netherlands, which aims to characterize DNA sequence variation in the Dutch population. The parent–offspring trios include adult individuals ranging in age from 19 to 87 years (mean=53 years; SD=16 years) from birth cohorts 1910–1994. Sequencing was done on blood-derived DNA from uncultured cells and accomplished coverage was 14–15x. The family-based design represents a unique resource to assess the frequency of regional variants, accurately reconstruct haplotypes by family-based phasing, characterize short indels and complex structural variants, and establish the rate of de novo mutational events. GoNL will also serve as a reference panel for imputation in the available genome-wide association studies in Dutch and other cohorts to refine association signals and uncover population-specific variants. GoNL will create a catalog of human genetic variation in this sample that is uniquely characterized with respect to micro-geographic location and a wide range of phenotypes. The resource will be made available to the research and medical community to guide the interpretation of sequencing projects. The present paper summarizes the global characteristics of the project.
whole-genome sequence; trio-design; population genetics
Genomes of men and women differ in only a limited number of genes located on the sex chromosomes, whereas the transcriptome is far more sex-specific. Identification of sex-biased gene expression will contribute to understanding the molecular basis of sex-differences in complex traits and common diseases.
Sex differences in the human peripheral blood transcriptome were characterized using microarrays in 5,241 subjects, accounting for menopause status and hormonal contraceptive use. Sex-specific expression was observed for 582 autosomal genes, of which 57.7% was upregulated in women (female-biased genes). Female-biased genes were enriched for several immune system GO categories, genes linked to rheumatoid arthritis (16%) and genes regulated by estrogen (18%). Male-biased genes were enriched for genes linked to renal cancer (9%). Sex-differences in gene expression were smaller in postmenopausal women, larger in women using hormonal contraceptives and not caused by sex-specific eQTLs, confirming the role of estrogen in regulating sex-biased genes.
This study indicates that sex-bias in gene expression is extensive and may underlie sex-differences in the prevalence of common diseases.
The Enhancing NeuroImaging Genetics through Meta-Analysis (ENIGMA) Consortium is a collaborative network of researchers working together on a range of large-scale studies that integrate data from 70 institutions worldwide. Organized into Working Groups that tackle questions in neuroscience, genetics, and medicine, ENIGMA studies have analyzed neuroimaging data from over 12,826 subjects. In addition, data from 12,171 individuals were provided by the CHARGE consortium for replication of findings, for a total of 24,997 subjects. By meta-analyzing results from many sites, ENIGMA has detected factors that affect the brain that no individual site could detect on its own, and that require larger numbers of subjects than any individual neuroimaging study has currently collected. ENIGMA’s first project was a genome-wide association study identifying common variants in the genome associated with hippocampal volume or intracranial volume. Continuing work is exploring genetic associations with subcortical volumes (ENIGMA2) and white matter microstructure (ENIGMA-DTI). Working groups also focus on understanding how schizophrenia, bipolar illness, major depression and attention deficit/hyperactivity disorder (ADHD) affect the brain. We review the current progress of the ENIGMA Consortium, along with challenges and unexpected discoveries made on the way.
Genetics; MRI; GWAS; Consortium; Meta-analysis; Multi-site
bipolar disorder; major depressive disorder; genome-wide association study; meta-analysis
Genome-wide association (GWA) studies of psychiatric disorders have been criticized for failing to explain a considerable proportion of the heritability established in twin and family studies. GWA studies of Major Depressive Disorder (MDD) in particular have so far been unsuccessful in detecting genome-wide significant SNPs. Using two different recently proposed methods designed to estimate the heritability of a phenotype that is attributable to genome-wide SNPs, we show that SNPs on current platforms contain substantial information concerning the additive genetic variance of MDD. To assess the consistency of these two different methods, we analyzed four other complex phenotypes from different domains. The pattern of results is consistent with estimates of heritability obtained in twin studies carried out in the same population.
Measures of personality and psychological distress are correlated and exhibit genetic covariance. We conducted univariate genome-wide SNP (~2.5 million) and gene-based association analyses of these traits and examined the overlap in results across traits, including a prediction analysis of mood states using polygenic scores for personality. Measures of neuroticism, extraversion, and symptoms of anxiety, depression, and general psychological distress were collected in eight European cohorts (n ranged from 546 to 1 338; maximum total n=6 268) whose mean age ranged from 55 to 79 years. Meta-analysis of the cohort results was performed, with follow-up associations of the top SNPs and genes investigated in independent cohorts (n=527 to 6 032). Suggestive association (P=8×10−8) of rs1079196 in the FHIT gene was observed with symptoms of anxiety. Other notable associations (P<6.09×10−6) included SNPs in five genes for neuroticism (LCE3C, POLR3A, LMAN1L, ULK3, SCAMP2), KIAA0802 for extraversion, and NOS1 for general psychological distress. An association between symptoms of depression and rs7582472 (near MGAT5 and NCKAP5) was replicated in two independent samples, but other replication findings were less consistent. Gene-based tests identified a significant locus on chromosome 15 (spanning five genes) associated with neuroticism, which replicated (P<0.05) in an independent cohort. Support for common genetic effects among personality and mood (particularly neuroticism and depressive symptoms) was found in terms of SNP association overlap and polygenic score prediction. The variance explained by individual SNPs was very small (up to 1%), confirming that there are no moderate/large effects of common SNPs on personality and related traits.
GWAS; extraversion; neuroticism; anxiety; depression
Genetic factors underlying trait neuroticism, reflecting a tendency towards negative affective states, may overlap genetic susceptibility for anxiety disorders and help explain the extensive comorbidity amongst internalizing disorders. Genome-wide linkage (GWL) data from several studies of neuroticism and anxiety disorders have been published, providing an opportunity to test such hypotheses and identify genomic regions that harbor genes common to these phenotypes. In all, 11 independent GWL studies of either neuroticism (n=8) or anxiety disorders (n=3) were collected, which comprised 5341 families with 15 529 individuals. The rank-based genome scan meta-analysis (GSMA) approach was used to analyze each trait separately and combined, and global correlations between results were examined. False discovery rate (FDR) analysis was performed to test for enrichment of significant effects. Using 10 cM intervals, bins nominally significant for both GSMA statistics, PSR and POR, were found on chromosomes 9, 11, 12, and 14 for neuroticism and on chromosomes 1, 5, 15, and 16 for anxiety disorders. Genome-wide, the results for the two phenotypes were significantly correlated, and a combined analysis identified additional nominally significant bins. Although none reached genome-wide significance, an excess of significant PSR P-values was observed, with 12 bins falling under an FDR threshold of 0.50. As demonstrated by our identification of multiple, consistent signals across the genome, meta-analytically combining existing GWL data is a valuable approach to narrowing down regions relevant for anxiety-related phenotypes. This may prove useful for prioritizing emerging genome-wide association data for anxiety disorders.
anxiety; neuroticism; panic disorder; linkage; meta-analysis
Personality can be thought of as a set of characteristics that influence people’s thoughts, feelings, and behaviour across a variety of settings. Variation in personality is predictive of many outcomes in life, including mental health. Here we report on a meta-analysis of genome-wide association (GWA) data for personality in ten discovery samples (17 375 adults) and five in-silico replication samples (3 294 adults). All participants were of European ancestry. Personality scores for Neuroticism, Extraversion, Openness to Experience, Agreeableness, and Conscientiousness were based on the NEO Five-Factor Inventory. Genotype data were available for ~2.4M Single Nucleotide Polymorphisms (SNPs; directly typed and imputed using HAPMAP data). In the discovery samples, classical association analyses were performed under an additive model followed by meta-analysis using the weighted inverse variance method. Results showed genome-wide significance for Openness to Experience near the RASA1 gene on 5q14.3 (rs1477268 and rs2032794, P = 2.8 × 10−8 and 3.1 × 10−8) and for Conscientiousness in the brain-expressed KATNAL2 gene on 18q21.1 (rs2576037, P = 4.9 × 10−8). We further conducted a gene-based test that confirmed the association of KATNAL2 with Conscientiousness. In-silico replication did not, however, show significant associations of the top SNPs with Openness and Conscientiousness, although the direction of effect of the KATNAL2 SNP on Conscientiousness was consistent in all replication samples. Larger scale GWA studies and alternative approaches are required for confirmation of KATNAL2 as a novel gene affecting Conscientiousness.
Personality; Five-Factor Model; Genome-wide association; Meta-analysis; Genetic variants
A genome-wide association study of educational attainment was conducted in a discovery sample of 101,069 individuals and a replication sample of 25,490. Three independent SNPs are genome-wide significant (rs9320913, rs11584700, rs4851266), and all three replicate. Estimated effect sizes are small (R2 ≈ 0.02%), approximately 1 month of schooling per allele. A linear polygenic score from all measured SNPs accounts for ≈ 2% of the variance in both educational attainment and cognitive function. Genes in the region of the loci have previously been associated with health, cognitive, and central nervous system phenotypes, and bioinformatics analyses suggest the involvement of the anterior caudate nucleus. These findings provide promising candidate SNPs for follow-up work, and our effect size estimates can anchor power analyses in social-science genetics.
Recent studies have shown an association between cigarettes per day (CPD) and a nonsynonymous single-nucleotide polymorphism in CHRNA5, rs16969968.
To determine whether the association between rs16969968 and smoking is modified by age at onset of regular smoking.
Available genetic studies containing measures of CPD and the genotype of rs16969968 or its proxy.
Uniform statistical analysis scripts were run locally. Starting with 94 050 ever-smokers from 43 studies, we extracted the heavy smokers (CPD >20) and light smokers (CPD ≤10) with age-at-onset information, reducing the sample size to 33 348. Each study was stratified into early-onset smokers (age at onset ≤16 years) and late-onset smokers (age at onset >16 years), and a logistic regression of heavy vs light smoking with the rs16969968 genotype was computed for each stratum. Meta-analysis was performed within each age-at-onset stratum.
Individuals with 1 risk allele at rs16969968 who were early-onset smokers were significantly more likely to be heavy smokers in adulthood (odds ratio [OR]=1.45; 95% CI, 1.36–1.55; n=13 843) than were carriers of the risk allele who were late-onset smokers (OR = 1.27; 95% CI, 1.21–1.33, n = 19 505) (P = .01).
These results highlight an increased genetic vulnerability to smoking in early-onset smokers.