We have performed the first whole genome analysis of neuroticism, using pooled DNA. We show that, by using eight pools with five replicates of each, we retain between 21 and 94% (depending on the effect size and threshold used) of the power obtained by genotyping all 452 574 SNPs individually, while achieving a 50-fold saving in cost. We identified one SNP, rs702543, from an analysis of ~2000 individuals selected from the neuroticism extremes of 88 142 people, and we were able to replicate the association in a separate sample from the same cohort. The SNP was genotyped in three other laboratories, on related, but not identical, phenotypes. In each case the direction of allelic effects was the same as in our sample, but the test reached statistical significance in only one sample, and there with only one method of analysis (family-based analysis). Our study raises a number of issues for whole genome association studies in general and the genetic basis of neuroticism in particular.
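The 50-fold figure can be checked with simple arithmetic. The sketch below assumes (our assumption, not stated explicitly in the text) that cost scales with the number of arrays hybridized and that individual genotyping would require one array per person; the function name is purely illustrative.

```python
# Back-of-envelope check of the cost saving from DNA pooling.
# Assumption: cost is proportional to the number of arrays run, and
# individual genotyping would need one array per person.

N_INDIVIDUALS = 2000  # approximate number of individuals analysed (from the text)
N_POOLS = 8           # number of DNA pools (from the text)
N_REPLICATES = 5      # replicate arrays per pool (from the text)


def pooling_cost_saving(n_individuals: int, n_pools: int, n_replicates: int) -> float:
    """Return the fold reduction in array count when pools replace individuals."""
    arrays_pooled = n_pools * n_replicates
    return n_individuals / arrays_pooled


if __name__ == "__main__":
    fold = pooling_cost_saving(N_INDIVIDUALS, N_POOLS, N_REPLICATES)
    arrays = N_POOLS * N_REPLICATES
    print(f"{arrays} arrays instead of {N_INDIVIDUALS}: {fold:.0f}-fold saving")
```

Under these assumptions, 40 pooled arrays stand in for about 2000 individual arrays.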
We deal first with the performance of the pooling strategy. Our estimates of the power of our DNA pooling strategy were derived by simulation using the measurement error across all SNPs from both the 500K and 100K arrays. However, there is a 3.2-fold difference between the variances of the repeated measurements for the two types of arrays (mean variance for 100K = 0.0017, and for 500K = 0.0055). This difference in measurement accuracy is probably due to the smaller number of features per SNP and the reduction in feature size in the 500K array relative to the 100K.21 This implies that the measurements on the 500K arrays should be at least triplicated to achieve the genotype accuracy obtained with the 100K arrays.
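The replication requirement follows from how averaging reduces measurement variance. The following is a minimal sketch, assuming that replicate measurements are independent so that the variance of their mean falls as 1/n; the function names are illustrative and not part of the original analysis.

```python
import math

VAR_100K = 0.0017  # mean variance of repeated measurements, 100K array (from the text)
VAR_500K = 0.0055  # mean variance of repeated measurements, 500K array (from the text)


def variance_of_mean(single_variance: float, n_replicates: int) -> float:
    """Variance of the mean of n independent replicate measurements."""
    return single_variance / n_replicates


def replicates_to_match(var_noisy: float, var_target: float) -> int:
    """Smallest n for which var_noisy / n does not exceed var_target."""
    return math.ceil(var_noisy / var_target)


if __name__ == "__main__":
    print(f"variance ratio 500K/100K: {VAR_500K / VAR_100K:.2f}")
    for n in (1, 2, 3, 4):
        print(f"n = {n}: variance of mean = {variance_of_mean(VAR_500K, n):.5f}")
    print("replicates to match 100K:", replicates_to_match(VAR_500K, VAR_100K))
```

Under the independence assumption, three replicates bring the 500K variance close to the single-measurement 100K value, and a fourth is needed to fall below it, which is in line with "at least triplicated".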
DNA pooling allows substantial savings in genotyping costs (see, for example, references44–46) and there have been successes using DNA pooling in the analysis of a relatively small number of markers, for example, in schizophrenia47,48 and serum high-density lipoprotein cholesterol levels.50,51 However, in the two cases where whole genome association has been attempted (that is, in memory52 and mild mental impairment53), the yields have been low: in both cases only one locus was considered significant.
Our results are comparable, in that we have found just one locus of small effect. Failure to replicate the finding in all samples is not unexpected, given the differences between the cohorts in both phenotype and recruitment. No two samples used exactly the same phenotype. We used the 23-item Eysenck N scale, while the US and Australian samples used the short (12-item) scale and the Dutch sample used the Amsterdamse Biografische Vragenlijst, which is based on the Eysenck scale but is not identical to it. However, differences in the measures are likely to be outweighed by differences in the recruitment strategy. Our sample represents the extremes of about 61 367 unrelated people, whereas the Dutch sample was drawn from 3444 families, the Australian from around 7500 families and the US from 9270 twins.
Second, we consider the lessons for the design of whole genome association studies. Assuming that the genetic basis of neuroticism is typical of behavioral traits, and of complex traits more generally, our results reinforce the findings that the ORs of most complex trait loci are less than 1.6.54 Our linkage study on the same cohort failed to find any large effect loci (explaining more than 5% of the variance). In the present study we had approximately 50% power to detect an OR of 1.5 (which is equivalent to a locus explaining 0.5–1% of the variance) and failed to find any loci accounting for more than 1% of the variance. It therefore seems likely that the 40% additive genetic variance of neuroticism arises from many loci, each explaining much less than 1%. This means that to obtain adequate power in whole genome association studies, much larger samples will be needed than anticipated.
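To illustrate how quickly the required sample size grows as the variance explained by a locus shrinks, the sketch below uses a standard normal approximation for a single-SNP test of a quantitative trait in an unselected sample. This is not the power calculation used in our study, which was simulation-based and relied on selection from the extremes. The Bonferroni threshold, the 80% power target and the function name are assumptions made for illustration only.

```python
from statistics import NormalDist

N_SNPS = 452_574  # number of SNPs tested (from the text)


def required_sample_size(var_explained: float, alpha: float, power: float = 0.8) -> float:
    """Approximate N for a two-sided single-SNP test in an unselected sample.

    Assumes the test statistic is normal with mean
    sqrt(N * q2 / (1 - q2)), where q2 is the fraction of trait
    variance explained by the locus.
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    z_power = std_normal.inv_cdf(power)
    return (z_alpha + z_power) ** 2 * (1 - var_explained) / var_explained


if __name__ == "__main__":
    # Assumption: Bonferroni correction over all SNPs tested.
    alpha = 0.05 / N_SNPS
    for q2 in (0.01, 0.005, 0.001):
        n = required_sample_size(q2, alpha)
        print(f"variance explained {q2:.1%}: N of roughly {n:,.0f}")
```

Because the required N scales approximately with the inverse of the variance explained, a locus explaining 0.1% of the variance needs about ten times as many individuals as one explaining 1%.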
Using association analysis, we failed to identify any genes under the five significant linkage peaks that were obtained using the sib pairs from the same population-based study of personality (although it should be noted that there was a nominally significant enrichment in log P > 2 values in the 1-LOD interval on chromosome 1q). Our significant SNP is located on chromosome 5, in a region that was not indicated by the linkage analysis. Since linkage analysis is robust to allelic heterogeneity at a locus, whereas association analysis is not, this may indicate that rare variants are a major contributor to the heritability of neuroticism. The linkage signal could also be due to the co-localization under the linkage peaks of variants in different genes, each with an individual effect too small to be detected by association. Other factors that could contribute to the low power of our study to identify the genes under the linkage peaks include incomplete genome coverage (low LD between the SNPs typed and the causal variants) and a high false-negative rate of the association analysis.
Unlike the linkage intervals, which did not show a significant enrichment of high log P values, known CNVs did show such enrichment. Chromosomal abnormalities have been reported in patients with schizophrenia, bipolar disorder and major depression,55,56 but it is not clear to what extent CNVs contribute to variation in behavior. It is noteworthy that the CNVs on chromosome 17 (enriched for high log P values) contain two genes involved in behavior, namely the Tau gene (MAPT) and the corticotropin-releasing hormone receptor-1 gene (CRHR1). CRHR1 has been shown to mediate anxiety-related behavior and hormonal adaptation to stress.57
Finally, our findings are important for the understanding of the genetic basis of neuroticism and the associated psychiatric disorders of anxiety and depression. The rs702543 SNP is located in the phosphodiesterase 4D, cAMP-specific (PDE4D) gene, in an intron between exon D3 and D8. The HapMap database (CEU, Release #21) shows one SNP (rs702542) in complete correlation with rs702543, and two other SNPs that are partially correlated, that is, rs296410 (r² = 0.5) and rs40216 (r² = 0.56). All three SNPs lie within a 65.3-kb haplotype block on chromosome 5 (58 872 707–58 937 990 bp). The G allele of rs702543 creates a putative cAMP response element (CRE) for CREB2/c-jun heterodimer (TGACGTTA), while the A allele destroys it.58,59 The transcription of different isoforms of PDE4D was shown to be regulated by cAMP levels. For example, a CRE (TGACGTT) in the promoter of an isoform of PDE4D (PDE4D5) was shown to be involved in the cAMP responsiveness of the PDE4D5 promoter.60
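The block size and the relation between the two CRE sequences quoted above can be verified directly. The sketch below only restates numbers and sequences given in the text. The position of rs702543 within the putative element is not specified here, so the effect of the A allele is deliberately not modelled. The variable and function names are illustrative.

```python
# Haplotype block coordinates on chromosome 5 (from the text).
BLOCK_START_BP = 58_872_707
BLOCK_END_BP = 58_937_990

# CRE sequences quoted in the text.
PUTATIVE_CRE_RS702543 = "TGACGTTA"  # element created by the G allele
PDE4D5_PROMOTER_CRE = "TGACGTT"     # element in the PDE4D5 promoter


def block_length_kb(start_bp: int, end_bp: int) -> float:
    """Length of a genomic interval in kilobases (end minus start)."""
    return (end_bp - start_bp) / 1000


if __name__ == "__main__":
    print(f"block length: {block_length_kb(BLOCK_START_BP, BLOCK_END_BP):.1f} kb")
    shared = PUTATIVE_CRE_RS702543.startswith(PDE4D5_PROMOTER_CRE)
    print("putative CRE begins with the PDE4D5 promoter CRE:", shared)
```

The comparison shows only that the two quoted sequences share the same seven leading bases; it makes no claim about binding.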
Several other lines of evidence support the involvement of the PDE4D gene in susceptibility to neuroticism and depression. The PDE4-specific inhibitor rolipram has antidepressant effects in animals and in patients with major depression.61–63 PDE4D knockout mice show antidepressant-like behavior, which is further increased by the antidepressants desipramine and fluoxetine but not by rolipram. This suggests that the PDE4D subtype is an essential mediator of the antidepressant effects of rolipram. PDE4D expression is increased in mouse cerebral cortex by repeated treatment with desipramine, fluoxetine and rolipram, and in the hippocampus by fluoxetine and rolipram. Recently, variants in two genes encoding PDEs were found to be associated with the diagnosis of major depression, and one of them (PDE11A) was also found to be associated with remission in response to antidepressants.64 Together, these observations indicate that the PDE4D gene is likely to be involved in susceptibility to neuroticism and the associated psychiatric disorders of major depression and anxiety.