Large multi-site image-analysis studies have successfully discovered genetic variants that affect brain structure in tens of thousands of subjects scanned worldwide. Candidate genes have also been associated with brain integrity, measured using fractional anisotropy (FA) in diffusion tensor images (DTI). To evaluate the heritability and robustness of DTI measures as a target for genetic analysis, we compared 417 twins and siblings scanned on the same day on the same high-field scanner (4-Tesla) with two protocols: (1) 94 directions, 2-mm-thick slices; (2) 27 directions, 5-mm-thick slices. Using mean FA in white matter ROIs and FA ‘skeletons’ derived using FSL, we (1) examined differences in voxelwise means, variances, and correlations among the measures; and (2) assessed heritability with structural equation models, using the classical twin design. FA measures from the genu of the corpus callosum were highly heritable, regardless of protocol. Genome-wide analysis of the genu mean FA revealed differences across protocols in the top associations.
imaging genetics; DTI protocol stability; corpus callosum; genome-wide association study; multi-site analysis
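As a rough illustration of the classical twin design mentioned above: the study itself fits structural equation models, but a simpler approximation, Falconer's formula, splits trait variance into additive genetic (a²), shared environment (c²) and unique environment (e²) components from the monozygotic and dizygotic twin correlations alone. The sketch below uses that swapped-in technique; the function name and the example correlations are hypothetical, not values from the study.

```python
def falconer_ace(r_mz: float, r_dz: float) -> dict:
    """Falconer's approximation to the ACE variance components.

    r_mz, r_dz: trait correlations within monozygotic and dizygotic
    twin pairs (for example, mean FA in one ROI). This is NOT the
    structural equation model used in the study, only a quick estimate.
    """
    a2 = 2.0 * (r_mz - r_dz)   # additive genetic variance (heritability)
    c2 = 2.0 * r_dz - r_mz     # shared (common) environment
    e2 = 1.0 - r_mz            # unique environment plus measurement error
    return {"a2": a2, "c2": c2, "e2": e2}


# Hypothetical correlations, for illustration only.
components = falconer_ace(r_mz=0.8, r_dz=0.5)
print(components)
```

The three components sum to one by construction. Unlike a fitted model, the formula can return values outside the 0 to 1 range when the twin correlations are noisy, which is one reason model-based estimates are preferred in practice.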
Human brain connectivity is disrupted in a wide range of disorders – from Alzheimer’s disease to autism – but little is known about which specific genes affect it. Here we conducted a genome-wide association for connectivity matrices that capture information on the density of fiber connections between 70 brain regions. We scanned a large twin cohort (N=366) with 4-Tesla high angular resolution diffusion imaging (105-gradient HARDI). Using whole brain HARDI tractography, we extracted a relatively sparse 70×70 matrix representing fiber density between all pairs of cortical regions automatically labeled in co-registered anatomical scans. Additive genetic factors accounted for 1–58% of the variance in connectivity between 90 (of 122) tested nodes. We discovered genome-wide significant associations between variants and connectivity. GWAS permutations at various levels of heritability, and split-sample replication, validated our genetic findings. The resulting genes may offer new leads for mechanisms influencing aberrant connectivity and neurodegeneration.
genetics; high angular resolution diffusion imaging (HARDI); cortical surfaces; twin modeling; human connectome
A major challenge in neuroscience is finding which genes affect brain integrity, connectivity, and intellectual function. Discovering influential genes holds vast promise for neuroscience, but typical genome-wide searches assess around one million genetic variants one-by-one, leading to intractable false positive rates, even with vast samples of subjects. Even more intractable is the question of which genes interact and how they work together to affect brain connectivity. Here we report a novel approach that discovers which genes contribute to brain wiring and fiber integrity at all pairs of points in a brain scan. We studied genetic correlations between thousands of points in human brain images from 472 twins and their non-twin siblings (mean age: 23.7±2.1 SD years; 193 M/279 F). We combined clustering with genome-wide scanning to find brain systems with common genetic determination. We then filtered the image in a new way to boost power to find causal genes. Using network analysis, we found a network of genes that affect brain wiring in healthy young adults. Our new strategy makes it more computationally tractable to discover genes that affect brain integrity. The gene network showed small-world and scale-free topologies, suggesting efficiency in genetic interactions, and resilience to network disruption. Genetic variants at hubs of the network influence intellectual performance by modulating associations between performance intelligence quotient (IQ) and the integrity of major white matter tracts, such as the callosal genu and splenium, cingulum, optic radiations, and the superior longitudinal fasciculus.
imaging genetics; twins; white matter; diffusion imaging; intelligence quotient; scale-free network; small-world network
Common diseases such as endometriosis (ED), Alzheimer's disease (AD) and multiple sclerosis (MS) account for a significant proportion of the health care burden in many countries. Genome-wide association studies (GWASs) for these diseases have identified a number of individual genetic variants contributing to disease risk. However, the effect size for most variants is small and collectively the known variants explain only a small proportion of the estimated heritability. We used a linear mixed model to fit all single nucleotide polymorphisms (SNPs) simultaneously, and estimated genetic variances on the liability scale using SNPs from GWASs in unrelated individuals for these three diseases. For each of the three diseases, case and control samples were not all genotyped in the same laboratory. We demonstrate that a careful analysis can obtain robust estimates, but also that insufficient quality control (QC) of SNPs can lead to spurious results and that too stringent QC is likely to remove real genetic signals. Our estimates show that common SNPs on commercially available genotyping chips capture significant variation contributing to liability for all three diseases. The estimated proportion of total variation tagged by all SNPs was 0.26 (SE 0.04) for ED, 0.24 (SE 0.03) for AD and 0.30 (SE 0.03) for MS. Further, we partitioned the genetic variance explained by minor allele frequency (MAF, in five categories), by chromosome and by gene annotation. We provide strong evidence that a substantial proportion of variation in liability is explained by common SNPs, and thereby give insights into the genetic architecture of the diseases.
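To make "estimated on the liability scale" concrete, the sketch below applies a commonly used transformation from a variance estimate on the observed 0/1 case-control scale to the liability scale, correcting for population prevalence K and for the proportion of cases P in the sample. Whether the authors used exactly this form is an assumption, and the function name and the numbers in the usage line are hypothetical.

```python
from statistics import NormalDist


def observed_to_liability(h2_obs: float, prevalence: float, case_fraction: float) -> float:
    """Convert an observed-scale SNP variance estimate to the liability scale.

    h2_obs:        proportion of variance explained on the 0/1 scale
    prevalence:    population prevalence of the disease (K)
    case_fraction: proportion of cases in the analysed sample (P)

    Uses h2_l = h2_o * [K(1-K)]^2 / (z^2 * P(1-P)), where z is the
    standard normal density at the liability threshold for prevalence K.
    """
    std_normal = NormalDist()
    threshold = std_normal.inv_cdf(1.0 - prevalence)  # liability threshold
    z = std_normal.pdf(threshold)                     # density at the threshold
    k, p = prevalence, case_fraction
    return h2_obs * (k * (1.0 - k)) ** 2 / (z ** 2 * p * (1.0 - p))


# Hypothetical inputs: 8% prevalence, a sample that is 40% cases.
print(observed_to_liability(h2_obs=0.30, prevalence=0.08, case_fraction=0.40))
```

When the sample is not enriched for cases (P equal to K), the expression reduces to the familiar h2_o · K(1−K)/z² scaling.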
Serum butyrylcholinesterase (BCHE) activity is associated with obesity, blood pressure and biomarkers of cardiovascular and diabetes risk. We have conducted a genome-wide association scan to discover genetic variants affecting BCHE activity, and to clarify whether the associations between BCHE activity and cardiometabolic risk factors are caused by variation in BCHE or whether BCHE variation is secondary to the metabolic abnormalities. We measured serum BCHE in adolescents and adults from three cohorts of Australian twin and family studies. The genotypes from ∼2.4 million single-nucleotide polymorphisms (SNPs) were available in 8791 participants with BCHE measurements. We detected significant associations with BCHE activity at three independent groups of SNPs at the BCHE locus (P = 5.8 × 10−262, 7.8 × 10−47, 2.9 × 10−12) and at four other loci: RNPEP (P = 9.4 × 10−16), RAPH1-ABI2 (P = 4.1 × 10−18), UGT1A1 (P = 4.0 × 10−8) and an intergenic region on chromosome 8 (P = 1.4 × 10−8). These loci affecting BCHE activity were not associated with metabolic risk factors. On the other hand, SNPs in genes previously associated with metabolic risk had effects on BCHE activity more often than can be explained by chance. In particular, SNPs within FTO and GCKR were associated with BCHE activity, but their effects were partly mediated by body mass index and triglycerides, respectively. We conclude that variation in BCHE activity is due to multiple variants across the spectrum from uncommon/large effect to common/small effect, and partly results from (rather than causes) metabolic abnormalities.
Polysaccharide sidechains attached to proteins play important roles in cell–cell and receptor–ligand interactions. Variation in the carbohydrate component has been extensively studied for the iron transport protein transferrin, because serum levels of the transferrin isoforms asialotransferrin + disialotransferrin (carbohydrate-deficient transferrin, CDT) are used as biomarkers of excessive alcohol intake. We conducted a genome-wide association study to assess whether genetic factors affect CDT concentration in serum. CDT was measured in three population-based studies: one in Switzerland (CoLaus study, n = 5181) and two in Australia (n = 1509, n = 775). The first cohort was used as the discovery panel and the latter two served as replication samples. Genome-wide single-nucleotide polymorphism (SNP) typing data were used to identify loci with significant associations with CDT as a percentage of total transferrin (CDT%). The top three SNPs in the discovery panel (rs2749097 near PGM1 on chromosome 1, and missense polymorphisms rs1049296, rs1799899 in TF on chromosome 3) were successfully replicated, yielding genome-wide significant combined associations with CDT% (P = 1.9 × 10−9, 4 × 10−39, 5.5 × 10−43, respectively); together they explained 5.8% of the variation in CDT%. These allelic effects are postulated to be caused by variation in availability of glucose-1-phosphate as a precursor of the glycan (PGM1), and variation in transferrin (TF) structure.
Population structure, including population stratification and cryptic relatedness, can cause spurious associations in genome-wide association studies (GWAS). Usually, the scaled median or mean test statistic for association calculated from multiple single-nucleotide-polymorphisms across the genome is used to assess such effects, and ‘genomic control' can be applied subsequently to adjust test statistics at individual loci by a genomic inflation factor. Published GWAS have clearly shown that there are many loci underlying genetic variation for a wide range of complex diseases and traits, implying that a substantial proportion of the genome should show inflation of the test statistic. Here, we show by theory, simulation and analysis of data that in the absence of population structure and other technical artefacts, but in the presence of polygenic inheritance, substantial genomic inflation is expected. Its magnitude depends on sample size, heritability, linkage disequilibrium structure and the number of causal variants. Our predictions are consistent with empirical observations on height in independent samples of ∼4000 and ∼133 000 individuals.
genome-wide association study; genomic inflation factor; polygenic inheritance
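The genomic inflation factor discussed above is conventionally computed as the median of the observed 1-df chi-square association statistics divided by the median expected under the null (about 0.455), and genomic control then divides every statistic by that factor. A minimal sketch, with hypothetical function names and input:

```python
from statistics import median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation(chi2_stats):
    """Lambda: observed median test statistic over the null median."""
    return median(chi2_stats) / CHI2_1DF_MEDIAN


def genomic_control(chi2_stats):
    """Deflate each statistic by lambda (never inflate when lambda < 1)."""
    lam = max(genomic_inflation(chi2_stats), 1.0)
    return [x / lam for x in chi2_stats]


# Hypothetical statistics from a handful of SNPs, for illustration only.
stats = [0.1, 0.3, 0.6, 1.2, 4.0]
print(genomic_inflation(stats))
print(genomic_control(stats))
```

The abstract's point is that a lambda above 1 computed this way does not by itself imply population structure: under polygenic inheritance many SNPs carry true signal, so dividing by lambda can over-correct.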
Endometriosis is a complex disease arising from the interplay between multiple genetic and environmental factors. The genetic variants potentially underlying the hereditary component of endometriosis have been widely investigated through hypothesis-driven candidate gene studies, an approach that generally has proven to be inherently difficult and problematic for a number of reasons. Recently, through major collaborative efforts in the endometriosis research field, hypothesis-free genome-wide approaches have started to provide new insights into potential pathways leading to development of endometriosis, as well as highlighting the phenotypic heterogeneity of the condition. This review summarizes the most recent studies investigating the genetic variation contributing to endometriosis, with a particular focus on genome-wide approaches, and discusses promising future directions of genetic research.
Endometriosis; Genetics; Heritability; Genetic epidemiology; Candidate gene; Genome-wide association study; Linkage study
To refine a previously reported linkage peak for endometriosis on chromosome 10q26, and conduct follow-up analyses and a fine-mapping association study across the region to identify new candidate genes for endometriosis.
Cases = 3,223 women with surgically confirmed endometriosis; Controls = 1,190 women without endometriosis and 7,060 population samples.
Analysis of 11,984 SNPs on chromosome 10.
Main outcome measure(s)
Allele frequency differences between cases and controls.
Linkage analyses on families grouped by endometriosis symptoms (primarily subfertility) provided increased evidence for linkage (logarithm of odds (LOD) score = 3.62) near a previously reported linkage peak. Three independent association signals were found at 96.59 Mb (rs11592737, P=4.9 × 10−4), 105.63 Mb (rs1253130, P=2.5 × 10−4) and 124.25 Mb (rs2250804, P=9.7 × 10−4). Analyses including only samples from linkage families supported the association at all three regions. However, only rs11592737 in the cytochrome P450 subfamily C (CYP2C19) gene was replicated in an independent sample of 2,079 cases and 7,060 population controls.
The role of the CYP2C19 gene in conferring risk for endometriosis warrants further investigation.
Endometriosis; linkage; association; subfertility; CYP2C19
Processing speed is an important cognitive function that is compromised in psychiatric illness (e.g., schizophrenia, depression) and old age; it shares genetic background with complex cognition (e.g., working memory, reasoning). To find genes influencing speed we performed a genome-wide association scan in up to three cohorts: Brisbane (mean age 16 years, N = 1659); LBC1936 (mean age 70 years, N = 992); LBC1921 (mean age 82 years, N = 307); and HBCS (mean age 64 years, N = 1080). Meta-analysis of the common measures highlighted various suggestively significant (p < 1.21 × 10−5) SNPs and plausible candidate genes (e.g., TRIB3). A biological pathways analysis of the speed factor identified two common pathways from the KEGG database (cell junction, focal adhesion) in two cohorts, while a pathway analysis linked to the GO database revealed common pathways across pairs of speed measures (e.g., receptor binding, cellular metabolic process). These highlighted genes and pathways can inform future research, including studies of psychiatric disease.
Information processing speed; Cognitive ability; Genes; Biological pathways
The reported interaction between the length polymorphism (5HTTLPR) in the serotonin transporter gene (SLC6A4) and stressful life events on depression has led to many replication attempts, with inconsistent results. This inconsistency may reflect, in part, small sample size and the unknown contribution of the long allele SNP, rs25531. Using a large twin sample of 3,243 individuals from 2,230 families aged 18–95 years (mean = 32.3, SD = 13.6) we investigate the interaction between 5HTTLPR (subtyped with SNP rs25531) and stressful events on risk of depression and suicidality using both ordinal regressions and item response theory analyses. Participants reported via mailed questionnaire (82% response rate) both stressful events in the preceding 12 months and symptoms of depression. Stressful events were defined as “personal” (affecting the individual), or “network” (affecting close family or friends). One to 10 years later (mean = 4.2 years), participants completed a comprehensive clinical psychiatric telephone interview (83% response rate) which assessed DSM-IV major depression and ideation of suicidality. Self-reports of depression and an increase in depression/suicidality assessed by clinical interview are significantly associated with prior personal events (P <0.001) after controlling for age and sex. However, they are inconsistently associated with prior network events (ranging from ns to P <0.01) and are not significantly associated with any of the genotype main effects (5HTTLPR, 5HTTLPR + rs25531) or interactions (stress × genotype). We find no evidence to support the hypothesis of any 5HTTLPR genotype by stress interaction.
rs25531; serotonin transporter; life events; suicidality; item response theory
Genome-wide association studies followed by replication provide a powerful approach to map genetic risk factors for asthma. We sought to search for new variants associated with asthma and attempt to replicate the association with four loci reported previously (ORMDL3, PDE4D, DENND1B and IL1RL1). Genome-wide association analyses of individual single nucleotide polymorphisms (SNPs), rare copy number variants (CNVs) and overall CNV burden were carried out in 986 asthma cases and 1846 asthma-free controls from Australia. The most-associated locus in the SNP analysis was ORMDL3 (rs6503525, P=4.8 × 10−7). Five other loci were associated with P<10−5, most notably the chemokine CXC motif ligand 14 (CXCL14) gene (rs31263, P=7.8 × 10−6). We found no evidence for association with the specific risk variants reported recently for PDE4D, DENND1B and IL1RL1. However, a variant in IL1RL1 that is in low linkage disequilibrium with that reported previously was associated with asthma risk after accounting for all variants tested (rs10197862, gene-wide P=0.01). This association replicated convincingly in an independent cohort (P=2.4 × 10−4). A 300-kb deletion on chromosome 17q21 was associated with asthma risk, but this did not reach experiment-wide significance. Asthma cases and controls had comparable CNV rates, length and number of genes affected by deletions or duplications. In conclusion, we confirm the association between asthma risk and variants in ORMDL3 and identify a novel risk variant in IL1RL1. Follow-up of the 17q21 deletion in larger cohorts is warranted.
whole-genome; gene; atopy; heterogeneity; structural; IKZF3
The caudate is a subcortical brain structure implicated in many common neurological and psychiatric disorders. To identify specific genes associated with variations in caudate volume, structural MRI and genome-wide genotypes were acquired from two large cohorts, the Alzheimer’s Disease NeuroImaging Initiative (ADNI; N=734) and the Brisbane Adolescent/Young Adult Longitudinal Twin Study (BLTS; N=464). In a preliminary analysis of heritability, around 90% of the variation in caudate volume was due to genetic factors. We then conducted genome-wide association to find common variants that contribute to this relatively high heritability. Replicated genetic association was found for the right caudate volume at SNP rs163030 in the ADNI discovery sample (P=2.36×10−6) and in the BLTS replication sample (P=0.012). This genetic variation accounted for 2.79% and 1.61% of the trait variance, respectively. The peak of association was found in and around two genes, WDR41 and PDE8B, involved in dopamine signaling and development. In addition, a previously identified mutation in PDE8B causes a rare autosomal-dominant type of striatal degeneration. Searching across both samples offers a rigorous way to screen for genes consistently influencing brain structure at different stages of life. Variants identified here may be relevant to common disorders affecting the caudate.
genome-wide association; dopamine; caudate; heritability; WDR41; PDE8B
Previous microarray analyses identified 22 microRNAs (miRNAs) differentially expressed in paired ectopic and eutopic endometrium of women with and without endometriosis. To investigate further the role of these miRNAs in women with endometriosis, we conducted an association study aiming to explore the relationship between endometriosis risk and single-nucleotide polymorphisms (SNPs) in miRNA target sites for these differentially expressed miRNAs. A panel of 102 SNPs in the predicted miRNA binding sites was evaluated in an endometriosis association study and an ingenuity pathway analysis was performed. Fourteen rare variants were identified in this study. We found that SNP rs14647 in the Wolf–Hirschhorn syndrome candidate gene 1 (WHSC1) 3′UTR (untranslated region) was associated with endometriosis-related infertility, with an odds ratio of 12.2 (95% confidence interval = 2.4–60.7, P = 9.03 × 10−5). SNP haplotype AGG in the solute carrier family 22, member 23 (SLC22A23) 3′UTR was associated with endometriosis-related infertility and more severe disease. With the individual genotyping data, ingenuity pathways analysis identified the tumour necrosis factor and cyclin-dependent kinase inhibitor as major factors in the molecular pathways. Significant associations were identified between WHSC1 alleles and endometriosis-related infertility, and between SLC22A23 haplotypes and severe disease stage. These findings may help focus future research on subphenotypes of this disease. Replication studies in large independent sample sets are needed to confirm and characterize the involvement of this gene variation in the pathogenesis of endometriosis.
miRNA; endometriosis; single-nucleotide polymorphism; haplotype
An evolving hypothesis postulates that melanomas may arise through “naevus-associated” and “chronic sun exposure” pathways. We explored this hypothesis by examining associations between naevus-associated loci and melanoma risk across strata of body site and histological subtype. We genotyped 1028 invasive case patients and 1469 controls for variants in MTAP, PLA2G6, and IRF4, and compared allelic frequencies globally and by anatomical site and histological subtype of melanoma. Odds-ratios (ORs) and 95% confidence intervals (CIs) were calculated using classical and multinomial logistic regression models. Among controls, MTAP rs10757257, PLA2G6 rs132985 and IRF4 rs12203592 were the variants most significantly associated with number of naevi. In adjusted models, a significant association was found between MTAP rs10757257 and overall melanoma risk (OR=1.32, 95% CI=1.14–1.53), with no evidence of heterogeneity across sites (Phomogeneity=0.52). In contrast, MTAP rs10757257 was associated with superficial spreading/nodular melanoma (OR=1.34, 95% CI=1.15–1.57), but not with lentigo maligna melanoma (OR=0.79, 95% CI=0.46–1.35) (Phomogeneity=0.06), the subtype associated with chronic sun exposure. Melanoma was significantly inversely associated with rs12203592 in children (OR=0.35, 95% CI=0.16–0.77) and adolescents (OR=0.61, 95% CI=0.42–0.91), but not in adults (Phomogeneity=0.0008). Our results suggest that the relationship between MTAP and melanoma is subtype-specific, and that the association between IRF4 and melanoma is more evident for cases with a younger age at onset. These findings lend some support to the “divergent pathways” hypothesis and may provide at least one candidate gene underlying this model. Further studies are warranted to confirm these findings and improve our understanding of these relationships.
cutaneous melanoma; epidemiology; genes; naevi; polymorphisms
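For readers less familiar with the odds ratios and 95% confidence intervals quoted above, the sketch below computes the unadjusted allelic odds ratio from a 2×2 table with a Wald (Woolf) interval on the log scale. The study reports adjusted estimates from logistic regression models, so this is only the simplest unadjusted analogue; the function name and the allele counts are hypothetical.

```python
import math


def odds_ratio_ci(case_risk, case_other, ctrl_risk, ctrl_other, z=1.96):
    """Unadjusted odds ratio with a Wald (Woolf) confidence interval.

    Arguments are allele counts in a 2x2 table; all must be positive.
    z = 1.96 gives an approximate 95% interval.
    """
    odds_ratio = (case_risk * ctrl_other) / (case_other * ctrl_risk)
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(
        1 / case_risk + 1 / case_other + 1 / ctrl_risk + 1 / ctrl_other
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical allele counts, for illustration only.
print(odds_ratio_ci(case_risk=900, case_other=1156, ctrl_risk=1100, ctrl_other=1838))
```

An interval that excludes 1, as for MTAP rs10757257 and overall melanoma risk in the abstract, corresponds to a nominally significant association at the 5% level.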
Human height and body mass index are influenced by a large number of genes, each with small effects, along with environment. To identify common genetic variants associated with these traits, we performed genome-wide association studies in 11,536 individuals composed of Australian twins, family members, and unrelated individuals at ~550,000 genotyped SNPs. We identified a single genome-wide significant variant for height (P value = 1.06 × 10−9) located in HHIP, a well-replicated height-associated gene. Suggestive levels of association were found for other known genes associated with height (P values < 1 × 10−6): ADAMTSL3, EFEMP1, GPR126, and HMGA2; and BMI (P values < 1 × 10−4): FTO and MC4R. Together, these variants explain less than 2% of total phenotypic variation for height and 0.5% for BMI.
Single nucleotide polymorphisms (SNPs) discovered by genome-wide association studies (GWASs) account for only a small fraction of the genetic variation of complex traits in human populations. Where is the remaining heritability? We estimated the proportion of variance for human height explained by 294,831 SNPs genotyped on 3,925 unrelated individuals using a linear model analysis, and validated the estimation method by simulations based upon the observed genotype data. We show that 45% of variance can be explained by considering all SNPs simultaneously. Thus, most of the heritability is not missing but has not previously been detected because the individual effects are too small to pass stringent significance tests. We provide evidence that the remaining heritability is due to incomplete linkage disequilibrium (LD) between causal variants and genotyped SNPs, exacerbated by causal variants having lower minor allele frequency (MAF) than the SNPs explored to date.
We have previously identified suggestive linkage for alcohol consumption in a community-based sample of Australian adults. In this companion paper, we explore the strength of genetic linkage signals for alcohol dependence symptoms.
An alcohol dependence symptom score, based on DSM-IIIR and DSM-IV criteria, was examined. Twins and their non-twin siblings (1654 males, 2518 females), aged 21–81 years, were interviewed, with 803 individuals interviewed on two occasions, approximately 10 years apart. Linkage analyses were conducted on datasets compiled to maximize data collected at either the younger or the older age. In addition, linkage was compared between full samples and truncated samples that excluded the lightest drinkers (approximately 10% of the sample).
Suggestive peaks on chromosome 5p (LODs > 2.2) were found in a region previously identified in alcohol linkage studies using clinical populations. Linkage signal strength was found to vary between full and truncated samples and when samples differed only on the collection age for a sample subset.
The results support the finding that large community samples can be informative in the study of alcohol-related traits.
alcohol dependence symptoms; genetic linkage analysis; community sample
Between 30% and 60% of the variation in personality traits is attributed to genetic influences. Attempts to unravel these genetic influences at the molecular level have, so far, been inconclusive. We performed the first genome-wide association study of Cloninger’s temperament scales in a sample of 5117 individuals, in order to identify common genetic variants underlying variation in personality. Participants’ scores on Harm Avoidance, Novelty Seeking, Reward Dependence, and Persistence were tested for association with 1,252,387 genetic markers. We also performed gene-based association tests and biological pathway analyses. No genetic variants that significantly contribute to personality variation were identified, even though our sample provides over 90% power to detect variants that explain only 1% of the trait variance. This indicates that individual common genetic variants of this size or greater do not contribute to personality trait variation, which has important implications regarding the genetic architecture of personality and the evolutionary mechanisms by which heritable variation is maintained.
genome-wide association; genes; personality; temperament; mutation; selection; maintenance of genetic variation; evolution
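The power statement above can be sanity-checked with a standard approximation for a single-SNP test on a quantitative trait: the 1-df test statistic is non-central chi-square with non-centrality N·q²/(1−q²), where q² is the proportion of trait variance explained. The sketch below assumes a two-sided genome-wide threshold of 5 × 10−8 and unrelated individuals; neither is stated in the abstract, and the function name is hypothetical.

```python
from statistics import NormalDist


def gwas_power(n: int, var_explained: float, alpha: float = 5e-8) -> float:
    """Approximate power of a 1-df association test for a quantitative trait.

    n:             sample size (treated as unrelated individuals)
    var_explained: proportion of trait variance explained by the variant
    alpha:         two-sided significance threshold (assumed, not from the source)
    """
    std_normal = NormalDist()
    ncp = n * var_explained / (1.0 - var_explained)  # non-centrality parameter
    shift = ncp ** 0.5
    z_crit = -std_normal.inv_cdf(alpha / 2.0)        # critical |z| value
    # P(|Z + shift| > z_crit) for a standard normal Z.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


# Sample size and effect size taken from the abstract.
print(gwas_power(n=5117, var_explained=0.01))
```

Under these assumptions the result comes out above 0.9, in line with the abstract's statement, and power falls off steeply for variants explaining much less than 1% of the variance.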
Genome-wide association studies (GWAS) have become a major strategy for genetic dissection of human complex diseases. Analysing multiple phenotypes jointly may improve both our ability to detect genetic variants with multiple effects and our understanding of their common features. Allelic associations for multiple biochemical traits (serum alanine aminotransferase, aspartate aminotransferase, butyrylcholinesterase (BCHE), C-reactive protein (CRP), ferritin, gamma glutamyltransferase (GGT), glucose, high-density lipoprotein cholesterol (HDL), insulin, low-density lipoprotein cholesterol (LDL), triglycerides and uric acid), and body-mass index, were examined.
We aimed to identify common genetic variants affecting more than one of these traits using genome-wide association analysis in 2548 adolescents and 9145 adults from 4986 Australian twin families. Multivariate and univariate associations were performed.
Multivariate analyses identified eight loci, and univariate association analyses confirmed two loci influencing more than one trait at p < 5 × 10−8. These are located on chromosome 8 (LPL gene affecting HDL and triglycerides) and chromosome 19 (TOMM40/APOE-C1-C2-C4 gene cluster affecting LDL and CRP). A locus on chromosome 12 (OASL gene) showed effects on GGT, LDL and CRP. The loci on chromosomes 12 and 19 unexpectedly affected LDL cholesterol and CRP in opposite directions.
We identified three possible loci that may affect multiple traits and validated 17 previously-reported loci. Our study demonstrated the usefulness of examining multiple phenotypes jointly and highlights an anomalous effect on CRP, which is increasingly recognised as a marker of cardiovascular risk as well as of inflammation.