We propose a method to analyze family-based samples together with unrelated cases and controls. The method builds on the idea of matched case–control analysis using conditional logistic regression (CLR). For each trio within the family, a case (the proband) and matched pseudo-controls are constructed, based upon the transmitted and untransmitted alleles. Unrelated controls, matched by genetic ancestry, supplement the sample of pseudo-controls; likewise, unrelated cases are paired with genetically matched controls. Within each matched stratum, the case genotype is contrasted with control and pseudo-control genotypes via CLR, using a method we call matched-CLR (mCLR). Eigenanalysis of numerous SNP genotypes provides a tool for mapping genetic ancestry. The result of such an analysis can be thought of as a multidimensional map, or eigenmap, in which the relative genetic similarities and differences amongst individuals are encoded. Once the map is constructed, new individuals can be projected onto it based on their genotypes. Successful differentiation of individuals of distinct ancestry depends on having a diverse, yet representative sample from which to construct the ancestry map. Once samples are well-matched, mCLR yields comparable power to competing methods while ensuring excellent control over Type I error.
conditional logistic regression; family-based design; genome-wide association; matched case–control; population stratification
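The conditional likelihood behind a matched analysis can be illustrated with a minimal sketch. It assumes one case per stratum, additive genotype coding (0, 1 or 2 copies of an allele) and a single log-odds parameter `beta`; the function names and the toy strata are hypothetical, and this is a generic 1:M conditional logistic fit rather than the mCLR implementation itself.

```python
import math

def conditional_loglik(beta, strata):
    """Conditional log-likelihood for matched sets with one case each.

    Each stratum is (case_genotype, [control_or_pseudo_control_genotypes]).
    """
    total = 0.0
    for case_g, control_gs in strata:
        members = [case_g] + list(control_gs)
        denom = sum(math.exp(beta * g) for g in members)
        total += beta * case_g - math.log(denom)
    return total

def fit_beta(strata, n_iter=50, tol=1e-10):
    """Newton-Raphson estimate of the per-allele log odds ratio."""
    beta = 0.0
    for _ in range(n_iter):
        score, info = 0.0, 0.0
        for case_g, control_gs in strata:
            members = [case_g] + list(control_gs)
            weights = [math.exp(beta * g) for g in members]
            s = sum(weights)
            mean = sum(w * g for w, g in zip(weights, members)) / s
            second = sum(w * g * g for w, g in zip(weights, members)) / s
            score += case_g - mean          # observed minus expected genotype
            info += second - mean ** 2      # within-stratum genotype variance
        if info == 0.0:
            break                           # no within-stratum variation
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    return beta

# Hypothetical strata: (case genotype, matched control genotypes).
toy_strata = [
    (2, [1, 0, 1]), (1, [0, 0, 1]), (1, [1, 2, 0]),
    (0, [1, 0, 0]), (2, [1, 1, 0]), (1, [0, 1, 1]),
]
beta_hat = fit_beta(toy_strata)
print("estimated log odds ratio:", round(beta_hat, 3))
```

Because each stratum contributes only through contrasts among its own members, strata in which every genotype is identical carry no information about `beta`.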
As one approach to uncovering the genetic underpinnings of complex disease, individuals are measured at a large number of genetic variants (usually SNPs) across the genome and these SNP genotypes are assessed for association with disease status. We propose a new statistical method called Spectral-GEM for the analysis of genome-wide association studies; the goal of Spectral-GEM is to quantify the ancestry of the sample from such genotypic data. Ignoring structure due to differential ancestry can lead to an excess of spurious findings and reduce power. Ancestry is commonly estimated using the eigenvectors derived from principal component analysis (PCA). To develop an alternative to PCA we draw on connections between multidimensional scaling and spectral graph theory. Our approach, based on a spectral embedding derived from the normalized Laplacian of a graph, can produce more meaningful delineation of ancestry than by using PCA. Often the results from Spectral-GEM are straightforward to interpret and therefore useful in association analysis. We illustrate the new algorithm with an analysis of the POPRES data [Nelson et al., 2008].
eigenanalysis; genome-wide association; principal component analysis; population structure
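A spectral embedding from a normalized graph Laplacian can be sketched in a few lines. The example assumes a small symmetric weight matrix `W` of pairwise similarities between individuals (hypothetical values; how Spectral-GEM derives its weights from genotypes is not reproduced here) and uses plain power iteration in place of a full eigensolver. The leading non-trivial eigenvector of S = D^{-1/2} W D^{-1/2}, which corresponds to the smallest non-zero eigenvalue of the normalized Laplacian I - S, serves as the first ancestry axis.

```python
import math

def normalized_affinity(W):
    """Return S = D^{-1/2} W D^{-1/2} and the degree vector d."""
    d = [sum(row) for row in W]
    n = len(W)
    S = [[W[i][j] / math.sqrt(d[i] * d[j]) for j in range(n)] for i in range(n)]
    return S, d

def second_eigenvector(S, d, n_iter=500):
    """Leading non-trivial eigenvector of S via deflated power iteration."""
    n = len(S)
    norm_d = math.sqrt(sum(d))
    v1 = [math.sqrt(di) / norm_d for di in d]   # trivial eigenvector (eigenvalue 1)
    x = [float(i + 1) for i in range(n)]        # deterministic starting vector
    for _ in range(n_iter):
        dot = sum(a * b for a, b in zip(x, v1))
        x = [a - dot * b for a, b in zip(x, v1)]            # remove trivial direction
        # Iterate with (S + I) / 2 so that all eigenvalues are non-negative.
        y = [0.5 * (sum(S[i][j] * x[j] for j in range(n)) + x[i]) for i in range(n)]
        norm = math.sqrt(sum(a * a for a in y))
        x = [a / norm for a in y]
    return x

# Hypothetical similarities: two groups of three individuals,
# strongly similar within a group (1.0), weakly similar across groups (0.05).
groups = [0, 0, 0, 1, 1, 1]
W = [[0.0 if i == j else (1.0 if groups[i] == groups[j] else 0.05)
      for j in range(6)] for i in range(6)]
S, d = normalized_affinity(W)
axis = second_eigenvector(S, d)
print([round(a, 3) for a in axis])
```

In this toy graph the sign of each coordinate separates the two groups; with real genotype data several eigenvectors would typically be retained.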
Corticobasal degeneration (CBD) is a neurodegenerative disorder affecting movement and cognition, definitively diagnosed only at autopsy. Here we conduct a GWAS in CBD cases (n = 152) and 3,311 controls, and 67 CBD cases and 439 controls in a replication stage. Associations with meta-analysis were 17q21 at MAPT (P = 1.42 × 10−12), 8p12 at lnc-KIF13B-1, a long non-coding RNA (rs643472; P = 3.41 × 10−8), and 2p22 at SOS1 (rs963731; P = 1.76 × 10−7). Testing for association of CBD with top PSP GWAS SNPs identified associations at MOBP (3p22; rs1768208; P = 2.07 × 10−7) and MAPT H1c (17q21; rs242557; P = 7.91 × 10−6). We previously reported SNP/transcript level associations with rs8070723/MAPT, rs242557/MAPT, and rs1768208/MOBP and herein identified association with rs963731/SOS1. We identify new CBD susceptibility loci and show that CBD and PSP share a genetic risk factor other than MAPT, at 3p22 MOBP (myelin-associated oligodendrocyte basic protein).
Corticobasal degeneration (CBD) is a neurodegenerative disorder affecting movement and cognition, definitively diagnosed only at autopsy. Here, we conduct a genome-wide association study (GWAS) in CBD cases (n=152) and 3,311 controls, and 67 CBD cases and 439 controls in a replication stage. Associations with meta-analysis were 17q21 at MAPT (P=1.42 × 10−12), 8p12 at lnc-KIF13B-1, a long non-coding RNA (rs643472; P=3.41 × 10−8), and 2p22 at SOS1 (rs963731; P=1.76 × 10−7). Testing for association of CBD with top progressive supranuclear palsy (PSP) GWAS single-nucleotide polymorphisms (SNPs) identified associations at MOBP (3p22; rs1768208; P=2.07 × 10−7) and MAPT H1c (17q21; rs242557; P=7.91 × 10−6). We previously reported SNP/transcript level associations with rs8070723/MAPT, rs242557/MAPT, and rs1768208/MOBP and herein identified association with rs963731/SOS1. We identify new CBD susceptibility loci and show that CBD and PSP share a genetic risk factor other than MAPT at 3p22 MOBP (myelin-associated oligodendrocyte basic protein).
Corticobasal degeneration is a rare neurodegenerative disorder that can only be definitively diagnosed by autopsy. Here, Kouri et al. conduct a genome-wide association study and identify two genetic susceptibility loci, 17q21 (MAPT) and 3p22 (MOBP), and a novel susceptibility locus at 8p12.
The genetic architecture of autism spectrum disorder involves the interplay of common and rare variation and their impact on hundreds of genes. Using exome sequencing, analysis of rare coding variation in 3,871 autism cases and 9,937 ancestry-matched or parental controls implicates 22 autosomal genes at a false discovery rate (FDR) < 0.05, and a set of 107 autosomal genes strongly enriched for those likely to affect risk (FDR < 0.30). These 107 genes, which show unusual evolutionary constraint against mutations, incur de novo loss-of-function mutations in over 5% of autistic subjects. Many of the genes implicated encode proteins for synaptic, transcriptional, and chromatin remodeling pathways. These include voltage-gated ion channels regulating propagation of action potentials, pacemaking, and excitability-transcription coupling, as well as histone-modifying enzymes and chromatin remodelers, most prominently those mediating post-translational lysine methylation/demethylation of histones.
Recent studies implicate chromatin modifiers in autism spectrum disorder (ASD) through the identification of recurrent de novo loss of function mutations in affected individuals. ASD risk genes are co-expressed in human midfetal cortex, suggesting that ASD risk genes converge in specific regulatory networks during neurodevelopment. To elucidate such networks we identify genes targeted by CHD8, a chromodomain helicase strongly associated with ASD, in human midfetal brain, human neural stem cells (hNSCs) and embryonic mouse cortex. CHD8 targets are strongly enriched for other ASD risk genes in both human and mouse neurodevelopment, and converge in ASD-associated co-expression networks in human midfetal cortex. CHD8 knockdown in hNSCs results in dysregulation of ASD risk genes directly targeted by CHD8. Integration of CHD8 binding data into ASD risk models improves detection of risk genes. These results suggest loss of CHD8 contributes to ASD by perturbing an ancient gene regulatory network during human brain development.
Autism genes converge in midfetal cortical co-expression networks, and chromatin regulators such as CHD8 are increasingly associated with autism spectrum disorder (ASD). Here the authors map CHD8 targets in developing brain, and find that CHD8 directly regulates other ASD risk genes during human neurodevelopment.
Neurocognitive impairments in schizophrenia are well replicated and widely regarded as candidate endophenotypes that may facilitate understanding of schizophrenia genetics and pathophysiology. The Project Among African-Americans to Explore Risks for Schizophrenia (PAARTNERS) aims to identify genes underlying liability to schizophrenia. The unprecedented size of its study group (N=1,872), made possible through use of a computerized neurocognitive battery, can help further investigation of the genetics of neurocognition. The current analysis evaluated two characteristics not fully addressed in prior research: 1) heritability of neurocognition in African American families and 2) relationship between neurocognition and psychopathology in families of African American probands with schizophrenia or schizoaffective disorder.
Across eight data collection sites, patients with schizophrenia or schizoaffective disorder (N=610), their biological relatives (N=928), and community comparison subjects (N=334) completed a standardized diagnostic evaluation and the computerized neurocognitive battery. Performance accuracy and response time (speed) were measured separately for 10 neurocognitive domains.
The patients with schizophrenia or schizoaffective disorder exhibited less accuracy and speed in most neurocognitive domains than their relatives both with and without other psychiatric disorders, who in turn were more impaired than comparison subjects in most domains. Estimated trait heritability after inclusion of the mean effect of diagnostic status, age, and sex revealed significant heritabilities for most neurocognitive domains, with the highest for accuracy of abstraction/flexibility, verbal memory, face memory, spatial processing, and emotion processing and for speed of attention.
Neurocognitive functions in African American families are heritable and associated with schizophrenia. They show potential for gene-mapping studies.
Spontaneously arising (‘de novo’) mutations play an important role in medical genetics. For diseases with extensive locus heterogeneity – such as autism spectrum disorders (ASDs) – the signal from de novo mutations (DNMs) is distributed across many genes, making it difficult to distinguish disease-relevant mutations from background variation. We provide a statistical framework for the analysis of DNM excesses per gene and gene set by calibrating a model of de novo mutation. We applied this framework to DNMs collected from 1,078 ASD trios and – while affirming a significant role for loss-of-function (LoF) mutations – found no excess of de novo LoF mutations in cases with IQ above 100, suggesting that the role of DNMs in ASD may reside in fundamental neurodevelopmental processes. We also used our model to identify ~1,000 genes that are significantly lacking functional coding variation in non-ASD samples and are enriched for de novo LoF mutations identified in ASD cases.
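A per-gene test for an excess of de novo mutations can be sketched as a one-sided Poisson test, assuming the expected count in a gene is 2 × (number of trios) × (per-chromosome, per-generation mutation rate for that gene and mutation class). The trio count is taken from the abstract; the mutation rate and observed count below are hypothetical, and the calibrated mutation model itself is not reproduced.

```python
import math

def poisson_upper_tail(k, lam, n_terms=200):
    """P(X >= k) for X ~ Poisson(lam), with lam > 0 and integer k >= 0.

    The tail is summed directly so that very small p-values keep precision.
    """
    if k <= 0:
        return 1.0
    term = math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))  # P(X = k)
    total = 0.0
    for i in range(k, k + n_terms):
        total += term
        term *= lam / (i + 1)   # P(X = i + 1) from P(X = i)
    return total

def de_novo_excess_pvalue(observed, n_trios, mu_gene):
    """One-sided p-value for observing at least `observed` de novo events."""
    expected = 2.0 * n_trios * mu_gene   # two transmitted chromosomes per child
    return poisson_upper_tail(observed, expected)

# 1,078 trios (from the abstract); hypothetical LoF rate of 2e-6 for one gene.
p = de_novo_excess_pvalue(observed=2, n_trios=1078, mu_gene=2e-6)
print("p-value for two de novo LoF events:", p)
```

In an exome-wide scan the resulting p-values would still need correction for the number of genes tested.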
A key component of genetic architecture is the allelic spectrum influencing trait variability. For autism spectrum disorder (henceforth autism) the nature of its allelic spectrum is uncertain. Individual risk genes have been identified from rare variation, especially de novo mutations [1–8]. From this evidence one might conclude that rare variation dominates its allelic spectrum, yet recent studies show that common variation, individually of small effect, has substantial impact en masse [9,10]. At issue is how much of an impact relative to rare variation. Using a unique epidemiological sample from Sweden, novel methods that distinguish total narrow-sense heritability from that due to common variation, and by synthesizing results from other studies, we reach several conclusions about autism’s genetic architecture: its narrow-sense heritability is ≈54% and most traces to common variation; rare de novo mutations contribute substantially to individuals’ liability; still their contribution to variance in liability, 2.6%, is modest compared to heritable variation.
Whole-exome sequencing (WES) studies have demonstrated the contribution of de novo loss-of-function single nucleotide variants to autism spectrum disorders (ASD). However, challenges in the reliable detection of de novo insertions and deletions (indels) have limited inclusion of these variants in prior analyses. Through the application of a robust indel detection method to WES data from 787 ASD families (2,963 individuals), we demonstrate that de novo frameshift indels contribute to ASD risk (OR=1.6; 95%CI=1.0-2.7; p=0.03), are more common in female probands (p=0.02), are enriched among genes encoding FMRP targets (p=6×10−9), and arise predominantly on the paternal chromosome (p<0.001). Based on mutation rates in probands versus unaffected siblings, de novo frameshift indels contribute to risk in approximately 3.0% of individuals with ASD. Finally, through observing clustering of mutations in unrelated probands, we report two novel ASD-associated genes: KMT2E (MLL5), a chromatin regulator, and RIMS1, a regulator of synaptic vesicle release.
Autism spectrum disorder (ASD) is a complex developmental syndrome of unknown etiology. Recent studies employing exome- and genome-wide sequencing have identified nine high-confidence ASD (hcASD) genes. Working from the hypothesis that ASD-associated mutations in these biologically pleiotropic genes will disrupt intersecting developmental processes to contribute to a common phenotype, we have attempted to identify time periods, brain regions, and cell types in which these genes converge. We have constructed coexpression networks based on the hcASD “seed” genes, leveraging a rich expression data set encompassing multiple human brain regions across human development and into adulthood. By assessing enrichment of an independent set of probable ASD (pASD) genes, derived from the same sequencing studies, we demonstrate a key point of convergence in midfetal layer 5/6 cortical projection neurons. This approach informs when, where, and in what cell types mutations in these specific genes may be productively studied to clarify ASD pathophysiology.
Brain development follows a different trajectory in children with Autism Spectrum Disorders (ASD) than in typically developing children. A proxy for neurodevelopment could be head circumference (HC), but studies assessing HC and its clinical correlates in ASD have been inconsistent. This study investigates HC and clinical correlates in the Simons Simplex Collection cohort.
We used a mixed linear model to estimate effects of covariates and the deviation from the expected HC given parental HC (genetic deviation). After excluding individuals with incomplete data, 7225 individuals in 1891 families remained for analysis. We examined the relationship between HC/genetic deviation of HC and clinical parameters.
Gender, age, height, weight, genetic ancestry and ASD status were significant predictors of HC (estimate of the ASD effect=0.2cm). HC was approximately normally distributed in probands and unaffected relatives, with only a few outliers. Genetic deviation of HC was also normally distributed, consistent with a random sampling of parental genes. Whereas larger HC than expected was associated with ASD symptom severity and regression, IQ decreased with the absolute value of the genetic deviation of HC.
Measured against expected values derived from covariates of ASD subjects, statistical outliers for HC were uncommon. HC is a strongly heritable trait and population norms for HC would be far more accurate if covariates including genetic ancestry, height and age were taken into account. The association of diminishing IQ with absolute deviation from predicted HC values suggests HC could reflect subtle underlying brain development and warrants further investigation.
head circumference; body metrics; genetic ancestry; IQ; autism spectrum disorder; ASD
Two common sources of DNA for whole exome sequencing (WES) are whole blood (WB) and immortalized lymphoblastoid cell line (LCL). However, it is possible that LCLs have a substantially higher rate of mutation than WB, causing concern for their use in sequencing studies. We compared results from paired WB and LCL DNA samples for 16 subjects, using LCLs of low passage number (<5). Using a standard analysis pipeline we detected a large number of discordant genotype calls (approximately 50 per subject) that we segregated into categories of “confidence” based on read-level quality metrics. From these categories and validation by Sanger sequencing, we estimate that the vast majority of the candidate differences were false positives and that our categories were effective in predicting valid sequence differences, including LCLs with putative mosaicism for the non-reference allele (3–4 per exome). These results validate the use of DNA from LCLs of low passage number for exome sequencing.
graphical diagnostics; lymphoblastoid cell line; mosaicism; sequence variant call; strand bias; somatic mutation
The liability to addiction has been shown to be highly genetically correlated across drug classes, suggesting nondrug-specific mechanisms.
In 757 subjects, we performed association analysis between 1536 single nucleotide polymorphisms (SNPs) in 106 candidate genes and a drug use disorder diagnosis (DUD).
Associations (p ≤ .0008) were detected with three SNPs in the arginine vasopressin 1A receptor gene, AVPR1A, with a gene-wise p value of 3 × 10−5. Bioinformatic evidence points to a role for rs11174811 (microRNA binding site disruption) in AVPR1A function. Based on literature implicating AVPR1A in social bonding, we tested spousal satisfaction as a mediator of the association of rs11174811 with the DUD. Spousal satisfaction was significantly associated with DUD in males (p <.0001). The functional AVPR1A SNP, rs11174811, was associated with spousal satisfaction in males (p = .007). Spousal satisfaction was a significant mediator of the relationship between rs11174811 and DUD. We also present replication of the association in males between rs11174811 and substance use in one clinically ascertained (n = 1399) and one epidemiologic sample (n = 2231). The direction of the association is consistent across the clinically-ascertained samples but reversed in the epidemiologic sample. Lastly, we found a significant impact of rs11174811 genotype on AVPR1A expression in a postmortem brain sample.
The findings of this study call for expansion of research into the role of the arginine vasopressin and other neuropeptide system variation in DUD liability.
Addiction; alcoholism; gene systems; genetic association; social relationships; vasopressin
We previously reported genome-wide significant evidence for linkage between chromosome 6q and bipolar I disorder (BPI) by performing a meta-analysis of original genotype data from 11 genome scan linkage studies. We now present follow-up linkage disequilibrium mapping of the linked region utilizing 3,047 single nucleotide polymorphism (SNP) markers in a case–control sample (N = 530 cases, 534 controls) and family-based sample (N = 256 nuclear families, 1,301 individuals). The strongest single SNP result (rs6938431, P=6.72× 10−5) was observed in the case–control sample, near the solute carrier family 22, member 16 gene (SLC22A16). In a replication study, we genotyped 151 SNPs in an independent sample (N = 622 cases, 1,181 controls) and observed further evidence of association between variants at SLC22A16 and BPI. Although consistent evidence of association with any single variant was not seen across samples, SNP-wise and gene-based test results in the three samples provided convergent evidence for association with SLC22A16, a carnitine transporter, implicating this gene as a novel candidate for BPI risk. Further studies in larger samples are warranted to clarify which, if any, genes in the 6q region confer risk for bipolar disorder.
bipolar disorder; genetic; association; SLC22A16; 6q
De novo loss-of-function (dnLoF) mutations are found twofold more often in autism spectrum disorder (ASD) probands than their unaffected siblings. Multiple independent dnLoF mutations in the same gene implicate the gene in risk and hence provide a systematic, albeit arduous, path forward for ASD genetics. It is likely that using additional non-genetic data will enhance the ability to identify ASD genes.
To accelerate the search for ASD genes, we developed a novel algorithm, DAWN, to model two kinds of data: rare variations from exome sequencing and gene co-expression in the mid-fetal prefrontal and motor-somatosensory neocortex, a critical nexus for risk. The algorithm casts the ensemble data as a hidden Markov random field in which the graph structure is determined by gene co-expression and it combines these interrelationships with node-specific observations, namely gene identity, expression, genetic data and the estimated effect on risk.
Using currently available genetic data and a specific developmental time period for gene co-expression, DAWN identified 127 genes that plausibly affect risk, and a set of likely ASD subnetworks. Validation experiments making use of published targeted resequencing results demonstrate its efficacy in reliably predicting ASD genes. DAWN also successfully predicts known ASD genes, not included in the genetic data used to create the model.
Validation studies demonstrate that DAWN is effective in predicting ASD genes and subnetworks by leveraging genetic and gene expression data. The findings reported here implicate neurite extension and neuronal arborization as risks for ASD. Using DAWN on emerging ASD sequence data and gene expression data from other brain regions and tissues would likely identify novel ASD genes. DAWN can also be used for other complex disorders to identify genes and subnetworks in those disorders.
Autism; Risk prediction; Gene discovery; Weighted gene co-expression network analysis; Network; Hidden Markov random field; Neurite extension; Neuronal arborization
Given prior evidence for the contribution of rare copy number variations (CNVs) to autism spectrum disorders (ASD), we studied these events in 4,457 individuals from 1,174 simplex families, composed of parents, a proband and, in most kindreds, an unaffected sibling. We find significant association of ASD with de novo duplications of 7q11.23, where the reciprocal deletion causes Williams-Beuren syndrome, featuring a highly social personality. We identify rare recurrent de novo CNVs at five additional regions including two novel ASD loci, 16p13.2 (including the genes USP7 and C16orf72) and Cadherin13, and implement a rigorous new approach to evaluating the statistical significance of these observations. Overall, we find large de novo CNVs carry substantial risk (OR=3.55; CI =2.16-7.46, p=6.9 × 10−6); estimate the presence of 130-234 distinct ASD-related CNV intervals across the genome; and, based on data from multiple studies, present compelling evidence for the association of rare de novo events at 7q11.23, 15q11.2-13.1, 16p11.2, and Neurexin1.
Recent technological advances coupled with large sample sets have uncovered many factors underlying the genetic basis of traits and the predisposition to complex disease, but much is left to discover. A common thread to most genetic investigations is familial relationships. Close relatives can be identified from family records, and more distant relatives can be inferred from large panels of genetic markers. Unfortunately these empirical estimates can be noisy, especially regarding distant relatives. We propose a new method for denoising genetically inferred relationship matrices by exploiting the underlying structure due to hierarchical groupings of correlated individuals. The approach, which we call Treelet Covariance Smoothing, employs a multiscale decomposition of covariance matrices to improve estimates of pairwise relationships. On both simulated and real data, we show that smoothing leads to better estimates of the relatedness amongst distantly related individuals. We illustrate our method with a large genome-wide association study and estimate the “heritability” of body mass index quite accurately. Traditionally heritability, defined as the fraction of the total trait variance attributable to additive genetic effects, is estimated from samples of closely related individuals using random effects models. We show that by using smoothed relationship matrices we can estimate heritability using population-based samples. Finally, while our methods have been developed for refining genetic relationship matrices and improving estimates of heritability, they have much broader potential application in statistics, most notably for errors-in-variables random effects models and for settings that require regularization of matrices with block or hierarchical structure.
Covariance estimation; cryptic relatedness; genome-wide association; heritability; kinship
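The definition of heritability used above can be written out under a standard random effects formulation. The notation is an assumption introduced for illustration: y is the trait vector, X the covariate matrix, A the (possibly smoothed) genetic relationship matrix, and σ_g², σ_e² the additive genetic and residual variances.

```latex
\begin{aligned}
y &= X\beta + g + e, \\
g &\sim N\!\left(0,\ \sigma_g^2 A\right), \qquad e \sim N\!\left(0,\ \sigma_e^2 I\right), \\
h^2 &= \frac{\sigma_g^2}{\sigma_g^2 + \sigma_e^2}.
\end{aligned}
```

Under this formulation, noise in the entries of A propagates directly into the estimate of σ_g², which is why denoising the relationship matrix matters when relatives are distant.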
To characterize the role of rare complete human knockouts in autism spectrum disorders (ASD), we identify genes with homozygous or compound heterozygous loss-of-function (LoF) variants (defined as nonsense and essential splice sites) from exome sequencing of 933 cases and 869 controls. We identify a two-fold increase in complete knockouts of autosomal genes with low rates of LoF variation (≤5% frequency) in cases and estimate a 3% contribution to ASD risk by these events, confirming this observation in an independent set of 563 probands and 4,605 controls. Outside the pseudo-autosomal regions on the X-chromosome, we similarly observe a significant 1.5-fold increase in rare hemizygous knockouts in males, contributing to another 2% of ASDs in males. Taken together these results provide compelling evidence that rare autosomal and X-chromosome complete gene knockouts are important inherited risk factors for ASD.
Genome-wide association studies (GWAS) implicate single nucleotide polymorphisms (SNPs) on chromosome 6p21.3-22.1, the human leukocyte antigen (HLA) region, as common risk factors for schizophrenia (SZ). Other studies implicate viral and protozoan exposure. Our study tests chromosome 6p SNPs for effects on SZ risk with and without exposure. Method: GWAS-significant SNPs and ancestry-informative marker SNPs were analyzed among African American patients with SZ (n = 604) and controls (n = 404). Exposure to herpes simplex virus, type 1 (HSV-1), cytomegalovirus (CMV), and Toxoplasma gondii (TOX) was assayed using specific antibody assays. Results: Five SNPs were nominally associated with SZ, adjusted for population admixture (P < .05, uncorrected for multiple comparisons). These SNPs were next analyzed in relation to infectious exposure. Multivariate analysis indicated significant association between rs3130297 genotype and HSV-1 exposure; the associated allele was different from the SZ risk allele. Conclusions: We propose a model for the genesis of SZ incorporating genomic variation in the HLA region and neurotropic viral exposure for testing in additional, independent African American samples.
HLA; gene; HSV-1; cytomegalovirus; schizophrenia; African American
We evaluated the hypothesis that dopaminergic polymorphisms are risk factors for schizophrenia (SZ). In stage I, we screened 18 dopamine-related genes in two independent US Caucasian samples: 150 trios and 328 cases/501 controls. The most promising associations were detected with SLC6A3 (alias DAT), DRD3, COMT and SLC18A2 (alias VMAT2). In stage II, we comprehensively evaluated these four genes by genotyping 68 SNPs in all 478 cases and 501 controls from stage I. Fifteen (23.1%) significant associations were found (p ≤ 0.05). We sought epistasis between pairs of SNPs providing evidence of a main effect and observed 17 significant interactions (169 tests); 41.2% of significant interactions involved rs3756450 (5′ near promoter) or rs464049 (intron 4) at SLC6A3. In stage III, we confirmed our findings by genotyping 65 SNPs among 659 Bulgarian trios. Both SLC6A3 variants implicated in the US interactions were overtransmitted in this cohort (rs3756450, p = 0.035; rs464049, p = 0.011). Joint analyses from stages II and III identified associations at all four genes (pjoint < 0.05). We tested 29 putative interactions from stage II and detected replication between seven locus pairs (p ≤ 0.05). Simulations suggested our stage II and stage III interaction results were unlikely to have occurred by chance (p = 0.008 and 0.001, respectively). In stage IV, we evaluated rs464049 and rs3756450 for functional effects and found significant allele-specific differences at rs3756450 using electrophoretic mobility shift assays and dual-luciferase promoter assays. Our data suggest that a network of dopaminergic polymorphisms increases risk for SZ.
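Overtransmission of an allele in trios is classically assessed with the transmission disequilibrium test (TDT). The abstract does not state which test was used, so the sketch below is an assumption, and the transmission counts are hypothetical. Among heterozygous parents, `b` counts transmissions of the allele of interest and `c` counts non-transmissions; under the null hypothesis the statistic (b − c)² / (b + c) follows a chi-square distribution with one degree of freedom.

```python
import math

def tdt(transmitted, untransmitted):
    """McNemar-type TDT statistic and its 1-df chi-square p-value."""
    b, c = transmitted, untransmitted
    if b + c == 0:
        raise ValueError("no informative (heterozygous) parents")
    chi2 = (b - c) ** 2 / (b + c)
    # For one degree of freedom, P(chi-square > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical counts: 60 transmissions versus 40 non-transmissions.
chi2, p_value = tdt(60, 40)
print("chi-square:", chi2, "p-value:", round(p_value, 4))
```

Only heterozygous parents are informative, so the effective sample size is b + c rather than the number of trios.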
Supported by National Institute of Mental Health (NIMH), this 12-site international collaboration seeks to identify genetic variants that affect risk for anorexia nervosa (AN).
Four hundred families will be ascertained with two or more individuals affected with AN. The assessment battery produces a rich set of phenotypes comprising eating disorder diagnoses and psychological and personality features known to be associated with vulnerability to eating disorders.
We report attributes of the first 200 families, comprising 200 probands and 232 affected relatives.
These results provide context for the genotyping of the first 200 families by the Center for Inherited Disease Research. We will analyze our first 200 families for linkage, complete recruitment of roughly 400 families, and then perform final linkage analyses on the complete cohort. DNA, genotypes, and phenotypes will form a national eating disorder repository maintained by NIMH and available to qualified investigators.
anorexia nervosa; eating disorders; bulimia nervosa; psychiatric disorders; genetics; linkage analysis; genomics
De novo mutations affect risk for many diseases and disorders, especially those with early-onset. An example is autism spectrum disorders (ASD). Four recent whole-exome sequencing (WES) studies of ASD families revealed a handful of novel risk genes, based on independent de novo loss-of-function (LoF) mutations falling in the same gene, and found that de novo LoF mutations occurred at a twofold higher rate than expected by chance. However successful these studies were, they used only a small fraction of the data, excluding other types of de novo mutations and inherited rare variants. Moreover, such analyses cannot readily incorporate data from case-control studies. An important research challenge in gene discovery, therefore, is to develop statistical methods that accommodate a broader class of rare variation. We develop methods that can incorporate WES data regarding de novo mutations, inherited variants present, and variants identified within cases and controls. TADA, for Transmission And De novo Association, integrates these data by a gene-based likelihood model involving parameters for allele frequencies and gene-specific penetrances. Inference is based on a Hierarchical Bayes strategy that borrows information across all genes to infer parameters that would be difficult to estimate for individual genes. In addition to theoretical development we validated TADA using realistic simulations mimicking rare, large-effect mutations affecting risk for ASD and show it has dramatically better power than other common methods of analysis. Thus TADA's integration of various kinds of WES data can be a highly effective means of identifying novel risk genes. Indeed, application of TADA to WES data from subjects with ASD and their families, as well as from a study of ASD subjects and controls, revealed several novel and promising ASD candidate genes with strong statistical support.
The genetic underpinnings of autism spectrum disorder (ASD) have proven difficult to determine, despite a wealth of evidence for genetic causes and ongoing effort to identify genes. Recently investigators sequenced the coding regions of the genomes from ASD children along with their unaffected parents (ASD trios) and identified numerous new candidate genes by pinpointing spontaneously occurring (de novo) mutations in the affected offspring. A gene with a severe (de novo) mutation observed in more than one individual is immediately implicated in ASD; however, the majority of severe mutations are observed only once per gene. These genes create a short list of candidates, and our results suggest about 50% are true risk genes. To strengthen our inferences, we develop a novel statistical method (TADA) that utilizes inherited variation transmitted to affected offspring in conjunction with (de novo) mutations to identify risk genes. Through simulations we show that TADA dramatically increases power. We apply this approach to nearly 1000 ASD trios and 2000 subjects from a case-control study and identify several promising genes. Through simulations and application we show that TADA's integration of sequencing data can be a highly effective means of identifying risk genes.
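The idea of weighing de novo evidence per gene within a Bayesian framework can be illustrated with a simplified Poisson-Gamma sketch. It assumes the de novo count in a gene is Poisson with mean 2Nμγ, where N is the number of trios, μ the gene's mutation rate and γ the relative risk, with γ = 1 under the null and γ drawn from a Gamma prior with mean `mean_rr` under the alternative. All parameter values are hypothetical, and the full TADA likelihood, which also models inherited and case-control variants, is not reproduced.

```python
import math

def log_poisson_pmf(x, lam):
    """log P(X = x) for X ~ Poisson(lam)."""
    return -lam + x * math.log(lam) - math.lgamma(x + 1)

def log_gamma_poisson_marginal(x, c, shape, rate):
    """log P(x) when x | gamma ~ Poisson(c * gamma), gamma ~ Gamma(shape, rate).

    Integrating out gamma gives a negative binomial distribution.
    """
    return (math.lgamma(x + shape) - math.lgamma(shape) - math.lgamma(x + 1)
            + shape * math.log(rate / (rate + c))
            + x * math.log(c / (rate + c)))

def de_novo_bayes_factor(x, n_trios, mu, mean_rr, beta):
    """Bayes factor for 'risk gene' versus 'non-risk gene' given x de novo events."""
    c = 2.0 * n_trios * mu                      # expected count when gamma = 1
    log_alt = log_gamma_poisson_marginal(x, c, mean_rr * beta, beta)
    log_null = log_poisson_pmf(x, c)
    return math.exp(log_alt - log_null)

# Hypothetical settings: 1,000 trios, mutation rate 2e-6, prior mean relative risk 20.
for count in (0, 1, 2):
    print(count, de_novo_bayes_factor(count, 1000, 2e-6, mean_rr=20.0, beta=1.0))
```

With these settings, a gene with no de novo events yields a Bayes factor below 1, while recurrent events yield values well above 1, matching the intuition that multiple independent hits in the same gene are strong evidence.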
Progressive supranuclear palsy (PSP) is a neurodegenerative disorder pathologically characterized by intracellular tangles of hyperphosphorylated tau protein distributed throughout the neocortex, basal ganglia, and brainstem. A genome-wide association study identified EIF2AK3 as a risk factor for PSP. EIF2AK3 encodes PERK, part of the endoplasmic reticulum’s (ER) unfolded protein response (UPR). PERK is an ER membrane protein that senses unfolded protein accumulation within the ER lumen. Recently, several groups noted UPR activation in Alzheimer’s disease (AD), Parkinson’s disease (PD), amyotrophic lateral sclerosis, multiple system atrophy, and in the hippocampus and substantia nigra of PSP subjects. Here, we evaluate UPR PERK activation in the pons, medulla, midbrain, hippocampus, frontal cortex and cerebellum in subjects with PSP, AD, and in normal controls.
We found UPR activation primarily in disease-affected brain regions in both disorders. In PSP, the UPR was primarily activated in the pons and medulla and to a much lesser extent in the hippocampus. In AD, the UPR was extensively activated in the hippocampus. We also observed UPR activation in the hippocampus of some elderly normal controls, severity of which positively correlated with both age and tau pathology but not with Aβ plaque burden. Finally, we evaluated EIF2AK3 coding variants that influence PERK activation. We show that a haplotype associated with increased PERK activation is genetically associated with increased PSP risk.
The UPR is activated in disease-affected regions in PSP, and the genetic evidence shows that this activation increases risk for PSP and is not a protective response.
Progressive supranuclear palsy; PERK; Unfolded protein response; EIF2AK3; Alzheimer’s disease