Schizophrenia is a highly heritable disorder. Genetic risk is conferred by a large number of alleles, including common alleles of small effect that might be detected by genome-wide association studies. Here, we report a multi-stage schizophrenia genome-wide association study of up to 36,989 cases and 113,075 controls. We identify 128 independent associations spanning 108 conservatively defined loci that meet genome-wide significance, 83 of which have not been previously reported. Associations were enriched among genes expressed in brain providing biological plausibility for the findings. Many findings have the potential to provide entirely novel insights into aetiology, but associations at DRD2 and multiple genes involved in glutamatergic neurotransmission highlight molecules of known and potential therapeutic relevance to schizophrenia, and are consistent with leading pathophysiological hypotheses. Independent of genes expressed in brain, associations were enriched among genes expressed in tissues that play important roles in immunity, providing support for the hypothesized link between the immune system and schizophrenia.
Schizophrenia is a highly heritable neuropsychiatric disorder of complex genetic etiology. Previous genome-wide surveys have revealed a greater burden of large, rare CNVs in schizophrenia cases and identified multiple rare recurrent CNVs that increase risk of schizophrenia although with incomplete penetrance and pleiotropic effects. Identification of additional recurrent CNVs and biological pathways enriched for schizophrenia CNVs requires greater sample sizes. We conducted a genome-wide survey for CNVs associated with schizophrenia using a Swedish national sample (4,719 cases and 5,917 controls). High-confidence CNV calls were generated using genotyping array intensity data and their effect on risk of schizophrenia was measured. Our data confirm increased burden of large, rare CNVs in schizophrenia cases as well as significant associations for recurrent 16p11.2 duplications, 22q11.2 deletions and 3q29 deletions. We report a novel association for 17q12 duplications (odds ratio=4.16, P=0.018), previously associated with autism and mental retardation but not schizophrenia. Intriguingly, gene set association analyses implicate biological pathways previously associated with schizophrenia through common variation and exome sequencing (calcium channel signaling and binding partners of the fragile X mental retardation protein). We found significantly increased burden of the largest CNVs (>500Kb) in genes present in the post-synaptic density, in genomic regions implicated via schizophrenia genome-wide association studies, and in gene products localized to mitochondria and cytoplasm. Our findings suggest that multiple lines of genomic inquiry – genome-wide screens for CNVs, common variation, and exonic variation – are converging on similar sets of pathways and/or genes.
schizophrenia; genetics; genomics; copy number variation; structural variation
Inherited alleles account for most of the genetic risk for schizophrenia. However, new (de novo) mutations, in the form of large chromosomal copy number changes, occur in a small fraction of cases and disproportionally disrupt genes encoding postsynaptic proteins. Here, we show that small de novo mutations, affecting one or a few nucleotides, are overrepresented among glutamatergic postsynaptic proteins comprising activity-regulated cytoskeleton-associated protein (ARC) and N-methyl-D-aspartate receptor (NMDAR) complexes. Mutations are additionally enriched in proteins that interact with these complexes to modulate synaptic strength, namely proteins regulating actin filament dynamics and those whose mRNAs are targets of fragile X mental retardation protein (FMRP). Genes affected by mutations in schizophrenia overlap those mutated in autism and intellectual disability, as do mutation-enriched synaptic pathways. Aligning our findings with a parallel case-control study, we demonstrate reproducible insights into aetiological mechanisms for schizophrenia and reveal pathophysiology shared with other neurodevelopmental disorders.
Several recurrent copy number variants (CNVs) have been shown to increase the risk of developing schizophrenia (SCZ), developmental delay (DD), autism spectrum disorders (ASD) and various congenital malformations (CM). Their penetrance for SCZ has been estimated to be modest. However, comparisons between their penetrance for SCZ or DD/ASD/CM, or estimates of the total penetrance for any of these disorders have not been made yet.
We use data from the largest available studies on SCZ and DD/ASD/CM, including a new sample of 6882 cases and 6316 controls, to estimate the frequencies of 70 implicated CNVs, in carriers with these disorders, in healthy controls and in the general population. On the basis of these frequencies we estimate their penetrance. We also estimate the strength of the selection pressure against CNVs and correlate this against their overall penetrance.
The rates of nearly all CNVs are higher in DD/ASD/CM, compared to SCZ. The penetrance of CNVs is at least several times higher for the development of a disorder from the group of DD/ASD/CM. The overall penetrance of SCZ-associated CNVs for developing any disorder is high, ranging between 10.6% and 100%.
CNVs associated with SCZ have high pathogenicity. The majority of the increased risk conferred by CNVs is towards the development of an earlier-onset disorder, such as DD/ASD/CM, rather than SCZ. The penetrance of CNVs correlates strongly with their selection coefficients. The improved estimates of penetrance will provide crucial information for genetic counselling.
CNV; schizophrenia; penetrance; developmental delay; autism spectrum disorder; selection
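The abstract above states that penetrance is estimated from CNV frequencies in cases, controls and the general population, but does not give the formula. One common way to do this is Bayes' rule, sketched below. The formula choice, the function name `penetrance`, and all three input numbers are assumptions chosen for illustration only; they are not values from the study.

```python
def penetrance(freq_in_cases: float, freq_in_controls: float, disease_risk: float) -> float:
    """Estimate P(disease | carrier) with Bayes' rule.

    freq_in_cases    -- P(carrier | disease), CNV frequency among affected people
    freq_in_controls -- P(carrier | no disease), CNV frequency among unaffected people
    disease_risk     -- P(disease), lifetime risk in the general population
    """
    # Overall carrier frequency in the population, P(carrier).
    carrier_freq = (freq_in_cases * disease_risk
                    + freq_in_controls * (1.0 - disease_risk))
    # Bayes' rule: P(disease | carrier) = P(carrier | disease) * P(disease) / P(carrier).
    return freq_in_cases * disease_risk / carrier_freq


# Hypothetical inputs (NOT from the study): a CNV seen in 0.3% of cases,
# 0.03% of controls, for a disorder with 1% lifetime risk.
example = penetrance(freq_in_cases=0.003, freq_in_controls=0.0003, disease_risk=0.01)
print(f"Estimated penetrance: {example:.1%}")
```

Under this formulation, a CNV that is ten times more frequent in cases than in controls still yields a modest penetrance when the disorder itself is uncommon, which is in line with the abstract's remark that penetrance for SCZ alone has been estimated to be modest.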
Advances in genome analysis, accompanied by the assembly of large patient cohorts, have made possible successful genetic analyses of polygenic brain disorders. If the resulting molecular clues, previously hidden in the genomes of affected individuals, are to yield useful information about pathogenesis and inform the discovery of new treatments, neurobiology will have to rise to many difficult challenges. Here we review the underlying logic of the genetic investigations, describe in more detail progress in schizophrenia and autism, and outline the challenges for neurobiology that lie ahead. We argue that technologies at the disposal of neuroscience are adequately advanced to begin to study the biology of common and devastating polygenic disorders.
By analyzing the exome sequences of 2,536 schizophrenia cases and 2,543 controls, we have demonstrated a polygenic burden primarily arising from rare (<1/10,000), disruptive mutations distributed across many genes. Especially enriched genesets included the voltage-gated calcium ion channel and the signaling complex formed by the activity-regulated cytoskeleton-associated (ARC) scaffold protein of the postsynaptic density (PSD), sets previously implicated by genome-wide association studies (GWAS) and copy-number variation (CNV) studies. Similar to reports in autism, targets of the fragile X mental retardation protein (FMRP, product of FMR1) were enriched for case mutations. No individual gene-based test achieved significance after correction for multiple testing and we did not detect any alleles of moderately low frequency (~0.5-1%) and moderately large effect. Taken together, these data suggest that population-based exome sequencing can discover risk alleles and complements established gene mapping paradigms in neuropsychiatric disease.
An increased rate of de novo copy number variants (CNVs) has been found in schizophrenia (SZ), autism and developmental delay. An increased rate has also been reported in bipolar affective disorder (BD). Here, in a larger BD sample, we aimed to replicate these findings and compare de novo CNVs between SZ and BD. We used Illumina microarrays to genotype 368 BD probands, 76 SZ probands and all their parents. Copy number variants were called by PennCNV and filtered for frequency (<1%) and size (>10 kb). Putative de novo CNVs were validated with the z-score algorithm, manual inspection of log R ratios (LRR) and qPCR probes. We found 15 de novo CNVs in BD (4.1% rate) and 6 in SZ (7.9% rate). Combining results with previous studies and using a cut-off of >100 kb, the rate of de novo CNVs in BD was intermediate between controls and SZ: 1.5% in controls, 2.2% in BD and 4.3% in SZ. Only the differences between SZ and BD and SZ and controls were significant. The median size of de novo CNVs in BD (448 kb) was also intermediate between SZ (613 kb) and controls (338 kb), but only the comparison between SZ and controls was significant. Only one de novo CNV in BD was in a confirmed SZ locus (16p11.2). Sporadic or early onset cases were not more likely to have de novo CNVs. We conclude that de novo CNVs play a smaller role in BD compared with SZ. Patients with a positive family history can also harbour de novo mutations.
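The two rates quoted above follow directly from the reported counts. The short check below reproduces them; it assumes the rate is simply the number of de novo CNVs divided by the number of probands, and the helper name `rate_percent` is introduced here for illustration.

```python
def rate_percent(events: int, probands: int) -> float:
    """Return events per proband, expressed as a percentage."""
    return 100.0 * events / probands


# Counts taken from the abstract: 15 de novo CNVs in 368 BD probands,
# 6 de novo CNVs in 76 SZ probands.
bd_rate = rate_percent(15, 368)
sz_rate = rate_percent(6, 76)

print(f"BD de novo CNV rate: {bd_rate:.1f}%")
print(f"SZ de novo CNV rate: {sz_rate:.1f}%")
```

The combined rates quoted for the >100 kb cut-off (1.5%, 2.2% and 4.3%) cannot be rechecked this way, because the pooled counts from previous studies are not given in the abstract.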
Laboratory red blood cell (RBC) measurements are clinically important, heritable and differ among ethnic groups. To identify genetic variants that contribute to RBC phenotypes in African Americans (AAs), we conducted a genome-wide association study in up to ∼16 500 AAs. The alpha-globin locus on chromosome 16pter [lead SNP rs13335629 in ITFG3 gene; P < 1E−13 for hemoglobin (Hgb), RBC count, mean corpuscular volume (MCV), MCH and MCHC] and the G6PD locus on Xq28 [lead SNP rs1050828; P < 1E − 13 for Hgb, hematocrit (Hct), MCV, RBC count and red cell distribution width (RDW)] were each associated with multiple RBC traits. At the alpha-globin region, both the common African 3.7 kb deletion and common single nucleotide polymorphisms (SNPs) appear to contribute independently to RBC phenotypes among AAs. In the 2p21 region, we identified a novel variant of PRKCE distinctly associated with Hct in AAs. In a genome-wide admixture mapping scan, local European ancestry at the 6p22 region containing HFE and LRRC16A was associated with higher Hgb. LRRC16A has been previously associated with the platelet count and mean platelet volume in AAs, but not with Hgb. Finally, we extended to AAs the findings of association of erythrocyte traits with several loci previously reported in Europeans and/or Asians, including CD164 and HBS1L-MYB. In summary, this large-scale genome-wide analysis in AAs has extended the importance of several RBC-associated genetic loci to AAs and identified allelic heterogeneity and pleiotropy at several previously known genetic loci associated with blood cell traits in AAs.
Elevated resting heart rate is associated with greater risk of cardiovascular disease and mortality. In a 2-stage meta-analysis of genome-wide association studies in up to 181,171 individuals, we identified 14 new loci associated with heart rate and confirmed associations with all 7 previously established loci. Experimental downregulation of gene expression in Drosophila melanogaster and Danio rerio identified 20 genes at 11 loci that are relevant for heart rate regulation and highlight a role for genes involved in signal transmission, embryonic cardiac development and the pathophysiology of dilated cardiomyopathy, congenital heart failure and/or sudden cardiac death. In addition, genetic susceptibility to increased heart rate is associated with altered cardiac conduction and reduced risk of sick sinus syndrome, and both heart rate–increasing and heart rate–decreasing variants associate with risk of atrial fibrillation. Our findings provide fresh insights into the mechanisms regulating heart rate and identify new therapeutic targets.
Structurally complex genomic regions are not yet understood. One such locus, human 17q21.31, contains a megabase-long inversion polymorphism [1], many uncharacterized copy number variations (CNVs), and markers that associate with female fertility [1], female meiotic recombination [1–3], and neurological disease [4,5]. Additionally, the inverted H2 form of 17q21.31 appears to be positively selected in Europeans [1]. We developed a population-genetic approach to reveal complex genome structures and identified nine segregating structural forms of 17q21.31. Both the H1 and H2 forms of the 17q21.31 inversion polymorphism contain independently derived, partial duplications of the KANSL1 (KIAA1267) gene; these duplications, which produce novel KANSL1 transcripts, have both recently risen to high allele frequencies (26% and 19%) in Europeans. An older H2 form, lacking such a duplication, is present at low frequency in Europeans and Central African hunter-gatherer populations. We show that complex genome structures can be analyzed by imputation from SNPs.
Although copy number variants (CNVs) are important in genomic medicine, CNVs have not been systematically assessed for many complex traits. Several large rare CNVs increase risk for schizophrenia (SCZ) and autism and often demonstrate pleiotropic effects; however, their frequencies in the general population and other complex traits are unknown. Genotyping large numbers of samples is essential for progress. Large cohorts from many different diseases are being genotyped using exome-focused arrays designed to detect uncommon or rare protein-altering sequence variation. Although these arrays were not designed for CNV detection, the hybridization intensity data generated in each experiment could, in principle, be used for gene-focused CNV analysis. Our goal was to evaluate the extent to which CNVs can be detected using data from one particular exome array (the Illumina Human Exome Bead Chip). We genotyped 9,100 Swedish subjects (3,962 cases with SCZ and 5,138 controls) using both standard GWAS arrays and exome arrays. In comparison to CNVs detected using GWAS arrays, we observed high sensitivity and specificity for detecting genic CNVs ≥400 kb, including known pathogenic CNVs, and we replicated the literature finding that cases with SCZ had greater enrichment for genic CNVs. Our data confirm the association of SCZ with 16p11.2 duplications and 22q11.2 deletions and suggest a novel association with deletions at 11q12.2. Our results suggest the utility of exome-focused arrays in surveying large genic CNVs in very large samples, and thereby open the door for new opportunities such as conducting well-powered CNV assessment and comparisons between different diseases. The use of a single platform also minimizes potential confounding factors that could impact accurate detection.
schizophrenia; copy number variation; structural variation; genotyping; Illumina; exome array
mRNA synthesis, processing, and destruction involve a complex series of molecular steps that are incompletely understood. Because the RNA intermediates in each of these steps have finite lifetimes, extensive mechanistic and dynamical information is encoded in total cellular RNA. Here we report the development of SnapShot-Seq, a set of computational methods that allow the determination of in vivo rates of pre-mRNA synthesis, splicing, intron degradation, and mRNA decay from a single RNA-Seq snapshot of total cellular RNA. SnapShot-Seq can detect in vivo changes in the rates of specific steps of splicing, and it provides genome-wide estimates of pre-mRNA synthesis rates comparable to those obtained via labeling of newly synthesized RNA. We used SnapShot-Seq to investigate the origins of the intrinsic bimodality of metazoan gene expression levels, and our results suggest that this bimodality is partly due to spillover of transcriptional activation from highly expressed genes to their poorly expressed neighbors. SnapShot-Seq dramatically expands the information obtainable from a standard RNA-Seq experiment.
Major international projects are now underway aimed at creating a comprehensive catalog of all genes responsible for the initiation and progression of cancer. These studies involve sequencing of matched tumor–normal samples followed by mathematical analysis to identify those genes in which mutations occur more frequently than expected by random chance. Here, we describe a fundamental problem with cancer genome studies: as the sample size increases, the list of putatively significant genes produced by current analytical methods burgeons into the hundreds. The list includes many implausible genes (such as those encoding olfactory receptors and the muscle protein titin), suggesting extensive false positive findings that overshadow true driver events. Here, we show that this problem stems largely from mutational heterogeneity and provide a novel analytical methodology, MutSigCV, for resolving the problem. We apply MutSigCV to exome sequences from 3,083 tumor-normal pairs and discover extraordinary variation in (i) mutation frequency and spectrum within cancer types, which shed light on mutational processes and disease etiology, and (ii) mutation frequency across the genome, which is strongly correlated with DNA replication timing and also with transcriptional activity. By incorporating mutational heterogeneity into the analyses, MutSigCV is able to eliminate most of the apparent artefactual findings and allow true cancer genes to rise to attention.
A number of copy number variants (CNVs) have been suggested as susceptibility factors for schizophrenia. For some of these the data remain equivocal, and the frequency in individuals with schizophrenia is uncertain.
To determine the contribution of CNVs at 15 schizophrenia-associated loci (a) using a large new data-set of patients with schizophrenia (n = 6882) and controls (n = 6316), and (b) combining our results with those from previous studies.
We used Illumina microarrays to analyse our data. Analyses were restricted to 520 766 probes common to all arrays used in the different data-sets.
We found higher rates in participants with schizophrenia than in controls for 13 of the 15 previously implicated CNVs. Six were nominally significantly associated (P<0.05) in this new data-set: deletions at 1q21.1, NRXN1, 15q11.2 and 22q11.2 and duplications at 16p11.2 and the Angelman/Prader-Willi Syndrome (AS/PWS) region. All eight AS/PWS duplications in patients were of maternal origin. When combined with published data, 11 of the 15 loci showed highly significant evidence for association with schizophrenia (P<4.1×10^−4).
We strengthen the support for the majority of the previously implicated CNVs in schizophrenia. About 2.5% of patients with schizophrenia and 0.9% of controls carry a large, detectable CNV at one of these loci. Routine CNV screening may be clinically appropriate given the high rate of known deleterious mutations in the disorder and the comorbidity associated with these heritable mutations.
De novo mutation plays an important role in Autism Spectrum Disorders (ASDs). Notably, pathogenic copy number variants (CNVs) are characterized by high mutation rates. We hypothesize that hypermutability is a property of ASD genes, and may also include nucleotide-substitution hotspots. We investigated global patterns of germline mutation by whole genome sequencing of monozygotic twins concordant for ASD and their parents. Mutation rates varied widely throughout the genome (by 100-fold) and could be explained by intrinsic characteristics of DNA sequence and chromatin structure. Dense clusters of mutations within individual genomes were attributable to compound mutation or gene conversion. Hypermutability was a characteristic of genes involved in ASD and other diseases. In addition, genes impacted by mutations in this study were associated with ASD in independent exome-sequencing datasets. Our findings suggest that regional hypermutation is a significant factor shaping patterns of genetic variation and disease risk in humans.
Large and rare copy number variants (CNVs) at several loci have been shown to increase risk for schizophrenia. Aiming to discover novel susceptibility CNV loci, we analyzed 6882 cases and 11 255 controls genotyped on Illumina arrays, most of which have not been used for this purpose before. We identified genes enriched for rare exonic CNVs among cases, and then attempted to replicate the findings in an additional 14 568 cases and 15 274 controls. In a combined analysis of all samples, 12 distinct loci were enriched among cases with nominal levels of significance (P < 0.05); however, none would survive correction for multiple testing. These loci include recurrent deletions at 16p12.1, a locus previously associated with neurodevelopmental disorders (P = 0.0084 in the discovery sample and P = 0.023 in the replication sample). Other plausible candidates include non-recurrent deletions at the glutamate transporter gene SLC1A1, a CNV locus recently suggested to be involved in schizophrenia through linkage analysis, and duplications at 1p36.33 and CGNL1. A burden analysis of large (>500 kb), rare CNVs showed a 1.2% excess in cases after excluding known schizophrenia-associated loci, suggesting that additional susceptibility loci exist. However, even larger samples are required for their discovery.
Summary: zCall is a variant caller specifically designed for calling rare single-nucleotide polymorphisms from array-based technology. This caller is implemented as a post-processing step after a default calling algorithm has been applied. The algorithm uses the intensity profile of the common allele homozygote cluster to define the location of the other two genotype clusters. We demonstrate improved detection of rare alleles when applying zCall to samples that have both Illumina Infinium HumanExome BeadChip and exome sequencing data available.
Supplementary data are available at Bioinformatics online.
Tens of millions of base pairs of euchromatic human genome sequence, including many protein-coding genes, have no known location in the human genome. We describe an approach for localizing the human genome's missing pieces by utilizing the patterns of genome sequence variation created by population admixture. We mapped the locations of 70 scaffolds spanning four million base pairs of the human genome's unplaced euchromatic sequence, including more than a dozen protein-coding genes, and identified eight large novel inter-chromosomal segmental duplications. We find that most of these sequences are hidden in the genome's heterochromatin, particularly its pericentromeric regions. Many cryptic, pericentromeric genes are expressed in RNA and have been maintained intact for millions of years while their expression patterns diverged from those of paralogous genes elsewhere in the genome. We describe how knowledge of the locations of these sequences can inform disease association and genome biology studies.
Red blood cell, white blood cell, and platelet measures, including their count, sub-type and volume, are important diagnostic and prognostic clinical parameters for several human diseases. To identify novel loci associated with hematological traits, and compare the architecture of these phenotypes between ethnic groups, the CARe Project genotyped 49,094 single nucleotide polymorphisms (SNPs) that capture variation in ~2,100 candidate genes in DNA of 23,439 Caucasians and 7,112 African Americans from five population-based cohorts. We found strong novel associations between erythrocyte phenotypes and the glucose-6-phosphate dehydrogenase (G6PD) A-allele in African Americans (rs1050828, P < 2.0 × 10^−13, T-allele associated with lower red blood cell count, hemoglobin, and hematocrit, and higher mean corpuscular volume), and between platelet count and a SNP at the tropomyosin-4 (TPM4) locus (rs8109288, P = 3.0 × 10^−7 in Caucasians; P = 3.0 × 10^−7 in African Americans, T-allele associated with lower platelet count). We strongly replicated many genetic associations to blood cell phenotypes previously established in Caucasians. A common variant of the α-globin (HBA2-HBA1) locus was associated with red blood cell traits in African Americans, but not in Caucasians (rs1211375, P < 7 × 10^−8, A-allele associated with lower hemoglobin, mean corpuscular hemoglobin, and mean corpuscular volume). Our results show similarities but also differences in the genetic regulation of hematological traits in European- and African-derived populations, and highlight the role of natural selection in shaping these differences.
Genome sequencing studies indicate that all humans carry many genetic variants predicted to cause loss of function (LoF) of protein-coding genes, suggesting unexpected redundancy in the human genome. Here we apply stringent filters to 2,951 putative LoF variants obtained from 185 human genomes to determine their true prevalence and properties. We estimate that human genomes typically contain ~100 genuine LoF variants with ~20 genes completely inactivated. We identify rare and likely deleterious LoF alleles, including 26 known and 21 predicted severe disease-causing variants, as well as common LoF variants in non-essential genes. We describe functional and evolutionary differences between LoF-tolerant and recessive disease genes, and a method for using these differences to prioritize candidate genes found in clinical sequencing studies.
Crohn disease is a complex, multigenic, chronic inflammatory bowel disease of uncertain etiology. Recent advances in genetics, including high-throughput single-nucleotide polymorphism typing platforms and deep sequencing technologies have begun to shed light upon disease predisposition and pathogenesis. Autophagy is emerging as a key player in both innate and adaptive immunity, as well as tissue homeostasis and development in the gut. Here we describe our recent studies into the Crohn disease-associated Immunity-Related GTPase family, M (IRGM) gene and our discovery of a large risk-conferring upstream deletion. We discuss the effects of this deletion upon expression levels of IRGM alleles and how tissue-specific expression might be affected by the promoter polymorphism. In addition, we comment upon the potential roles of IRGM in autophagy of intracellular pathogens, and the challenges ahead for further elucidating IRGM function.
Crohn disease; inflammation; infection; bacteria; host-pathogen interaction; innate immunity
Genomic structural variants (SVs) are abundant in humans, differing from other variation classes in extent, origin, and functional impact. Despite progress in SV characterization, the nucleotide resolution architecture of most SVs remains unknown. We constructed a map of unbalanced SVs (i.e., copy number variants) based on whole genome DNA sequencing data from 185 human genomes, integrating evidence from complementary SV discovery approaches with extensive experimental validations. Our map encompassed 22,025 deletions and 6,000 additional SVs, including insertions and tandem duplications. Most SVs (53%) were mapped to nucleotide resolution, which facilitated analyzing their origin and functional impact. We examined numerous whole and partial gene deletions with a genotyping approach and observed a depletion of gene disruptions amongst high frequency deletions. Furthermore, we observed differences in the size spectra of SVs originating from distinct formation mechanisms, and constructed a map of SV hotspots formed by common mechanisms. Our analytical framework and SV map serve as a resource for sequencing-based association studies.
Waist-hip ratio (WHR) is a measure of body fat distribution and a predictor of metabolic consequences independent of overall adiposity. WHR is heritable, but few genetic variants influencing this trait have been identified. We conducted a meta-analysis of 32 genome-wide association studies for WHR adjusted for body mass index (comprising up to 77,167 participants), following up 16 loci in an additional 29 studies (comprising up to 113,636 subjects). We identified 13 new loci in or near RSPO3, VEGFA, TBX15-WARS2, NFE2L3, GRB14, DNM3-PIGC, ITPR2-SSPN, LY86, HOXC13, ADAMTS9, ZNRF3-KREMEN1, NISCH-STAB1 and CPEB4 (P = 1.9 × 10^−9 to P = 1.8 × 10^−40) and the known signal at LYPLAL1. Seven of these loci exhibited marked sexual dimorphism, all with a stronger effect on WHR in women than men (P for sex difference = 1.9 × 10^−3 to P = 1.2 × 10^−13). These findings provide evidence for multiple loci that modulate body fat distribution independent of overall adiposity and reveal strong gene-by-sex interactions.
Obesity is globally prevalent and highly heritable, but the underlying genetic factors remain largely elusive. To identify genetic loci for obesity-susceptibility, we examined associations between body mass index (BMI) and ~2.8 million SNPs in up to 123,865 individuals, with targeted follow-up of 42 SNPs in up to 125,931 additional individuals. We confirmed 14 known obesity-susceptibility loci and identified 18 new loci associated with BMI (P < 5 × 10^−8), one of which includes a copy number variant near GPRC5B. Some loci (MC4R, POMC, SH2B1, BDNF) map near key hypothalamic regulators of energy balance, and one is near GIPR, an incretin receptor. Furthermore, genes in other newly-associated loci may provide novel insights into human body weight regulation.
Family studies of individual tissues have shown that gene expression traits are genetically heritable. Here, we investigate cis and trans components of heritability both within and across tissues by applying variance-components methods to 722 Icelanders from family cohorts, using identity-by-descent (IBD) estimates from long-range phased genome-wide SNP data and gene expression measurements for ∼19,000 genes in blood and adipose tissue. We estimate the proportion of gene expression heritability attributable to cis regulation as 37% in blood and 24% in adipose tissue. Our results indicate that the correlation in gene expression measurements across these tissues is primarily due to heritability at cis loci, whereas there is little sharing of trans regulation across tissues. One implication of this finding is that heritability in tissues composed of heterogeneous cell types is expected to be more dominated by cis regulation than in tissues composed of more homogeneous cell types, consistent with our blood versus adipose results as well as results of previous studies in lymphoblastoid cell lines. Finally, we obtained similar estimates of the cis components of heritability using IBD between unrelated individuals, indicating that transgenerational epigenetic inheritance does not contribute substantially to the “missing heritability” of gene expression in these tissue types.
An important goal in biology is to understand how genotype affects gene expression. Because gene expression varies across tissues, the relationship between genotype and gene expression may be tissue-specific. In this study, we used heritability approaches to study the regulation of gene expression in two tissue types, blood and adipose tissue, as well as the regulation of gene expression that is shared across these tissues. Heritability can be partitioned into cis and trans effects by assessing identity-by-descent (IBD) at the genomic location close to the expressed gene or genome-wide, respectively, and applying variance-components methods to partition the heritability of each gene. We estimated the proportion of gene expression heritability explained by cis regulation as 37% in blood and 24% in adipose tissue. Notably, the heritability shared across tissue types was primarily due to cis regulation. Thus, the relative contribution of cis versus trans regulation is expected to increase with the number of cell types present in the tissue being assayed, just as observed in our study and in a comparison to previous work on lymphoblastoid cell lines (LCL). We specifically ruled out a substantial contribution of transgenerational epigenetic inheritance to heritability of gene expression in these cohorts by repeating our heritability analyses using segments shared IBD in distantly related Icelanders.