Despite their nearly identical genomes, males and females differ in risk, incidence, prevalence, severity and age-at-onset of many diseases. Sexual dimorphism is also seen in human autosomal gene expression, and has largely been explored by examining the contribution of genotype-by-sex interactions to variation in gene expression.
In this study, we use data from a mixture of pedigree and unrelated individuals with verified European ancestry to investigate the sex-specific genetic architecture of gene expression measured in whole blood across n=1048 males and n=1005 females by treating gene expression intensities in the sexes as two distinct traits and estimating the genetic correlation (rG) between them. These correlations measure the similarity of the combined additive genetic effects of all single-nucleotide polymorphisms across the autosomal chromosomes, and thus the level of common genetic control of gene expression across the sexes. Genetic correlations are estimated across the sexes for the expression levels of 12,528 autosomal gene expression probes using bivariate GREML, and tested for differences in autosomal genetic control of gene expression across the sexes. Overall, no deviation of the distribution of test statistics is observed from that expected under the null hypothesis of a common autosomal genetic architecture for gene expression across the sexes.
These results suggest that males and females share the same common genetic control of gene expression.
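For readers unfamiliar with the quantity estimated above, the genetic correlation between expression in males and females is conventionally written as below. The symbols are notation introduced here for illustration and are not taken from the abstract; the exact parameterization used in the bivariate GREML analysis may differ.

```latex
r_G = \frac{\sigma_{G_{mf}}}{\sqrt{\sigma^{2}_{G_m}\,\sigma^{2}_{G_f}}}
```

Here \sigma_{G_{mf}} denotes the additive genetic covariance between expression of a probe in males and in females, and \sigma^{2}_{G_m} and \sigma^{2}_{G_f} denote the sex-specific additive genetic variances. A value of 1 would correspond to fully shared additive genetic effects across the sexes.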
Electronic supplementary material
The online version of this article (doi:10.1186/s13059-016-1111-0) contains supplementary material, which is available to authorized users.
Gene expression; Genetic correlation; Sex-specific genetic architecture
Lifestyle modifications are first‐line measures for cardiovascular disease prevention. Whether lifestyle intervention also preserves cardiovascular health is less clear. Our study examined the role of a Health Partner–administered lifestyle intervention on metrics of ideal cardiovascular health.
Methods and Results
A total of 711 university employees (48±11 years; 66% women, 72% Caucasian/22.5% African Americans) enrolled in a program that promoted healthier lifestyles at Emory University (Atlanta, GA). Anthropometric, laboratory, and physical activity measurements were performed at baseline and at 6 months, 1 year, and 2 years of follow‐up. Results were utilized by the Health Partner to generate a personalized plan aimed at meeting ideal health metrics. Compared to baseline, at each of the 6‐month, 1‐year, and 2‐year follow‐up visits, systolic blood pressure was lower by 3.6, 4.6, and 3.3 mm Hg (P<0.001), total cholesterol decreased by 5.3, 6.5, and 6.4 mg/dL (P<0.001), body mass index declined by 0.33, 0.45, and 0.38 kg/m² (P<0.001), and the percentage of smokers decreased by 1.3%, 3.5%, and 3.5% (P<0.01), respectively. Changes were greater in those with greater abnormalities at baseline. Finally, the American Heart Association “Life's Simple 7” ideal cardiovascular health score increased by 0.28, 0.40, and 0.33 at 6 months, 1 year, and 2 years, respectively, compared to the baseline visit.
A personalized, goal‐directed Health Partner intervention significantly improved the cardiometabolic risk profile and metrics of cardiovascular health. These effects were evident at 6 months following enrollment and were sustained for 2 years. Whether the Health Partner intervention improves long‐term morbidity and mortality and is cost‐effective needs further investigation.
cardiovascular risk; health education; health partner; lifestyle; prevention; Exercise; Cardiovascular Disease
Histiocytoid cardiomyopathy (Histiocytoid CM) is a rare form of cardiomyopathy observed predominantly in newborn females that is fatal unless treated early in life. We have performed whole exome sequencing on five parent-proband trios and identified nuclear-encoded mitochondrial protein mutations in three cases. Two probands had de novo nonsense mutations in the second exon of the X-linked nuclear gene NDUFB11, which has not previously been implicated in any disease, despite evidence that deficiency for other mitochondrial electron transport complex I members leads to cardiomyopathy. A third proband was doubly heterozygous for inherited rare variants in additional components of complex I, NDUFAF2 and NDUFB9, confirming that Histiocytoid CM is genetically heterogeneous. In a fourth case, the proband with Histiocytoid CM inherited a mitochondrial mutation from her heteroplasmic mother, as did her brother who presented with cardiac arrhythmia. Strong candidate recessive or compound heterozygous variants were not found for this individual or for the fifth case. Although NDUFB11 has not been implicated before in cardiac pathology, morpholino-mediated knockdown of Ndufb11 in zebrafish embryos generated defective cardiac tissue with looping defects, which confirms the causative role of NDUFB11 in cardiac pathology. Therefore, the NDUFB11 mutation represents a genetic basis of this heterogeneous disease.
histiocytoid cardiomyopathy; NDUFB11; zebrafish; morpholinos; whole exome sequencing; de novo mutation
Bacteria colonize cystic fibrosis (CF) airways, and while T cells with appropriate antigen specificity are present in draining lymph nodes, they are conspicuously absent from the lumen. To account for this absence, we hypothesized that polymorphonuclear neutrophils (PMNs), recruited massively into the CF airway lumen and actively exocytosing primary granules, also suppress T-cell function therein. Programmed Death-Ligand 1 (PD-L1), which exerts T-cell suppression at a late step, was expressed bimodally on CF airway PMNs, delineating PD-L1hi and PD-L1lo subsets, while healthy control (HC) airway PMNs were uniformly PD-L1hi. Blood PMNs incubated in CF airway fluid lost PD-L1 over time, and in coculture, antibody blockade of PD-L1 failed to inhibit the suppression of T-cell proliferation by CF airway PMNs. In contrast with PD-L1, arginase 1 (Arg1), which exerts T-cell suppression at an early step, was uniformly high on CF and HC airway PMNs. However, arginase activity was high in CF airway fluid and minimal in HC airway fluid, consistent with the fact that Arg1 activation requires primary granule exocytosis, which occurs in CF, but not HC, airway PMNs. In addition, Arg1 expression on CF airway PMNs correlated negatively with lung function and positively with arginase activity in CF airway fluid. Finally, combined treatment with arginase inhibitor and arginine rescued the suppression of T-cell proliferation by CF airway fluid. Thus, Arg1 and PD-L1 are dynamically modulated upon PMN migration into human airways, and Arg1, but not PD-L1, contributes to early PMN-driven T-cell suppression in CF, likely hampering resolution of infection and inflammation.
Evolutionary developmental genetics has traditionally been conducted by two groups: Molecular evolutionists who emphasize divergence between species or higher taxa, and quantitative geneticists who study variation within species. Neither approach really comes to grips with the complexities of evolutionary transitions, particularly in light of the realization from genome-wide association studies that most complex traits fit an infinitesimal architecture, being influenced by thousands of loci. This paper discusses robustness, plasticity and lability, phenomena that we argue potentiate major evolutionary changes and provide a bridge between the conceptual treatments of macro- and micro-evolution. We offer cryptic genetic variation and conditional neutrality as mechanisms by which standing genetic variation can lead to developmental system drift and, sheltered within canalized processes, may facilitate developmental transitions and the evolution of novelty. Synthesis of the two dominant perspectives will require recognition that adaptation, divergence, drift and stability all depend on similar underlying quantitative genetic processes—processes that cannot be fully observed in continuously varying visible traits.
evolution; development; genetics; quantitative genetics; cryptic genetic variation
Prospective epidemiological studies found that generalized anxiety disorder (GAD) can impair immune function and increase risk for cardiovascular disease or events. Mechanisms underlying the physiological reverberations of anxiety, however, are still elusive. Hence, we aimed to investigate molecular processes mediating effects of anxiety on physical health using blood gene expression profiles of 336 community participants (157 anxious and 179 control). We examined genome-wide differential gene expression in anxiety, as well as associations between nine major modules of co-regulated transcripts in blood gene expression and anxiety. No significant differential expression was observed in women, but 631 genes were differentially expressed between anxious and control men at the false discovery rate of 0.1 after controlling for age, body mass index, race, and batch effect. Gene set enrichment analysis (GSEA) revealed that genes with altered expression levels in anxious men were involved in response of various immune cells to vaccination and to acute viral and bacterial infection, and in a metabolic network affecting traits of metabolic syndrome. Further, we found one set of 260 co-regulated genes to be significantly associated with anxiety in men after controlling for the relevant covariates, and demonstrate its equivalence to a component of the stress-related conserved transcriptional response to adversity profile. Taken together, our results suggest potential molecular pathways that can explain negative effects of GAD observed in epidemiological studies. Remarkably, even mild anxiety, which most of our participants had, was associated with observable changes in immune-related gene expression levels. Our findings generate hypotheses and provide incremental insights into molecular mechanisms mediating negative physiological effects of GAD.
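The false discovery rate threshold of 0.1 mentioned above can be illustrated with a short sketch. The abstract does not say which FDR procedure was used, so the Benjamini-Hochberg step-up rule shown here is an assumption, and the function name and example p-values are hypothetical.

```python
import numpy as np

def benjamini_hochberg(pvalues, fdr=0.1):
    """Return a boolean mask of tests declared significant at the given FDR.

    Assumption: the Benjamini-Hochberg step-up procedure; the source abstract
    only states that an FDR of 0.1 was applied.
    """
    p = np.asarray(pvalues, dtype=float)
    m = p.size
    order = np.argsort(p)
    ranked = p[order]
    # Critical value for the k-th smallest p-value is (k / m) * fdr
    thresholds = fdr * np.arange(1, m + 1) / m
    passed = ranked <= thresholds
    significant = np.zeros(m, dtype=bool)
    if passed.any():
        # Reject every hypothesis up to the largest rank that passes
        k = np.nonzero(passed)[0].max()
        significant[order[: k + 1]] = True
    return significant

# Hypothetical p-values for six genes (not data from the study)
example_p = [0.001, 0.008, 0.039, 0.041, 0.20, 0.74]
mask = benjamini_hochberg(example_p, fdr=0.1)
print(int(mask.sum()), "of", len(example_p), "genes pass")
```

In a real analysis the p-values would come from a per-gene model that includes the covariates named in the abstract (age, body mass index, race, and batch).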
DICER1 is an enzyme that generates mature microRNAs (miRNAs), which regulate gene expression post-transcriptionally in brain and other tissues, and is involved in synaptic maturation and plasticity. Here, through a genome-wide differential gene expression survey of post-traumatic stress disorder (PTSD) with comorbid depression (PTSD&Dep), we find that blood DICER1 expression is significantly reduced in cases versus controls, and replicate this in two independent cohorts. Our follow-up studies find that lower blood DICER1 expression is significantly associated with increased amygdala activation to fearful stimuli, a neural correlate for PTSD. Additionally, a genetic variant in the 3′ untranslated region of DICER1, rs10144436, is significantly associated with DICER1 expression and with PTSD&Dep, and the latter is replicated in an independent cohort. Furthermore, a genome-wide differential expression survey of miRNAs in blood in PTSD&Dep reveals miRNAs to be significantly downregulated in cases versus controls. Together, our novel data suggest DICER1 plays a role in molecular mechanisms of PTSD&Dep through the DICER1 and the miRNA regulation pathway.
DICER1 is required for the maturation of miRNAs which regulate expression of thousands of genes. Here the authors show significantly reduced levels of DICER1 in individuals having post-traumatic stress disorder and comorbid depression suggestive of a role in the molecular mechanism of the condition.
We describe the Wellness and Health Omics Linked to the Environment (WHOLE) personalized medicine profile for a 50-year-old Caucasian male living in Atlanta, Georgia. Based on the principle that genomic medicine will be most effective when presented in the context of an individual’s clinical and lifestyle data, we propose the use of a “risk radar” that summarizes health risks in eight domains. Rather than providing overwhelming lists of potentially deleterious genetic variants, we argue that profiles should be palatable, actionable, reproducible, and teachable: the PART principle. Genetic risk scores for this individual are strikingly concordant for his height, body mass index (BMI), waist-hip ratio (WHR), and cholesterol, and blood transcriptome data agree with and complement his complete blood counts. Despite enjoying currently good health, his risk radar highlights metabolic disease as his major health concern.
personalized medicine; genetic risk score; transcriptome profile; wellness
Personalized medicine is predicated on the notion that individual biochemical and genomic profiles are relatively constant in times of good health and to some extent predictive of disease or therapeutic response. We report a pilot study quantifying gene expression and methylation profile consistency over time, addressing the reasons for individual uniqueness, and its relation to N = 1 phenotypes.
Whole blood samples from four African American women, four Caucasian women, and four Caucasian men drawn from the Atlanta Center for Health Discovery and Well Being study at three successive 6-month intervals were profiled by RNA-Seq, miRNA-Seq, and Illumina Methylation 450 K arrays. Standard regression approaches were used to evaluate the proportion of variance for each type of omic measure among individuals, and to quantify correlations among measures and with clinical attributes related to wellness.
Longitudinal omic profiles were in general highly consistent over time, with an average of 67 % variance in transcript abundance, 42 % in CpG methylation level (but 88 % for the most differentiated CpG per gene), and 50 % in miRNA abundance among individuals, which are all comparable to 74 % variance among individuals for 74 clinical traits. One third of the variance could be attributed to differential blood cell type abundance, which was also fairly stable over time, and a lesser amount to expression quantitative trait loci (eQTL) effects. Seven conserved axes of covariance that capture diverse aspects of immune function explained over half of the variance. These axes also explained a considerable proportion of individually extreme transcript abundance, namely approximately 100 genes that were significantly up-regulated or down-regulated in each person and were in some cases enriched for relevant gene activities that plausibly associate with clinical attributes. A similar fraction of genes had individually divergent methylation levels, but these did not overlap with the transcripts, and fewer than 20 % of genes had significantly correlated methylation and gene expression.
People express an “omic personality” consisting of peripheral blood transcriptional and epigenetic profiles that are constant over the course of a year and reflect various types of immune activity. Baseline genomic profiles can provide a window into the molecular basis of traits that might be useful for explaining medical conditions or guiding personalized health decisions.
Electronic supplementary material
The online version of this article (doi:10.1186/s13073-015-0209-4) contains supplementary material, which is available to authorized users.
Continued exposure to malaria-causing parasites in endemic regions of malaria induces significant levels of acquired immunity in adult individuals. A better understanding of the transcriptional basis for this acquired immunological response may provide insight into how the immune system can be boosted during vaccination, and into why infected individuals differ in symptomology.
Peripheral blood gene expression profiles of 9 semi-immune volunteers from a Plasmodium vivax malaria prevalent region (Buenaventura, Colombia) were compared to those of 7 naïve individuals from a region with no reported transmission of malaria (Cali, Colombia) after a controlled infection mosquito bite challenge with P. vivax. A Fluidigm nanoscale quantitative RT-PCR array was used to survey altered expression of 96 blood informative transcripts at 7 timepoints after controlled infection, and RNASeq was used to contrast pre-infection and early parasitemia timepoints. There was no evidence for transcriptional changes prior to the appearance of blood stage parasites at day 12 or 13, at which time there was a strong interferon response and, unexpectedly, down-regulation of transcripts related to inflammation and innate immunity. This differential expression was confirmed with RNASeq, which also suggested perturbations of aspects of T cell function and erythropoiesis. Despite differences in clinical symptoms between the semi-immune and malaria naïve individuals, only subtle differences in their transcriptomes were observed, although 175 genes showed significantly greater induction or repression in the naïve volunteers from Cali.
Gene expression profiling of whole blood reveals the type and duration of the immune response to P. vivax infection, and highlights a subset of genes that may mediate adaptive immunity.
Plasmodium vivax malaria is a debilitating, occasionally life-threatening, and economically burdensome disease in Central Latin America, where 70%-80% of the population lives with the risk of infection. We performed a gene expression profiling experiment taking advantage of a previously described sporozoite challenge experiment in Cali, Colombia that reported more severe malaria symptoms in subjects who have never experienced malaria. We show that no major differences are seen in the transcriptomes of uninfected naïve and semi-immune volunteers prior to infection, but differential expression of both neutrophil and interferon-related genes was evident at onset of malaria. Several hundred genes showed a stronger response in the naïve individuals just as parasites appear in the peripheral blood, and these fall into several pathways of interest. These findings show how information from gene expression profiling of whole blood can reveal the type and duration of the immune response to P. vivax infection, and highlight a subset of genes that may mediate adaptive immunity in chronically exposed individuals.
Expression quantitative trait locus analysis has emerged as an important component of efforts to understand how genetic polymorphisms influence disease risk and is poised to make contributions to translational medicine. Here we review how expression quantitative trait locus analysis is aiding the identification of which gene(s) within regions of association are causal for a disease or phenotypic trait; the narrowing down of the cell types or regulators involved in the etiology of disease; the characterization of drivers and modifiers of cancer; and our understanding of how different environments and cellular contexts can modify gene expression. We also introduce the concept of transcriptional risk scores as a means of refining estimates of individual liability to disease based on targeted profiling of the transcripts that are regulated by polymorphisms jointly associated with disease and gene expression.
Genome-wide association studies have greatly improved our understanding of the genetic basis of disease risk. The fact that they tend not to identify more than a fraction of the specific causal loci has led to divergence of opinion over whether most of the variance is hidden as numerous rare variants of large effect, or common variants of very small effect. Here I review 20 arguments for and against each of these models of the genetic basis of complex traits, and conclude that both classes of effect can be reconciled readily.
Gene expression variation provides a read-out of both genetic and environmental influences on gene activity. Geographical, genomic and sociogenomic studies have highlighted how life circumstances of an individual modify the expression of hundreds and in some cases thousands of genes in a co-ordinated manner. This review places such results in the context of a conserved set of 90 transcripts known as Blood Informative Transcripts (BIT) that capture the major conserved components of variation in the peripheral blood transcriptome. Pathophysiological states are also shown to associate with the perturbation of transcript abundance along the major axes. Discussion of false negative rates leads us to argue that simple significance thresholds provide a biased perspective on assessment of differential expression that may cloud the interpretation of studies with small sample sizes.
Axes of variation; sociogenomics; geographical genomics; eQTL; differential expression
The Center for Health Discovery and Wellbeing (CHDWB) is an academic program designed to evaluate the efficacy of clinical self-knowledge and health partner counseling for development and maintenance of healthy behaviors. This paper reports on the change in health profiles for over 90 traits, measured in 382 participants over three visits in the 12 months following enrolment. Significant changes in the desired direction of improved health are observed for many traits related to cardiovascular health, including BMI, blood pressure, cholesterol, and arterial stiffness, as well as for summary measures of physical and mental health. The changes are most notable for individuals in the upper quartile of baseline risk, many of whom showed a positive correlated response across clinical categories. By contrast, individuals who start with more healthy profiles do not generally show significant improvements and only a modest impact of targeting specific health attributes was observed. Overall, the CHDWB model shows promise as an effective intervention particularly for individuals at high risk for cardiovascular disease.
personalized medicine; health partner; chronic disease risk; lifestyle intervention
Epistasis is the phenomenon whereby one polymorphism’s effect on a trait depends on other polymorphisms present in the genome. The extent to which epistasis influences complex traits [1] and contributes to their variation [2,3] is a fundamental question in evolution and human genetics. Though often demonstrated in artificial gene manipulation studies in model organisms [4,5], and some examples have been reported in other species [6], few examples exist for epistasis amongst natural polymorphisms in human traits [7,8]. Its absence from empirical findings may simply be due to low incidence in the genetic control of complex traits [2,3], but an alternative view is that it has previously been too technically challenging to detect due to statistical and computational issues [9]. Here we show that, using advanced computation [10] and a gene expression study design, many instances of epistasis are found between common single nucleotide polymorphisms (SNPs). In a cohort of 846 individuals with 7339 gene expression levels measured in peripheral blood, we found 501 significant pairwise interactions between common SNPs influencing the expression of 238 genes (p < 2.91 × 10^−16). Replication of these interactions in two independent data sets [11,12] showed both concordance of direction of epistatic effects (p = 5.56 × 10^−31) and enrichment of interaction p-values, with 30 being significant at a conservative threshold of p < 0.05/501. Forty-four of the genetic interactions are located within 2 Mb of regions of known physical chromosome interactions [13] (p = 1.8 × 10^−10). Epistatic networks of three SNPs or more influence the expression levels of 129 genes, whereby one cis-acting SNP is modulated by several trans-acting SNPs. For example, MBNL1 is influenced by an additive effect at rs13069559 which itself is masked by trans-SNPs on 14 different chromosomes, with nearly identical genotype-phenotype (GP) maps for each cis-trans interaction.
This study presents the first evidence for multiple instances of segregating common polymorphisms interacting to influence human traits.
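The conservative replication threshold quoted above (0.05 divided by the 501 discovery interactions) is a Bonferroni correction, and the arithmetic can be made explicit. The helper function and the p-values passed to it are hypothetical illustrations, not results from the study.

```python
# Bonferroni-corrected threshold for replicating the discovery interactions
n_tests = 501   # significant pairwise interactions reported in the abstract
alpha = 0.05    # family-wise error rate used in the abstract
bonferroni_threshold = alpha / n_tests
print(f"per-test threshold: {bonferroni_threshold:.2e}")

def count_replicated(pvalues, threshold=bonferroni_threshold):
    """Count replication p-values that fall below the corrected threshold."""
    return sum(p < threshold for p in pvalues)

# Hypothetical replication p-values for three interactions
print(count_replicated([1e-6, 5e-5, 2e-4]))
```

The threshold works out to roughly 1 in 10,000, which is why the abstract describes it as conservative.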
Craniosynostosis, the premature fusion of one or more skull sutures, occurs in approximately 1 in 2500 infants, with the majority of cases non-syndromic and of unknown etiology. Two common reasons proposed for premature suture fusion are abnormal compression forces on the skull and rare genetic abnormalities. Our goal was to evaluate whether different sub-classes of disease can be identified based on total gene expression profiles. RNA-Seq data were obtained from 31 human osteoblast cultures derived from bone biopsy samples collected between 2009 and 2011, representing 23 craniosynostosis fusions and 8 normal cranial bones or long bones. No differentiation between regions of the skull was detected, but variance component analysis of gene expression patterns nevertheless supports transcriptome-based classification of craniosynostosis. Cluster analysis showed 4 distinct groups of samples; 1 predominantly normal and 3 craniosynostosis subtypes. Similar constellations of sub-types were also observed upon re-analysis of a similar dataset of 199 calvarial osteoblast cultures. Annotation of gene function of differentially expressed transcripts strongly implicates physiological differences with respect to cell cycle and cell death, stromal cell differentiation, extracellular matrix (ECM) components, and ribosomal activity. Based on these results, we propose non-syndromic craniosynostosis cases can be classified by differences in their gene expression patterns and that these may provide targets for future clinical intervention.
Non-syndromic craniosynostosis; RNA-Seq; Transcriptome profile; Personalized medicine; Biomarkers.
The switch to a modern lifestyle in recent decades has coincided with a rapid increase in prevalence of obesity and other diseases. These shifts in prevalence could be explained by the release of genetic susceptibility for disease in the form of gene-by-environment (GxE) interactions. Yet, the detection of interaction effects requires large sample sizes, little replication has been reported, and a few studies have demonstrated environmental effects only after summing the risk of GWAS alleles into genetic risk scores (GRSxE). We performed extensive simulations of a quantitative trait controlled by 2500 causal variants to inspect the feasibility of detecting gene-by-environment interactions in the context of GWAS. The simulated individuals were assigned either to an ancestral or a modern setting that alters the phenotype by increasing the effect size by 1.05–2-fold at a varying fraction of perturbed SNPs (from 1 to 20%). We report two main results. First, for a wide range of realistic scenarios, highly significant GRSxE is detected despite the absence of individual genotype GxE evidence at the contributing loci. Second, an increase in phenotypic variance after environmental perturbation reduces the power to discover susceptibility variants by GWAS in mixed cohorts with individuals from both ancestral and modern environments. We conclude that a pervasive presence of gene-by-environment effects can remain hidden even though it contributes to the genetic architecture of complex traits.
gene-by-environment; environmental perturbation; modern lifestyle; complex disease; genetic risk score; decanalization; GWAS; obesity
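The simulation design summarized in the abstract above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' simulation code: allele frequencies, effect-size distribution, heritability, and the use of true (rather than GWAS-estimated) effects to build the risk score are all choices made here, and the default of 500 SNPs is smaller than the 2500 causal variants in the abstract to keep the example fast.

```python
import numpy as np

def simulate_grs_by_env(n_individuals=4000, n_snps=500, frac_perturbed=0.2,
                        fold_increase=2.0, heritability=0.5, seed=0):
    """Simulate a trait with amplified SNP effects in a 'modern' environment
    and fit phenotype ~ GRS + E + GRS x E by ordinary least squares."""
    rng = np.random.default_rng(seed)
    # Genotypes coded 0/1/2; allele frequencies are an assumption
    freqs = rng.uniform(0.1, 0.9, size=n_snps)
    genotypes = rng.binomial(2, freqs, size=(n_individuals, n_snps)).astype(float)
    # Baseline ("ancestral") additive effects, drawn from a normal (assumption)
    beta = rng.normal(0.0, 1.0, size=n_snps)
    # A fraction of SNPs has its effect multiplied in the modern setting
    perturbed = rng.random(n_snps) < frac_perturbed
    beta_modern = np.where(perturbed, beta * fold_increase, beta)
    env = rng.integers(0, 2, size=n_individuals)  # 0 = ancestral, 1 = modern
    ancestral_value = genotypes @ beta
    genetic = np.where(env == 1, genotypes @ beta_modern, ancestral_value)
    # Residual noise sized so the ancestral heritability equals `heritability`
    noise_sd = np.sqrt(np.var(ancestral_value) * (1 - heritability) / heritability)
    phenotype = genetic + rng.normal(0.0, noise_sd, size=n_individuals)
    # Standardized genetic risk score built from the ancestral effects
    grs = (ancestral_value - ancestral_value.mean()) / ancestral_value.std()
    design = np.column_stack([np.ones(n_individuals), grs, env, grs * env])
    coef, *_ = np.linalg.lstsq(design, phenotype, rcond=None)
    return dict(zip(["intercept", "grs", "env", "grs_x_env"], coef))

result = simulate_grs_by_env()
print({name: round(float(value), 3) for name, value in result.items()})
```

The coefficient of interest is `grs_x_env`: the perturbation is spread thinly over many loci, yet it accumulates in the score-level interaction term, which is the intuition behind the first result reported in the abstract.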
Single-cell analysis has the potential to provide us with a host of new knowledge about biological systems, but it comes with the challenge of correctly interpreting the biological information. While emerging techniques have made it possible to measure inter-cellular variability at the transcriptome level, no consensus yet exists on the most appropriate method of data analysis of such single cell data. Methods for analysis of transcriptional data at the population level are well established but are not well suited to single cell analysis due to their dependence on population averages. In order to address this question, we have systematically tested combinations of methods for primary data analysis on single cell transcription data generated from two types of primary immune cells, neutrophils and T lymphocytes. Cells were obtained from healthy individuals, and single cell transcript expression data were obtained by a combination of single cell sorting and nanoscale quantitative real time PCR (qRT-PCR) for markers of cell type, intracellular signaling, and immune functionality. Gene expression analysis was focused on hierarchical clustering to determine the existence of cellular subgroups within the populations. Nine combinations of criteria for data exclusion and normalization were tested and evaluated. Bimodality in gene expression indicated the presence of cellular subgroups which were also revealed by data clustering. We observed evidence for two clearly defined cellular subtypes in the neutrophil populations and at least two in the T lymphocyte populations. When normalizing the data by different methods, we observed varying outcomes with corresponding interpretations of the biological characteristics of the cell populations. Normalization of the data by linear standardization, taking into account technical effects such as plate effects, resulted in interpretations that most closely matched biological expectations.
Single cell transcription profiling provides evidence of cellular subclasses in neutrophils and leukocytes that may be independent of traditional classifications based on cell surface markers. The choice of primary data analysis method had a substantial effect on the interpretation of the data. Adjustment for technical effects is critical to prevent misinterpretation of single cell transcript data.
Single cell analysis; Data processing; Fluidigm; Gene expression
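One way to picture the plate-aware standardization described in the abstract above is to center and scale each gene within each plate before clustering. The abstract does not give the exact procedure, so the sketch below is an assumption; the function name, the cells-by-genes layout, and the example values are hypothetical.

```python
import numpy as np

def standardize_within_plate(expression, plate_ids):
    """Center and scale each gene separately within each plate.

    expression: array of shape (cells, genes), e.g. qRT-PCR expression values
    plate_ids:  one plate label per cell
    """
    expression = np.asarray(expression, dtype=float)
    plate_ids = np.asarray(plate_ids)
    out = np.empty_like(expression)
    for plate in np.unique(plate_ids):
        rows = plate_ids == plate
        block = expression[rows]
        mean = block.mean(axis=0)
        sd = block.std(axis=0, ddof=1)
        sd[sd == 0] = 1.0  # genes constant on a plate are only centered
        out[rows] = (block - mean) / sd
    return out

# Hypothetical values: plate B reads systematically higher than plate A
values = np.array([[20.0, 25.0],
                   [22.0, 27.0],
                   [30.0, 35.0],
                   [34.0, 33.0]])
plates = np.array(["A", "A", "B", "B"])
standardized = standardize_within_plate(values, plates)
print(standardized)
```

After this step a systematic offset between plates no longer separates cells in a clustering, which is the kind of technical artifact the abstract warns can be mistaken for biological subgroups.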
Genetic risk scores have been developed for coronary artery disease and atherosclerosis, but are not predictive of adverse cardiovascular events. We asked whether peripheral blood expression profiles may be predictive of acute myocardial infarction (AMI) and/or cardiovascular death.
Peripheral blood samples from 338 subjects aged 62 ± 11 years with coronary artery disease (CAD) were analyzed in two phases (discovery N = 175, and replication N = 163), and followed for a mean 2.4 years for cardiovascular death. Gene expression was measured on Illumina HT-12 microarrays with two different normalization procedures to control technical and biological covariates. Whole genome genotyping was used to support comparative genome-wide association studies of gene expression. Analysis of variance was combined with receiver operating curve and survival analysis to define a transcriptional signature of cardiovascular death.
In both phases, there was significant differential expression between healthy and AMI groups with overall down-regulation of genes involved in T-lymphocyte signaling and up-regulation of inflammatory genes. Expression quantitative trait loci analysis provided evidence for altered local genetic regulation of transcript abundance in AMI samples. On follow-up there were 31 cardiovascular deaths. A principal component (PC1) score capturing covariance of 238 genes that were differentially expressed between deceased and survivors in the discovery phase significantly predicted risk of cardiovascular death in the replication and combined samples (hazard ratio = 8.5, P < 0.0001) and improved the C-statistic (area under the curve 0.82 to 0.91, P = 0.03) after adjustment for traditional covariates.
A specific blood gene expression profile is associated with a significant risk of death in Caucasian subjects with CAD. This comprises a subset of transcripts that are also altered in expression during acute myocardial infarction.
We have developed a novel structure-based evaluation for missense variants that explicitly models protein structure and amino acid properties to predict the likelihood that a variant disrupts protein function. A structural disruption score (SDS) is introduced as a measure to depict the likelihood that a case variant is functional. The score is constructed using characteristics that distinguish between causal and neutral variants within a group of proteins. The SDS is correlated with standard sequence-based deleteriousness, but shows promise for improving discrimination between neutral and causal variants at less conserved sites. The prediction was performed on 3-dimensional structures of 57 gene products whose homozygous SNPs were identified as case-exclusive variants in an exome sequencing study of epilepsy disorders. We contrasted the candidate epilepsy variants with scores for likely benign variants found in the EVS database, and for positive control variants in the same genes that are suspected to promote a range of diseases. To derive a characteristic profile of damaging SNPs, we transformed continuous scores into categorical variables based on the score distribution of each measurement, collected from all possible SNPs in this protein set, where extreme measures were assumed to be deleterious. A second epilepsy dataset was used to replicate the findings. Causal variants tend to receive higher sequence-based deleterious scores, induce larger physico-chemical changes between amino acid pairs, locate in protein domains, buried sites or on conserved protein surface clusters, and cause protein destabilization, relative to negative controls. These measures were agglomerated for each variant. A list of nine high-priority putative functional variants for epilepsy was generated. Our newly developed SDS protocol facilitates SNP prioritization for experimental validation.
non-synonymous single nucleotide polymorphism; missense mutation; protein structural analysis; structural disruption score; variant prioritization; epilepsy disorders
Systems biology is an approach to dissection of complex traits that explicitly recognizes the impact of genetic, physiological, and environmental interactions in the generation of phenotypic variation. We describe comprehensive transcriptional and metabolic profiling in Drosophila melanogaster across four diets, finding little overlap in modular architecture. Genotype and genotype-by-diet interactions are a major component of transcriptional variation (24 and 5.3% of the total variation, respectively) while there were no main effects of diet (<1%). Genotype was also a major contributor to metabolomic variation (16%), but in contrast to the transcriptome, diet had a large effect (9%) and the interaction effect was minor (2%) for the metabolome. Yet specific principal components of these molecular phenotypes measured in larvae are strongly correlated with particular metabolic syndrome-like phenotypes such as pupal weight, larval sugar content and triglyceride content, development time, and cardiac arrhythmia in adults. The second principal component of the metabolomic profile is especially informative across these traits with glycine identified as a key loading variable. To further relate this physiological variability to genotypic polymorphism, we performed evolve-and-resequence experiments, finding rapid and replicated changes in gene frequency across hundreds of loci that are specific to each diet. Adaptation to diet is thus highly polygenic. However, loci differentially transcribed across diet or previously identified by RNAi knockdown or expression QTL analysis were not the loci responding to dietary selection. Therefore, loci that respond to the selective pressures of diet cannot be readily predicted a priori from functional analyses.
metabolic syndrome; metabolomics; evolve-and-resequence; genotype-by-environment; adaptation