In the simplest definition, cis-eQTLs, as opposed to trans-eQTLs, are genetic variants that affect expression in an allele-specific manner, with implications for the underlying mechanism. Yet, owing to technical limitations of expression microarrays, the vast majority of eQTL studies performed in the last decade used a definition based on genomic distance as a surrogate for cis, and therefore explored local rather than cis-eQTLs.
In this study we use RNA-seq to explore allele-specific expression (ASE) in adipose tissue of male and female F1 mice produced from reciprocal crosses of the C57BL/6J and DBA/2J strains. Comparison of the identified cis-eQTLs with local-eQTLs obtained from adipose tissue expression in two previous population-based studies in our laboratory yields poor overlap between the two mapping approaches, whereas the two local-eQTL studies show highly concordant results. Specifically, the local-eQTL studies show ~60% overlap with each other, while only 15-20% of local-eQTLs are identified as cis by ASE, and less than 50% of ASE genes are recovered in local-eQTL studies. Utilizing recently published ENCODE data, we also find that ASE genes show a significant bias in SNP prevalence in DNase I hypersensitive sites that is specific to the direction of ASE.
We suggest a new approach to the analysis of allele-specific expression that is more sensitive and accurate than the commonly used Fisher or chi-square statistics. Our analysis indicates that technical differences between the cis and local-eQTL approaches, such as differences in genomic background or sex specificity, account for a relatively small fraction of the discrepancy. Therefore, we suggest that the differences between the two eQTL mapping approaches may facilitate sorting of SNP-eQTL interactions into true cis and trans, and that a considerable portion of local-eQTLs may actually represent trans interactions.
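As a point of reference for the count-based statistics mentioned above, the sketch below runs a two-sided exact binomial test on the allele-specific read counts of a single gene in an F1 animal, under the null hypothesis that both parental alleles are expressed equally. It is a generic baseline and not the new approach proposed in this study; the function name and the read counts are hypothetical.

```python
from math import comb


def binomial_ase_pvalue(ref_reads: int, alt_reads: int, p_null: float = 0.5) -> float:
    """Two-sided exact binomial p-value for allelic imbalance (illustrative only)."""
    n = ref_reads + alt_reads
    if n == 0:
        return 1.0
    # Probability of every possible reference-allele read count under the null.
    pmf = [comb(n, k) * p_null**k * (1 - p_null) ** (n - k) for k in range(n + 1)]
    observed = pmf[ref_reads]
    # Sum the probability of all outcomes at least as unlikely as the observed one.
    tail = sum(p for p in pmf if p <= observed * (1 + 1e-9))
    return min(1.0, tail)


# Hypothetical read counts for two genes (reference allele, alternative allele).
balanced = binomial_ase_pvalue(50, 50)
skewed = binomial_ase_pvalue(80, 20)
```

In practice such a test would be applied per gene and followed by multiple-testing correction; those steps are omitted here.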
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2164-15-471) contains supplementary material, which is available to authorized users.
Cis; Trans; eQTL; Allele Specific Expression; Adipose; RNA-seq; DNase I hypersensitivity; DBA/2J; C57BL/6J
Obesity is a highly heritable disease driven by complex interactions between genetic and environmental factors. Human genome-wide association studies (GWAS) have identified a number of loci contributing to obesity; however, a major limitation of these studies is the inability to assess environmental interactions common to obesity. Using a systems genetics approach, we measured obesity traits, global gene expression, and gut microbiota composition in response to a high-fat/high-sucrose (HF/HS) diet in more than 100 inbred strains of mice. Here we show that HF/HS feeding promotes robust, strain-specific changes in obesity that are not accounted for by food intake, and we provide evidence for a genetically determined set-point for obesity. GWAS analysis identified 11 genome-wide significant loci associated with obesity traits, several of which overlap with loci identified in human studies. We also show strong relationships between genotype and gut microbiota plasticity during HF/HS feeding and identify gut microbial phylotypes associated with obesity.
The Systems Genetics Resource (SGR) (http://systems.genetics.ucla.edu) is a new open-access web application and database that contains genotypes and clinical and intermediate phenotypes from both human and mouse studies. The mouse data include studies using crosses between specific inbred strains and studies using the Hybrid Mouse Diversity Panel. SGR is designed to assist researchers studying genes and pathways contributing to complex disease traits, including obesity, diabetes, atherosclerosis, heart failure, osteoporosis, and lipoprotein metabolism. Over the next few years, we hope to add data relevant to deafness, addiction, hepatic steatosis, toxin responses, and vascular injury. The intermediate phenotypes include expression array data for a variety of tissues and cultured cells, metabolite levels, and protein levels. Pre-computed tables of genetic loci controlling intermediate and clinical phenotypes, as well as phenotype correlations, are accessed via a user-friendly web interface. The web site includes detailed protocols for all of the studies. Data from published studies are freely available; unpublished studies have restricted access during their embargo period.
database; genomics; systems biology; data integration; web services; data analysis
We have developed an association-based approach using classical inbred strains of mice in which we correct for population structure, which is very extensive in mice, using an efficient mixed-model algorithm. Our approach includes inbred parental strains as well as recombinant inbred strains in order to capture loci with effect sizes typical of complex traits in mice (in the range of 5 % of total trait variance). Over the last few years, we have typed the hybrid mouse diversity panel (HMDP) strains for a variety of clinical traits as well as intermediate phenotypes and have shown that the HMDP has sufficient power to map genes for highly complex traits with resolution that is in most cases less than a megabase. In this essay, we review our experience with the HMDP, describe various ongoing projects, and discuss how the HMDP may fit into the larger picture of common diseases and different approaches.
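A mixed model of the kind referred to above is commonly written as follows. The notation is an assumption made for illustration and is not taken from the text: $\mathbf{y}$ is the vector of trait values across strains, $\mathbf{x}$ the genotype at the tested SNP with effect $\beta$, $\mathbf{K}$ a kinship matrix estimated from genome-wide genotypes, and $\sigma_g^2$ and $\sigma_e^2$ the genetic and residual variance components.

```latex
\[
  \mathbf{y} = \mu\mathbf{1} + \mathbf{x}\beta + \mathbf{u} + \mathbf{e},
  \qquad \mathbf{u} \sim \mathcal{N}\!\left(\mathbf{0},\, \sigma_g^2 \mathbf{K}\right),
  \qquad \mathbf{e} \sim \mathcal{N}\!\left(\mathbf{0},\, \sigma_e^2 \mathbf{I}\right)
\]
```

The random effect $\mathbf{u}$ absorbs trait covariance that follows strain relatedness, so that the test of $\beta = 0$ is not inflated by population structure.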
Mutations of the orphan transporter ABCC6 (ATP-binding cassette, subfamily C, member 6) cause the connective tissue disorder pseudoxanthoma elasticum. ABCC6 was thought to be located on the plasma membrane of liver and kidney cells.
Mouse systems genetics and bioinformatics suggested that ABCC6 deficiency affects mitochondrial gene expression. We therefore tested whether ABCC6 associates with mitochondria.
Methods and Results
We found ABCC6 in crude mitochondrial fractions and subsequently pinpointed its localization to the purified mitochondria-associated membrane fraction. Cell-surface biotinylation in hepatocytes confirmed that ABCC6 is intracellular. Abcc6-knockout mice demonstrated mitochondrial abnormalities and decreased respiration reserve capacity.
Our finding that ABCC6 localizes to the mitochondria-associated membrane has implications for its mechanism of action in normal and diseased states.
PXE; vascular calcification; ABCC6/MRP6; MAM; mitochondria; cardiovascular disease
Medical students are expected to master the ability to interpret histopathologic images, a difficult and time-consuming process. A major problem is the issue of transferring information learned from one example of a particular pathology to a new example. Recent advances in cognitive science have identified new approaches to address this problem.
We adapted a new approach for enhancing pattern recognition of basic pathologic processes in skin histopathology images. The approach combines perceptual learning techniques, which allow learners to see relevant structure in novel cases, with adaptive learning algorithms that space and sequence the different categories (e.g. diagnoses) appearing during a learning session based on each learner's accuracy and response time (RT). We developed a perceptual and adaptive learning module (PALM) that utilized 261 unique images of cell injury, inflammation, neoplasia, or normal histology at low and high magnification. Accuracy and RT were tracked and integrated into a “Score” that reflected students' rapid recognition of the pathologies, and pre- and post-tests were given to assess the effectiveness of the module.
Accuracy, RT and Scores significantly improved from the pre- to post-test, with Scores showing much greater improvement than accuracy alone. Delayed post-tests with previously unseen cases, given after 6-7 weeks, showed a decline in accuracy relative to the post-test for 1st-year students, but not significantly so for 2nd-year students. However, the delayed post-test scores maintained a significant and large improvement relative to those of the pre-test for both 1st- and 2nd-year students, suggesting good retention of pattern recognition. Student evaluations were very favorable.
A web-based learning module based on the principles of cognitive science showed evidence of improved recognition of histopathology patterns by medical students.
Cognitive science; dermatology; medical education; pathology; perceptual learning
ABCC6 genetic deficiency underlies Pseudoxanthoma elasticum (PXE) in humans, characterized by ectopic calcification, and early cardiac disease. The spectrum of PXE has been noted in Abcc6 deficient mice, including dystrophic cardiac calcification. We tested the role of Abcc6 in response to cardiac ischemia-reperfusion (I/R) injury.
Methods and Results
To determine the role of Abcc6 in cardio-protection, we induced ischemic injury in mice in vivo by occluding the left anterior descending artery (30 min) followed by reperfusion (48 h). Infarct size was increased in Abcc6 deficient mice compared to wild type controls. Additionally, an Abcc6 transgene significantly reduced infarct size on the background of a naturally occurring Abcc6 deficiency. There were no differences in cardiac calcification following I/R, but increased cardiac apoptosis was noted in Abcc6 deficient mice. Previous studies have implicated the BMP signaling pathway in directing calcification, and here we show that the BMP responsive transcription factors pSmad1/5/8 were increased in hearts of Abcc6 deficient mice. Consistent with this finding, BMP4 and BMP9 were increased, and ALK2 and Endoglin were down-regulated, in cardiac extracts from Abcc6 deficient mice versus controls.
These data identify Abcc6 as a novel modulator of cardiac myocyte survival after I/R. This cardio-protective mechanism may involve inhibition of the BMP signaling pathway, which modulates apoptosis.
ABCC6; Pseudoxanthoma elasticum; BMP signaling; apoptosis; cardiac ischemia-reperfusion (I/R)
The control of malaria in schools is receiving increasing attention, but there is currently no consensus as to the optimal intervention strategy. This paper analyses the costs of intermittent screening and treatment (IST) of malaria in schools, implemented as part of a cluster-randomized controlled trial on the Kenyan coast.
Financial and economic costs were estimated using an ingredients approach whereby all resources required in the delivery of IST are quantified and valued. Sensitivity analysis was conducted to investigate how programme variation affects costs and to identify potential cost savings in the future implementation of IST.
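The ingredients approach and a one-way sensitivity analysis can be sketched as follows. Every resource, quantity and unit cost in this example is hypothetical and chosen only to make the arithmetic easy to follow; none of the figures come from the trial.

```python
def total_cost(ingredients):
    """Sum quantity x unit cost over every resource used to deliver the intervention."""
    return sum(quantity * unit_cost for quantity, unit_cost in ingredients.values())


# Hypothetical ingredients: resource -> (quantity, unit cost in US$).
ingredients = {
    "health worker time (days)": (100, 20.0),
    "rapid diagnostic tests": (5000, 1.0),
    "vehicle use (days)": (50, 30.0),
}
children_screened = 5000  # hypothetical

base_cost_per_child = total_cost(ingredients) / children_screened

# One-way sensitivity analysis: swap in a cheaper (hypothetical) diagnostic test.
alternative = dict(ingredients)
alternative["rapid diagnostic tests"] = (5000, 0.5)
alt_cost_per_child = total_cost(alternative) / children_screened

saving_fraction = 1 - alt_cost_per_child / base_cost_per_child
```

Varying one ingredient at a time in this way shows which inputs the cost per child is most sensitive to.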
The estimated financial cost of IST per child screened is US$ 6.61 (economic cost US$ 6.24). Key contributors to cost were salary costs (36%) and malaria rapid diagnostic tests (RDT) (22%). Almost half (47%) of the intervention cost comprises redeployment of existing resources including health worker time and use of hospital vehicles. Sensitivity analysis identified changes to intervention delivery that can reduce programme costs by 40%, including use of alternative RDTs and removal of supervised treatment. Cost-effectiveness is also likely to be highly sensitive to the proportion of children found to be RDT-positive.
In the current context, school-based IST is a relatively expensive malaria intervention, but reducing the complexity of delivery can result in considerable savings in the cost of intervention.
(Costs are reported in US$ 2010).
Significant advances have been made in the discovery of genes affecting bone mineral density (BMD); however, our understanding of its genetic basis remains incomplete. In the current study, genome-wide association (GWA) and co-expression network analysis were used in the recently described Hybrid Mouse Diversity Panel (HMDP) to identify and functionally characterize novel BMD genes. In the HMDP, a GWA of total body, spinal, and femoral BMD revealed four significant associations (−log10P>5.39) affecting at least one BMD trait on chromosomes (Chrs.) 7, 11, 12, and 17. The associations implicated a total of 163 genes with each association harboring between 14 and 112 genes. This list was reduced to 26 functional candidates by identifying those genes that were regulated by local eQTL in bone or harbored potentially functional non-synonymous (NS) SNPs. This analysis revealed that the most significant BMD SNP on Chr. 12 was a NS SNP in the additional sex combs like-2 (Asxl2) gene that was predicted to be functional. The involvement of Asxl2 in the regulation of bone mass was confirmed by the observation that Asxl2 knockout mice had reduced BMD. To begin to unravel the mechanism through which Asxl2 influenced BMD, a gene co-expression network was created using cortical bone gene expression microarray data from the HMDP strains. Asxl2 was identified as a member of a co-expression module enriched for genes involved in the differentiation of myeloid cells. In bone, osteoclasts are bone-resorbing cells of myeloid origin, suggesting that Asxl2 may play a role in osteoclast differentiation. In agreement, the knockdown of Asxl2 in bone marrow macrophages impaired their ability to form osteoclasts. This study identifies a new regulator of BMD and osteoclastogenesis and highlights the power of GWA and systems genetics in the mouse for dissecting complex genetic traits.
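For orientation, the significance threshold quoted above (−log10P > 5.39) can be converted to a raw p-value cut-off with a few lines of Python. The helper name is hypothetical; only the threshold value is taken from the text.

```python
import math

NEG_LOG10_P_THRESHOLD = 5.39  # value quoted in the abstract

# Equivalent raw p-value cut-off, roughly 4.1e-6.
p_value_cutoff = 10 ** (-NEG_LOG10_P_THRESHOLD)


def is_genome_wide_significant(p_value: float) -> bool:
    """Return True when an association exceeds the -log10(P) threshold."""
    return -math.log10(p_value) > NEG_LOG10_P_THRESHOLD
```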
Osteoporosis is a disease of weak and fracture-prone bones. The characteristic of bone that is most predictive of fractures is low bone mineral density (BMD), a trait primarily controlled by genetics. In recent years, significant advances have been made in the discovery of genes affecting BMD; however, our understanding of its genetic basis is still primitive. In this study, we used genome-wide association in the mouse to identify additional sex combs like-2 (Asxl2) as a novel BMD gene. In confirmation of our genetic analysis, mice deficient in Asxl2 had reduced BMD. To evaluate its function in bone, the expression levels of Asxl2 and tens of thousands of other genes were measured in bone in a large number of inbred mouse strains. Asxl2 demonstrated a pattern of expression indicative of genes that play a critical role in osteoclasts, the cells that are responsible for bone resorption. Further study of Asxl2 may reveal novel therapeutic targets for the treatment and prevention of osteoporosis.
Upstream transcription factor 1 (USF1) has been associated with familial combined hyperlipidemia, the metabolic syndrome, and related conditions, but the mechanisms involved are unknown. In this study, we report validation of Usf1 as a causal gene of cholesterol homeostasis, insulin sensitivity and body composition in mouse models using several complementary approaches and identify associated pathways and gene expression network modules. Over-expression of human USF1 in both transgenic mice and mice with transient liver-specific over-expression influenced metabolic trait phenotypes, including obesity, total cholesterol level, LDL/VLDL cholesterol and glucose/insulin ratio. Additional analyses of trait and hepatic gene expression data from an F2 population derived from C57BL/6J and C3H/HeJ strains in which there is a naturally occurring variation in Usf1 expression supported a causal role for Usf1 for relevant metabolic traits. Gene network and pathway analyses of the liver gene expression signatures in the F2 population and the hepatic over-expression model suggested the involvement of Usf1 in immune responses and metabolism, including an Igfbp2-centered module. In all three mouse model settings, notable sex specificity was observed, consistent with human studies showing differences in association with USF1 gene polymorphisms between sexes.
Identifying variations in DNA that increase susceptibility to disease is one of the primary aims of genetic studies using a forward genetics approach. However, identification of disease-susceptibility genes by means of such studies provides limited functional information on how genes lead to disease. In fact, in most cases there is an absence of functional information altogether, preventing a definitive identification of the susceptibility gene or genes. Here we develop an alternative to the classic forward genetics approach for dissecting complex disease traits where, instead of identifying susceptibility genes directly affected by variations in DNA, we identify gene networks that are perturbed by susceptibility loci and that in turn lead to disease. Application of this method to liver and adipose gene expression data generated from a segregating mouse population results in the identification of a macrophage-enriched network supported as having a causal relationship with disease traits associated with metabolic syndrome. Three genes in this network, lipoprotein lipase (Lpl), lactamase β (Lactb) and protein phosphatase 1-like (Ppm1l), are validated as previously unknown obesity genes, strengthening the association between this network and metabolic disease traits. Our analysis provides direct experimental support that complex traits such as obesity are emergent properties of molecular networks that are modulated by complex genetic loci and environmental factors.
A key goal of biomedical research is to elucidate the complex network of gene interactions underlying complex traits such as common human diseases. Here we detail a multistep procedure for identifying potential key drivers of complex traits that integrates DNA-variation and gene-expression data with other complex trait data in segregating mouse populations. Ordering gene expression traits relative to one another and relative to other complex traits is achieved by systematically testing whether variations in DNA that lead to variations in relative transcript abundances statistically support an independent, causative or reactive function relative to the complex traits under consideration. We show that this approach can predict transcriptional responses to single gene–perturbation experiments using gene-expression data in the context of a segregating mouse population. We also demonstrate the utility of this approach by identifying and experimentally validating the involvement of three new genes in susceptibility to obesity.
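The logic of distinguishing a causative from a reactive relationship can be illustrated with a deliberately simplified sketch. It uses partial correlations on simulated data and is not the statistical procedure used in the study; all variable names and simulation parameters are assumptions. If a locus acts on a trait through a transcript (locus → expression → trait), the locus and the trait should become nearly uncorrelated once expression is conditioned on, whereas the locus and expression remain correlated after conditioning on the trait.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_corr(x, y, z):
    """First-order partial correlation of x and y given z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / sqrt((1 - rxz**2) * (1 - ryz**2))


# Simulate a causal chain in a hypothetical F2-like population.
rng = random.Random(0)
n_animals = 2000
locus = [rng.choice([0, 1, 1, 2]) for _ in range(n_animals)]   # genotype dosage
expression = [g + rng.gauss(0, 1) for g in locus]              # locus -> expression
trait = [e + rng.gauss(0, 1) for e in expression]              # expression -> trait

marginal = pearson(locus, trait)                                # clearly non-zero
locus_trait_given_expr = partial_corr(locus, trait, expression)  # close to zero
locus_expr_given_trait = partial_corr(locus, expression, trait)  # stays non-zero
```

Under a reactive model (locus → trait → expression) the pattern of the two partial correlations would be reversed, and under an independent model neither would vanish.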
A major task in dissecting the genetics of complex traits is to identify causal genes for disease phenotypes. We previously developed a method to infer causal relationships among genes through the integration of DNA variation, gene transcription, and phenotypic information. Here we validated our method through the characterization of transgenic and knockout mouse models of candidate genes that were predicted to be causal for abdominal obesity. Perturbation of eight out of the nine genes, with Gas7, Me1 and Gpx3 being novel, resulted in significant changes in obesity related traits. Liver expression signatures revealed alterations in common metabolic pathways and networks contributing to abdominal obesity and overlapped with a macrophage-enriched metabolic network module that is highly associated with metabolic traits in mice and humans. Integration of gene expression in the design and analysis of traditional F2 intercross studies allows high confidence prediction of causal genes and identification of involved pathways and networks.
Disruption of the elastic lamina, as an early indicator of aneurysm formation, and vascular calcification frequently occur together in atherosclerotic lesions of humans.
Methods and Results
We now report evidence of a shared genetic basis for disruption of the elastic lamina (medial disruption) and medial calcification in an F2 mouse intercross between C57BL/6J and C3H/HeJ on a hyperlipidemic apolipoprotein E (ApoE−/−) null background. We identified 3 quantitative trait loci (QTLs) on chromosomes 6, 13, and 18, which are common to both traits, and 2 additional QTLs for medial calcification on chromosomes 3 and 7. Medial disruption, including severe disruptions leading to aneurysm formation, and medial calcification were highly correlated and occurred concomitantly in the cross. The chromosome 18 locus showed a striking male sex-specificity for both traits. To identify candidate genes, we integrated data from microarray analysis, genetic segregation, and clinical traits. The chromosome 7 locus contains the Abcc6 gene, known to mediate myocardial calcification. Using transgenic complementation, we show that Abcc6 also contributes to aortic medial calcification.
Our data indicate that calcification, though possibly contributory, does not always lead to medial disruption and that in addition to aneurysm formation, medial disruption may be the precursor to calcification.
aneurysm; vascular calcification; Abcc6; Alox5; genetics; gene expression
Numerous quantitative trait loci (QTL) affecting bone traits have been identified in the mouse; however, few of the underlying genes have been discovered. To improve the process of transitioning from QTL to gene we describe an integrative genetics approach, which combines linkage analysis, expression QTL (eQTL) mapping, causality modeling and genetic association in outbred mice. In C57BL/6J X C3H/HeJ (BXH) F2 mice, nine QTL regulating femoral bone mineral density (BMD) were identified. To select candidate genes from within each QTL region, microarray gene expression profiles from individual F2 mice were used to identify 148 genes whose expression was correlated with BMD and regulated by local eQTL. Many of the genes that were the most highly correlated with BMD have been previously shown to modulate bone mass or skeletal development. Candidates were further prioritized by determining if their expression was predicted to underlie variation in BMD. Using network edge orienting (NEO), a causality modeling algorithm, 18 of the 148 candidates were predicted to be causally related to differences in BMD. To fine-map QTL, markers in outbred MF1 mice were tested for association with BMD. Three chromosome 11 SNPs were identified that were associated with BMD within the Bmd11 QTL. Finally, our approach provides strong support for Wnt9a, Rasd1 or both underlying Bmd11. Integration of multiple genetic and genomic data sets can substantially improve the efficiency of QTL fine-mapping and candidate gene identification.
Quantitative trait locus; bone mineral density; integrative genetics; genetic association; causality
The millions of common DNA variations that occur in the human population, or among inbred strains of mice and rats, perturb the expression (transcript levels) of a large fraction of the genes expressed in a particular tissue. The hundreds or thousands of common cis-acting variations that occur in the population may in turn affect the expression of thousands of other genes by affecting transcription factors, signaling molecules, RNA processing, and other processes that act in trans. The levels of transcripts are conveniently quantitated using expression arrays, and the cis- and trans-acting loci can be mapped using quantitative trait locus (QTL) analysis, in the same manner as loci for physiologic or clinical traits. Thousands of such expression QTL (eQTL) have been mapped in various crosses in mice, as well as other experimental organisms, and less detailed maps have been produced in studies of cells from human pedigrees. Such an integrative genetics approach (sometimes referred to as “genetical genomics”) is proving useful for identifying genes and pathways that contribute to complex clinical traits. The coincidence of clinical trait QTL and eQTL can help in the prioritization of positional candidate genes. More importantly, mathematical modeling of correlations between levels of transcripts and clinical traits in genetic crosses can allow prediction of causal interactions and the identification of “key driver” genes. An important objective of such studies will be to model biological networks in physiologic processes. When combined with high-density single nucleotide polymorphism (SNP) mapping, it should be feasible to identify genes that contribute to transcript levels using association analysis in outbred populations. In this review we discuss the basic concepts and applications of this integrative genomic approach to cardiovascular and metabolic diseases.
A wealth of genetic associations for cardiovascular and metabolic phenotypes in humans has been accumulating over the last decade, in particular a large number of loci derived from recent genome wide association studies (GWAS). True complex disease-associated loci often exert modest effects, so their delineation currently requires integration of diverse phenotypic data from large studies to ensure robust meta-analyses. We have designed a gene-centric 50 K single nucleotide polymorphism (SNP) array to assess potentially relevant loci across a range of cardiovascular, metabolic and inflammatory syndromes. The array utilizes a “cosmopolitan” tagging approach to capture the genetic diversity across ∼2,000 loci in populations represented in the HapMap and SeattleSNPs projects. The array content is informed by GWAS of vascular and inflammatory disease, expression quantitative trait loci implicated in atherosclerosis, pathway based approaches and comprehensive literature searching. The custom flexibility of the array platform facilitated interrogation of loci at differing stringencies, according to a gene prioritization strategy that allows saturation of high priority loci with a greater density of markers than the existing GWAS tools, particularly in African HapMap samples. We also demonstrate that the IBC array can be used to complement GWAS, increasing coverage in high priority CVD-related loci across all major HapMap populations. DNA from over 200,000 extensively phenotyped individuals will be genotyped with this array with a significant portion of the generated data being released into the academic domain facilitating in silico replication attempts, analyses of rare variants and cross-cohort meta-analyses in diverse populations. These datasets will also facilitate more robust secondary analyses, such as explorations with alternative genetic models, epistasis and gene-environment interactions.
Quantitative trait locus (QTL) analysis is a powerful tool for mapping genes for complex traits in mice, but its utility is limited by poor resolution. A promising mapping approach is association analysis in outbred stocks or different inbred strains. As a proof of concept for the association approach, we applied whole-genome association analysis to hepatic gene expression traits in an outbred mouse population, the MF1 stock, and replicated expression QTL (eQTL) identified in previous studies of F2 intercross mice. We found that the mapping resolution of these eQTL was significantly greater in the outbred population. Through an example, we also showed how this precise mapping can be used to resolve previously identified loci (in intercross studies), which affect many different transcript levels (known as eQTL “hotspots”), into distinct regions. Our results also highlight the importance of correcting for population structure in whole-genome association studies in the outbred stock.
In rodents, as in humans, traits such as obesity or diabetes are under the influence of many genes spread throughout the genome. Using linkage analysis, the locations of the major contributing genes can be mapped only to very large regions of chromosomes, usually encompassing hundreds of genes. This has made it difficult to identify the underlying genes and mutations. Another approach, analogous to genome-wide association in human populations, is to use association analyses among outbred stocks of mice. In this proof-of-principle article, we make use of common variations that locally perturb gene expression to demonstrate the greatly improved mapping resolution of association in mice. Our results indicate that association analyses in mice are a powerful approach to the dissection of complex traits and their underlying molecular networks.
Genetic variants that are associated with common human diseases do not lead directly to disease, but instead act on intermediate, molecular phenotypes that in turn induce changes in higher-order disease traits. Therefore, identifying the molecular phenotypes that vary in response to changes in DNA and that also associate with changes in disease traits has the potential to provide the functional information required to not only identify and validate the susceptibility genes that are directly affected by changes in DNA, but also to understand the molecular networks in which such genes operate and how changes in these networks lead to changes in disease traits. Toward that end, we profiled more than 39,000 transcripts and we genotyped 782,476 unique single nucleotide polymorphisms (SNPs) in more than 400 human liver samples to characterize the genetic architecture of gene expression in the human liver, a metabolically active tissue that is important in a number of common human diseases, including obesity, diabetes, and atherosclerosis. This genome-wide association study of gene expression resulted in the detection of more than 6,000 associations between SNP genotypes and liver gene expression traits, where many of the corresponding genes identified have already been implicated in a number of human diseases. The utility of these data for elucidating the causes of common human diseases is demonstrated by integrating them with genotypic and expression data from other human and mouse populations. This provides much-needed functional support for the candidate susceptibility genes being identified at a growing number of genetic loci that have been identified as key drivers of disease from genome-wide association studies of disease. 
By using an integrative genomics approach, we highlight how the gene RPS26 and not ERBB3 is supported by our data as the most likely susceptibility gene for a novel type 1 diabetes locus recently identified in a large-scale, genome-wide association study. We also identify SORT1 and CELSR2 as candidate susceptibility genes for a locus recently associated with coronary artery disease and plasma low-density lipoprotein cholesterol levels in the process.
Genome-wide association studies seek to identify regions of the genome in which changes in DNA in a given population are correlated with disease, drug response, or other phenotypes of interest. However, changes in DNA that associate with traits like common human diseases do not lead directly to disease, but instead act on intermediate, molecular phenotypes that in turn induce changes in the higher-order disease traits. Therefore, identifying molecular phenotypes that vary in response to changes in DNA that also associate with changes in disease traits can provide the functional information necessary to not only identify and validate the susceptibility genes directly affected by changes in DNA, but to understand as well the molecular networks in which such genes operate and how changes in these networks lead to changes in disease traits. To enable this type of approach we profiled the expression levels of 39,280 transcripts and genotyped 782,476 SNPs in 427 human liver samples, identifying thousands of DNA variants that strongly associated with liver gene expression. These relationships were then leveraged by integrating them with genotypic and expression data from other human and mouse populations, leading to the direct identification of candidate susceptibility genes corresponding to genetic loci identified as key drivers of disease. Our analysis is able to provide much needed functional support for these candidate susceptibility genes.
Identifying changes in DNA that associate with changes in gene expression in human tissues elucidates the genetic architecture of gene expression in human populations and enables the direct identification of functionally supported candidate susceptibility genes in genomic regions associated with disease.
Systems-oriented genetic approaches that incorporate gene expression and genotype data are valuable in the quest for genetic regulatory loci underlying complex traits. Gene coexpression network analysis lends itself to identification of entire groups of differentially regulated genes—a highly relevant endeavor in finding the underpinnings of complex traits that are, by definition, polygenic in nature. Here we describe one such approach based on liver gene expression and genotype data from an F2 mouse intercross utilizing weighted gene coexpression network analysis (WGCNA) of gene expression data to identify physiologically relevant modules. We describe two strategies: single-network analysis and differential network analysis. Single-network analysis reveals the presence of a physiologically interesting module that can be found in two distinct mouse crosses. Module quantitative trait loci (mQTLs) that perturb this module were discovered. In addition, we report a list of genetic drivers for this module. Differential network analysis reveals differences in connectivity and module structure between two networks based on the liver expression data of lean and obese mice. Functional annotation of these genes suggests a biological pathway involving epidermal growth factor (EGF). Our results demonstrate the utility of WGCNA in identifying genetic drivers and in finding genetic pathways represented by gene modules. These examples provide evidence that integration of network properties may well help chart the path across the gene–trait chasm.
Electronic supplementary material
The online version of this article (doi: 10.1007/s00335-007-9043-3) contains supplementary material, which is available to authorized users.
Systems biology approaches that are based on the genetics of gene expression have been fruitful in identifying genetic regulatory loci related to complex traits. We use microarray and genetic marker data from an F2 mouse intercross to examine the large-scale organization of the gene co-expression network in liver, and annotate several gene modules in terms of 22 physiological traits. We identify chromosomal loci (referred to as module quantitative trait loci, mQTL) that perturb the modules and describe a novel approach that integrates network properties with genetic marker information to model gene/trait relationships. Specifically, using the mQTL and the intramodular connectivity of a body weight–related module, we describe which factors determine the relationship between gene expression profiles and weight. Our approach results in the identification of genetic targets that influence gene modules (pathways) that are related to the clinical phenotypes of interest.
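Intramodular connectivity can be made concrete with the unsigned weighted adjacency used in WGCNA, a_ij = |cor(x_i, x_j)|^beta, where the connectivity of a gene is the sum of its adjacencies to the other genes in the module. The sketch below assumes a soft-threshold power of beta = 6 and invented expression profiles; the helper names are hypothetical, and it is a minimal illustration rather than the authors' implementation.

```python
from statistics import mean

def pearson(x, y):
    """Pearson correlation between two equal-length profiles."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def connectivity(profiles, beta=6):
    """Sum of unsigned weighted adjacencies |cor|^beta to every other gene."""
    genes = list(profiles)
    return {
        gi: sum(abs(pearson(profiles[gi], profiles[gj])) ** beta
                for gj in genes if gj != gi)
        for gi in genes
    }

# Toy module: geneA-geneC are tightly (anti)correlated, geneD is not
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [1.1, 2.1, 2.9, 4.2, 4.8],
    "geneC": [5.0, 3.9, 3.1, 2.0, 1.1],
    "geneD": [2.0, 5.0, 1.0, 4.0, 3.0],
}

k = connectivity(profiles)
for gene, value in sorted(k.items(), key=lambda item: -item[1]):
    print(gene, round(value, 4))
```

Raising the correlation to a power suppresses weak correlations, so a loosely connected gene such as the toy `geneD` ends up with connectivity near zero while the strongly correlated genes act as hubs.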
Obesity is a major public health concern in many developed countries. While some people appear to stay lean no matter what or how much they eat, others appear to be genetically predisposed to obesity. The genetic similarity between mouse and human makes the mouse a promising mammalian model system to study obesity. Advantages of mouse models include the ability to control diet/environment and easy access to relevant tissues for gene expression studies. Mouse cross studies have implicated dozens of chromosomal regions that contain weight-predisposing genes, and gene expression studies have yielded hundreds of body weight–related genes. In this study, the authors use a gene network–based approach for integrating clinical traits, genetic marker data, and gene expression data. Instead of focusing on individual genes, the authors provide a systems-level view of a module of genes related to body weight. The resulting model allows them to characterize weight-related genes utilizing network concepts (intramodular connectivity) and genetic concepts (module quantitative trait locus). This integrative genomics approach provides new insights into the relationship between gene expression and body weight.
The integration of expression profiling with linkage analysis has increasingly been used to identify genes underlying complex phenotypes. The effects of gender on the regulation of many physiological traits are well documented; however, “genetical genomic” analyses have not yet addressed the degree to which their conclusions are affected by sex. We constructed and densely genotyped a large F2 intercross derived from the inbred mouse strains C57BL/6J and C3H/HeJ on an apolipoprotein E null (ApoE−/−) background. This BXH.ApoE−/− population recapitulates several “metabolic syndrome” phenotypes. The cross consists of 334 animals of both sexes, allowing us to specifically test for the dependence of linkage on sex. We detected several thousand liver gene expression quantitative trait loci, a significant proportion of which are sex-biased. We used these analyses to dissect the genetics of gonadal fat mass, a complex trait with sex-specific regulation. We present evidence for a remarkably high degree of sex dependence in both the cis and trans regulation of gene expression. We demonstrate how these analyses can be applied to the study of the genetics underlying gonadal fat mass, a complex trait showing significantly female-biased heritability. These data have implications for the potential effects of sex on the genetic regulation of other complex traits.
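The idea of testing whether a genotype effect on expression depends on sex can be sketched by estimating the genotype slope separately in females and males and comparing the two. This is a deliberately simplified stand-in, not the linkage model used in the study: it assumes genotypes coded as allele counts, uses invented toy records, and the function names are hypothetical. A fuller analysis would fit a genotype-by-sex interaction term and assess its significance.

```python
from statistics import mean

def slope(x, y):
    """Least-squares slope of y on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def sex_specific_slopes(records):
    """Group (sex, genotype, expression) records by sex and fit one slope per sex."""
    by_sex = {}
    for sex, genotype, expr in records:
        xs, ys = by_sex.setdefault(sex, ([], []))
        xs.append(genotype)
        ys.append(expr)
    return {sex: slope(xs, ys) for sex, (xs, ys) in by_sex.items()}

# Toy records, invented for illustration: a genotype effect in females only
records = [
    ("F", 0, 1.0), ("F", 0, 1.2), ("F", 1, 2.1),
    ("F", 1, 1.9), ("F", 2, 3.0), ("F", 2, 3.2),
    ("M", 0, 2.0), ("M", 0, 2.2), ("M", 1, 2.1),
    ("M", 1, 1.9), ("M", 2, 2.1), ("M", 2, 2.1),
]

slopes = sex_specific_slopes(records)
difference = slopes["F"] - slopes["M"]
print({sex: round(value, 3) for sex, value in slopes.items()}, round(difference, 3))
```

A large difference between the two slopes, as in the toy data, is the pattern a sex-biased expression QTL would produce.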
Although their genomes are nearly identical, the males and females of a species exhibit striking differences in many traits, including complex traits such as obesity. This study combines genetic and genomic tools to identify in parallel quantitative trait loci (QTLs) for a measure of gonadal fat mass and for expression of transcripts in the liver. The results are used to explore the relationship between genetic variation, sexual differentiation, and obesity in the mouse model. Using over 300 intercross progeny of two inbred mouse strains, five loci in the genome were found to be highly correlated with abdominal fat mass. Four of the five loci exhibited opposite effects on obesity in the two sexes, a phenomenon known as sexual antagonism. To identify candidate genes that may be involved in obesity through their expression in the liver, global gene expression analysis was employed using microarrays. Many of these expression QTLs also show sex-specific effects on transcription. A hotspot for trans-acting QTLs regulating the expression of transcripts whose abundance is correlated with gonadal fat mass was identified on Chromosome 19. This region of the genome colocalizes with a clinical QTL for gonadal fat mass, suggesting that it harbors a good candidate gene for obesity.
To evaluate metabolic pathways associated with obesity, global gene-expression data were integrated with phenotypic and genetic segregation analyses, identifying 13 metabolic pathways whose genes are coordinately regulated in association with obesity. Four genomic regions were found to control the coordinated expression of these pathways, and novel genes potentially associated with the pathways were also identified.
A segregating population of (C57BL/6J × DBA/2J)F2 intercross mice was studied for obesity-related traits and for global gene expression in liver. Quantitative trait locus analyses were applied to the subcutaneous fat-mass trait and all gene-expression data. These data were then used to identify gene sets that are differentially perturbed in lean and obese mice.
We integrated global gene-expression data with phenotypic and genetic segregation analyses to evaluate metabolic pathways associated with obesity. Using two approaches we identified 13 metabolic pathways whose genes are coordinately regulated in association with obesity. Four genomic regions on chromosomes 3, 6, 16, and 19 were found to control the coordinated expression of these pathways. Using criteria that included trait correlation, differential gene expression, and linkage to genomic regions, we identified novel genes potentially associated with the identified pathways.
This study demonstrates that genetic and gene-expression data can be integrated to identify pathways associated with clinical traits and their underlying genetic determinants.