The present study measured variation in body weight using a combined analysis in an F2 intercross and an F34 advanced intercross line (AIL). Both crosses were derived from inbred LG/J and SM/J mice, which were selected for large and small body size prior to inbreeding. Body weight was measured at 62 (±5) days of age. Using an integrated GWAS and forward model selection approach, we identified 11 significant QTLs that affected body weight on ten different chromosomes. With these results we developed a full model that explained over 18% of the phenotypic variance. The median 1.5-LOD support interval was 5.55 Mb, which is a significant improvement over most prior body weight QTLs. We identified nonsynonymous coding SNPs between LG/J and SM/J mice in order to further narrow the list of candidate genes. Three of the genes with nonsynonymous coding SNPs (Rad23b, Stk33, and Anks1b) have been associated with adiposity, waist circumference, and body mass index in human GWAS, thus providing evidence that these genes may underlie our QTLs. Our results demonstrate that a relatively small number of loci contribute significantly to the phenotypic variance in body weight, which is in marked contrast to the situation in humans. This difference is likely to be the result of strong selective pressure and the simplified genetic architecture, both of which are important advantages of our system.
Phenotypic variation in complex quantitative traits is attributed to combinations of genes, environmental factors, and their interactions with each other (Flint and Mackay 2009; Cheverud et al. 2010). Variation in body weight is linked to health disorders in both humans and agricultural animals (Bouchard 1991; Campfield and Smith 1999; Willet et al. 1999). Levi et al. (2010) describe numerous environmental influences contributing to the recent spike in body weight in humans, but strong evidence also indicates that body weight has a significant genetic component. Family and twin studies estimate that the heritable variation contributing to body weight ranges from 30 to 70% (Rankinen et al. 2006). More recently, genome-wide association studies (GWAS) have identified genetic variants that contribute to body weight, obesity, and body mass index (BMI) in human populations, yet each individual variant explains less than 0.05% of the heritable variation (Loos and Bouchard 2008; Stratigopoulos et al. 2008; Willer et al. 2009; Scherag et al. 2010; Speliotes et al. 2010). The discovery that these single-nucleotide polymorphisms (SNPs) account for only a tiny fraction of the genetic variation creates uncertainty regarding the ability of GWAS to identify the bulk of heritable variation for body weight. This “missing heritability” is thought to be due in part to rare alleles, epistasis, and gene-by-environment interactions (Manolio et al. 2009).
Mouse models are complementary to human genetic studies and offer unique advantages, including the ability to control environmental variance, perform dangerous or invasive procedures, conduct well-defined crosses, functionally evaluate candidate genes in vivo or in vitro, and undertake rigorous mechanistic studies. Quantitative trait locus (QTL) studies have been successful in identifying chromosomal regions associated with body weight in mice, yet gene identification has remained elusive (Brockmann et al. 1998; Morris et al. 1999; Ishikawa and Namikawa 2004; Bennett et al. 2005; Neuschl et al. 2007). This is partially because QTL studies in mice have traditionally used recombinant inbred lines (RI), backcrosses, F2 intercrosses (F2), and similar strategies to identify QTLs that underlie phenotypic variability. Due to a lack of recombination, these techniques are able to identify only large genomic regions and are thus unsuitable for identifying the genes that underlie QTLs (Peters et al. 2007; Flint 2011).
This serious limitation can be addressed by using populations with greater numbers of accumulated recombinations, such as advanced intercross lines (AILs) (Darvasi and Soller 1995; Parker and Palmer 2011). AILs are produced by randomly mating many individuals beyond the F2 generation. These additional breeding generations produce additional recombinations, which allows for the more precise identification of QTL regions. Because AILs are derived from two inbred founders, they maintain the simplicity of more traditional crosses. The F2 and F34 AIL used in the present study are derived from the Large (LG/J) and Small (SM/J) inbred mouse lines, which originated from mice selected for large or small body size at 60 days of age, respectively (Goodale 1938; MacArthur 1944). After both lines were fully inbred, they displayed a 24-g difference in body weight at 60 days of age (Chai 1956). These strains have been studied extensively as genetic models of body weight and obesity-related traits (Cheverud et al. 2001; Ehrich et al. 2003, 2005a, b; Fawcett et al. 2008, 2010). The extreme phenotypic difference between these strains makes them ideally suited for the identification of genetic influences on body weight since they are expected to possess many segregating alleles that confer differences.
In the present study, we use the power of LG/J × SM/J F2 mice and the precision of LG/J × SM/J F34 AIL mice in conjunction with a forward model selection procedure to identify and fine-map loci associated with body weight. Knowledge about the genes identified in these regions may have many applications, including improvement of farm animal meat quality and breeding procedures or the development of treatments for growth- and obesity-related disorders in humans. In addition, from a technical perspective, the use of a highly recombinant mouse population in conjunction with our forward model selection QTL mapping procedure has broader applications for the analysis of complex traits.
All procedures were approved by the University of Chicago Institutional Animal Care and Use Committee (IACUC) in accordance with NIH guidelines. Details regarding the mice and genotypes used in the present study have been described previously (Cheng et al. 2010; Lionikas et al. 2010; Samocha et al. 2010). We obtained inbred male SM/J and female LG/J mice from the Jackson Laboratory (Bar Harbor, ME). These mice were used to produce LG/J × SM/J F1 mice, which were then bred to create the F2 generation (n = 488, 239 females and 249 males).
In addition, we obtained 140 F33 breeders from the laboratory of Dr. James Cheverud (Washington University, St. Louis, MO). The F33 mice were outbred, with more than 50 families having been maintained per generation since their inception. Breeding was random except that siblings were not mated with one another. We chose to study the SM/J and LG/J strains mainly because they were selected to have high and low body weight and so were expected to segregate many relevant loci, and because of the availability of an F33 AIL, which represents almost 10 years of breeding. Records from Dr. Cheverud’s lab allowed us to construct a pedigree for each F33 mouse that traced back to the original inbred founders. From these 140 F33 mice, 119 were successfully bred to create an F34 generation (n = 701, 343 females and 358 males) in which phenotypes were measured. We produced only one F34 litter per breeding pair. Breeding pairs were rotated after each litter in order to avoid producing large numbers of full sibs; however, the phenotyped (F34) generation inevitably contained many sibs, half-sibs, and cousins as well as more distant and complex relationships. All F2 and F34 mice were housed in standard laboratory conditions with a 12:12-h light cycle and ad libitum access to standard lab chow and water.
Mice were weighed when they were approximately 2 months old (mean age = 62 days, SD = 5 days), at the same time of day during the light phase, using a Fisher Scout II scale; weights were rounded to the nearest 0.1 g. These measurements were taken as part of a behavioral study investigating methamphetamine sensitivity (Cheng et al. 2010), but were obtained before any drug was administered. Additional data collected from these mice after the measurement of body weight have also been published (Lionikas et al. 2010; Samocha et al. 2010). Our study uses a novel forward selection technique and investigates body weight QTLs, which were not studied in any of the previous papers.
Genotyping was performed as previously described (Cheng et al. 2010). Briefly, 162 evenly spaced SNPs were used as markers in the F2 mice (Petkov et al. 2004). For the F34 mice, we designed a custom SNP array that assayed SNPs using the Illumina Infinium Platform (http://www.illumina.com). SNPs were chosen to provide uniform coverage of the mouse genome, and the array contained ~4,000 markers that were polymorphic between the LG/J and SM/J strains. A full list of these SNPs is available at the JAX Phenome website under the name “Chicago1” (http://phenome.jax.org/db/q?rtn=projects/details&sym=Chicago1). We performed genome-wide association analysis in the combined population of the F2 and F34 intercrosses using the R package QTLRel, which is available from CRAN (http://cran.r-project.org/web/packages/QTLRel/index.html). This software allowed us to account for the complex relationships (e.g., siblings, half-siblings, cousins) among the F34 mice by using a mixed model as described previously (Cheng et al. 2010). Because of the well-known effects of sex on body weight, we explored genetic models where sex was included as an additive covariate.
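To illustrate how a pedigree can be turned into the pairwise relatedness that a mixed model needs, the following is a minimal sketch of the standard recursive kinship coefficient. It is not the QTLRel implementation; the toy pedigree, the name `kinship`, and the treatment of founders as unrelated and non-inbred are all assumptions made for illustration (the inbred LG/J and SM/J founders of the actual cross would need different founder values).

```python
from functools import lru_cache

# Hypothetical toy pedigree: id -> (parent1, parent2); founders have (None, None).
# Assumption: ids are numbered so that every parent has a smaller id than its offspring.
PEDIGREE = {
    1: (None, None),
    2: (None, None),
    3: (1, 2),   # full sibs 3 and 4
    4: (1, 2),
    5: (3, 4),   # offspring of a sib mating
}


@lru_cache(maxsize=None)
def kinship(i, j):
    """Recursive kinship coefficient between individuals i and j."""
    if i is None or j is None:
        # An unknown parent contributes no relatedness (founder assumption).
        return 0.0
    if i == j:
        p1, p2 = PEDIGREE[i]
        # Self-kinship is (1 + inbreeding) / 2, with inbreeding = kinship of the parents.
        return 0.5 * (1.0 + kinship(p1, p2))
    if i < j:
        # Always recurse through the younger individual (larger id).
        i, j = j, i
    p1, p2 = PEDIGREE[i]
    return 0.5 * (kinship(p1, j) + kinship(p2, j))


full_sib_kinship = kinship(3, 4)
inbred_self_kinship = kinship(5, 5)
```

In this toy pedigree the full sibs share a kinship of 0.25, and the offspring of the sib mating has a self-kinship above 0.5 because it is inbred. A matrix of such values is one way to parameterize the covariance among polygenic effects.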
The initial genome scan identified QTLs on nine chromosomes that contributed to variation in body weight, with multiple peaks on the same chromosome. After the initial scan, we fit our model with all of the identified QTLs, sequentially performed a series of tests, and removed QTLs that were not significant given other QTLs in the model. For model selection, we used Akaike’s (1974) information criterion (AIC):
$$\mathrm{AIC}(M) = -2\log(M) + k\,|M|$$
where log(M) is the log-likelihood of model M under consideration, |M| is the number of parameters in the model M, and k is the penalty per parameter (k = 2 in the classical AIC). Markers were tested using a 2-degree-of-freedom test that models both additive and dominance QTL effects. Rather than the classical value, we chose k to be half the 0.05 genome-wide threshold. This extended AIC is called BICδ by Broman (2002) and posits that the chance of selecting any QTL by the criterion will be 0.05 if there is actually no QTL.
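The penalized criterion described above can be sketched in a few lines. This is only an illustration: the function name `penalized_criterion` and the numbers in the example are hypothetical, and the penalty k is left as a parameter rather than set to the study's threshold-based value.

```python
def penalized_criterion(log_likelihood, n_params, k=2.0):
    """Return -2 * log-likelihood + k * (number of parameters).

    k = 2 gives the classical AIC; a larger k penalizes each added
    parameter more heavily, as in the extended criterion.
    """
    return -2.0 * log_likelihood + k * n_params


# Hypothetical log-likelihoods for a smaller and a larger model.
small_model = penalized_criterion(log_likelihood=-100.0, n_params=3)
large_model = penalized_criterion(log_likelihood=-98.5, n_params=5)

# With a stricter (hypothetical) penalty the larger model is penalized even more.
large_model_strict = penalized_criterion(log_likelihood=-98.5, n_params=5, k=6.0)
```

With these made-up values the smaller model has the lower criterion, so the two extra parameters (one additive and one dominance effect) would not be retained.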
We wanted to select the multi-QTL model that produced the minimum AIC value. Because the number of loci tested was large, it was impractical to search through the whole model space to select the optimal model. Therefore, we adopted two well-known model search strategies, forward selection and backward elimination. First, we performed a forward selection, starting with the model that included no QTL. Next, a genome scan was performed and the locus that resulted in the smallest AIC was added to the model. A second genome scan was conducted while including the previously selected QTL in the model. This procedure was repeated until no locus reduced the AIC enough to be added to the model. The forward selection method identified 11 QTLs, two of which were on chromosome 6. These results were broadly similar to the results obtained when we performed a single QTL scan; however, the forward selection procedure identified a QTL on chromosome 14 that was not significant when using a single QTL scan, which demonstrates the advantage of our approach.
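The forward selection loop described above can be sketched generically as follows. This is a minimal illustration under stated assumptions, not the procedure as implemented for the study: `forward_select`, `toy_criterion`, the marker names, and the per-locus gains are all hypothetical, and the toy criterion stands in for a full genome scan under the mixed model.

```python
def forward_select(candidates, criterion):
    """Greedy forward selection: add the locus that most lowers the criterion.

    `criterion` maps a frozenset of selected loci to a score (smaller is better).
    Stops when no remaining locus lowers the score.
    """
    selected = []
    best = criterion(frozenset())
    remaining = set(candidates)
    while remaining:
        # One "scan": score every remaining locus given the loci already selected.
        scored = [(criterion(frozenset(selected + [c])), c) for c in sorted(remaining)]
        score, locus = min(scored)
        if score >= best:
            break
        selected.append(locus)
        remaining.remove(locus)
        best = score
    return selected, best


# Hypothetical improvement in -2*log-likelihood contributed by each locus.
GAINS = {"m1": 12.0, "m2": 3.0, "m3": 9.0}
K = 4.0  # hypothetical penalty per parameter


def toy_criterion(loci):
    # Two parameters per locus (additive and dominance effects).
    return -sum(GAINS[c] for c in loci) + K * 2 * len(loci)


selected_loci, final_score = forward_select(GAINS, toy_criterion)
```

In this toy example the loci m1 and m3 are added in that order, while m2 is rejected because its gain does not offset the penalty for two extra parameters.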
After generating this model, we further refined the locations of these 11 QTLs by moving them to nearby locations that reduced the AIC. This was done by use of a coordinate descent algorithm (Nocedal and Wright 1999). For this we cyclically moved each of the 11 identified QTLs around its linkage region while keeping the locations of other QTLs constant. The location of each QTL was updated with the location that provided the smallest AIC; this procedure is similar to one previously described by Zeng et al. (1999).
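The cyclic refinement described above can be sketched as a coordinate descent over a discrete grid of candidate positions. The function name `coordinate_descent`, the candidate regions, and the toy criterion are hypothetical; in practice the criterion would be the AIC of the refitted multi-QTL model.

```python
def coordinate_descent(positions, regions, criterion, max_sweeps=50):
    """Move one QTL at a time within its region, holding the others fixed.

    positions: starting position of each QTL.
    regions:   candidate positions allowed for each QTL.
    criterion: maps a list of positions to a score (smaller is better).
    """
    positions = list(positions)
    best = criterion(positions)
    for _ in range(max_sweeps):
        changed = False
        for i, region in enumerate(regions):
            for candidate in region:
                trial = positions[:i] + [candidate] + positions[i + 1:]
                score = criterion(trial)
                if score < best:
                    positions, best, changed = trial, score, True
        if not changed:
            break  # a full sweep with no improvement: converged
    return positions, best


def toy_criterion(p):
    # Hypothetical surface with its minimum at positions (3, 7); the cross term
    # makes the best position of one QTL depend on the other.
    x, y = p[0] - 3, p[1] - 7
    return x * x + y * y + 0.5 * x * y


refined, refined_score = coordinate_descent(
    positions=[0, 5],
    regions=[range(0, 6), range(5, 10)],
    criterion=toy_criterion,
)
```

Because each move is accepted only if it lowers the criterion, the score never increases, and the loop stops after the first sweep in which no QTL moves.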
Finally, we performed backward elimination to see if any QTL should be excluded from the model. The rationale is that the contribution of a QTL depends on other QTLs in the model and forward selection can result in extraneous QTLs (Broman 2002). This procedure did not remove any of the QTLs. We defined the confidence interval for each QTL as the 1.5-LOD dropoff on either side of the peak. This interval was expressed in physical map position (Mb) by using the genotyped SNP that was at or beyond the 1.5-LOD support interval.
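As an illustration of the 1.5-LOD support interval, the sketch below walks outward from the peak marker until the LOD score drops by at least 1.5 and reports the genotyped marker at or beyond that point. The function name `lod_support_interval` and the example positions and LOD scores are hypothetical, and markers are assumed to be sorted by position on a single chromosome.

```python
def lod_support_interval(positions_mb, lods, drop=1.5):
    """Return (left, right) marker positions bounding the LOD support interval."""
    peak = max(range(len(lods)), key=lods.__getitem__)
    cutoff = lods[peak] - drop

    # Walk left until a marker falls to or below the cutoff, or the chromosome ends.
    left = peak
    while left > 0 and lods[left] > cutoff:
        left -= 1

    # Walk right in the same way.
    right = peak
    while right < len(lods) - 1 and lods[right] > cutoff:
        right += 1

    return positions_mb[left], positions_mb[right]


# Hypothetical scan around one peak (positions in Mb).
example_positions = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0]
example_lods = [1.0, 3.0, 5.5, 6.0, 4.2, 2.0]
interval = lod_support_interval(example_positions, example_lods)
```

Here the peak LOD is 6.0 at 16 Mb, so the cutoff is 4.5; the first markers at or below it are at 12 and 18 Mb, giving a conservative interval bounded by genotyped markers.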
We used the following model to identify QTLs; for the ith individual
$$y_i = x_i'\beta + \sum_{k}\left(a_{i,k}\,\beta_{a,k} + d_{i,k}\,\beta_{d,k}\right) + g_i + \varepsilon_i$$
where xi represents covariates; ai,k = 1, 0, or −1 if the genotype at the kth QTL is AA, Aa, or aa; di,k = 1 or 0 if the genotype at the kth QTL is heterozygous or homozygous; β’s are the corresponding effects; gi is the polygenic effect; and εi is the residual effect. Furthermore, assume (g1, g2, …, gn) ~ N(0, Σ) and (ε1, ε2, …, εn) ~ N(0, Iσ²). The model that we arrived at predicted body weight (yi) as follows:
where xsex = 1 or 0 if the ith individual is male or female and xage is its age in days.
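The additive and dominance coding used in the model can be made concrete with a small sketch. The names `code_genotype` and `qtl_contribution` and the effect sizes below are hypothetical and chosen only to show the arithmetic; they are not estimates from the study, and the polygenic and residual terms are omitted.

```python
def code_genotype(genotype):
    """Return (additive, dominance) codes for a genotype AA, Aa, or aa."""
    additive = {"AA": 1, "Aa": 0, "aa": -1}[genotype]
    dominance = 1 if genotype == "Aa" else 0
    return additive, dominance


def qtl_contribution(genotypes, additive_effects, dominance_effects):
    """Sum a*beta_a + d*beta_d over QTLs for one individual."""
    total = 0.0
    for genotype, beta_a, beta_d in zip(genotypes, additive_effects, dominance_effects):
        a, d = code_genotype(genotype)
        total += a * beta_a + d * beta_d
    return total


# One individual, two QTLs, with made-up effect sizes (in grams).
example_contribution = qtl_contribution(
    genotypes=["AA", "Aa"],
    additive_effects=[1.0, 0.5],
    dominance_effects=[0.25, -0.5],
)
```

The homozygote at the first QTL contributes only its additive effect, and the heterozygote at the second contributes only its dominance effect, which is why each QTL requires a 2-degree-of-freedom test.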
Sequences for LG/J and SM/J inbred mice were provided by Dr. Jim Cheverud from The Genome Sequencing Center at Washington University (http://genome.wustl.edu) and are described in detail elsewhere (Norgard et al. 2011). Briefly, the sequencing data identified over 4 million autosomal polymorphisms between LG/J and SM/J inbred mice, and all SNPs used in the study are available at dbSNP (http://www.ncbi.nlm.nih.gov/projects/SNP/). We used these data to identify nonsynonymous coding SNPs within our QTL regions.
On average, F2 mice weighed 25.01 g (SD = 5.48 g) and F34 mice weighed 26.21 g (SD = 5.67 g); this difference was not significant. As expected, there was a highly significant effect of sex on body weight (p < 0.0001). Figure 1 displays the distribution of body weight, split by generation and sex.
We performed multiple-QTL model selection using forward selection and backward elimination on the integrated LG/J × SM/J F2 and LG/J × SM/J F34 populations. The resulting full model consisted of 11 QTLs on chromosomes 1, 2, 4, 6 (2 loci), 7, 8, 9, 10, 11, and 14 (Supplementary Fig. 1). Table 1 displays the confidence intervals, peak LOD scores, number of genes in the interval, and percent variance explained for each QTL. The 1.5-LOD support interval for body weight QTLs ranged between 1.82 and 18.79 Mb, with a median interval length of 5.55 Mb. The number of annotated genes within these intervals ranged from 5 to 109, with a median of 39 genes.
By comparing the sequences of LG/J and SM/J mice, we identified nonsynonymous coding SNPs in the 1.5-LOD support intervals of the QTLs and narrowed the list of candidate genes (Supplementary Table 3). The number of genes with nonsynonymous coding SNPs within each interval ranged from 0 to 52, with a median of 4 genes. Five of the QTL intervals contained fewer than 5 genes with nonsynonymous coding SNPs. Three of the genes that we identified (Rad23b, Stk33, and Anks1b) have been associated with adiposity and BMI in human GWAS (Willer et al. 2009; Croteau-Chonka et al. 2010).
We performed genome-wide mapping of QTLs affecting body weight in an F2 and F34 AIL population of mice derived from strains selected for high and low body weight. We used a forward selection and backward elimination method to identify 11 QTLs affecting body weight. This procedure identified one additional QTL that was not significant when standard single QTL mapping was performed. As expected, all LG/J alleles were found to be associated with higher body weight. The integration of the F2 and F34 populations provided good power (F2) and good resolution (F34); the median resolution was 5.55 Mb, with four loci <3 Mb.
Traditionally, F2 intercrosses are used to identify QTLs underlying phenotypic variation, and fine-mapping is carried out as a second step using congenic strains or other fine-mapping tools. This time- and labor-intensive effort at subsequent dissection and gene identification is often derailed by the discovery that a single QTL of large effect is in fact caused by multiple loci of small effect located in the same chromosomal region (Legare et al. 2000; Mott et al. 2000; Cheng et al. 2010; Shao et al. 2010). An AIL is an improvement over these traditional methods because it merges identification and fine-mapping into a single step, which can often discriminate between loci that are due to single versus multiple alleles (Darvasi and Soller 1995). One drawback is that the power to detect QTLs in AILs is lower than in F2 populations. This is because the AILs have greater amounts of recombination than the F2, so more tests are performed and a corresponding higher threshold is needed to control type I errors. The integration of the F2 and F34 AIL populations combines the detection power of the F2 with the precision of the F34 AIL.
After identifying 11 QTLs associated with body weight that accounted for approximately 18% of the phenotypic variance, we then used the obesity gene map (http://www.obesitygenes.org), the mouse genome informatics database (http://www.informatics.jax.org/), and the published literature on body weight phenotypes to identify approximately 60 other body weight QTLs that shared overlapping confidence intervals with our regions of interest (Supplementary Table 1). While it is likely that some of the QTLs identified in our study are the same as those identified by other researchers, in most cases we have mapped QTLs with far greater precision than earlier body weight studies, which have a median QTL interval of 31.6 Mb. Interestingly, our QTLs also overlapped regions in other mouse populations associated with phenotypes such as obesity, bone mineral density, limb length, and organ weights (Supplementary Table 2). This may indicate the influence of a single gene on multiple phenotypic traits (pleiotropy), or may simply reflect the fact that body weight is a composite trait consisting of the weights of muscle and fat compounds as well as those of organs, bones, and body fluids (Brockmann et al. 1998). For example, Cheverud et al. (2010) reported that 20% of the weight of an animal in the LG/J × SM/J population is dissectible fat, and in some cases this proportion can approach 50%. While many of our body weight QTLs overlapped with regions associated with organ weights and fat percentages, only 1 of the 11 body weight QTLs overlapped with the muscle weight QTLs identified by Lionikas et al. (2010) in the same population of mice, suggesting that these two phenotypes may have different underlying genetic influences. Still, a limitation of our study design was that we considered only total body weight rather than the weight of specific components that may be driving the observed variation in body weight.
Using sequence data generously provided by Dr. Cheverud, we were able to examine more closely the candidate genes within the QTL intervals and search for polymorphic SNPs between strains (Supplementary Table 3). Three of the QTLs (BodWt4, BodWt7, and BodWt10) contained genes (Rad23b, Stk33, and Anks1b, respectively) with nonsynonymous coding SNPs that have been associated with adiposity, waist circumference, and BMI in human GWAS (Willer et al. 2009; Croteau-Chonka et al. 2010). A knockout mouse exists for Rad23b that displays disruptions in adipose tissue, endocrine and exocrine glands, growth, size, and metabolism (Ng et al. 2002). The GWAS finding, the existence of a knockout mouse, and the fact that Rad23b is the only gene in the BodWt4 QTL interval with a nonsynonymous coding SNP make it an especially promising candidate gene for follow-up studies. However, it is important to note that the polymorphisms underlying the observed trait variance may act through differences in gene expression rather than through changes in protein-coding sequence. Indeed, one of our QTLs, BodWt6a, did not contain any genes with nonsynonymous coding SNPs. It is possible that this QTL and the others are due to SNPs in promoter or enhancer regions that gave rise to expression QTLs (eQTLs), resulting in differences in gene expression that underlie body weight QTLs. Loci identified by human GWAS are enriched for eQTLs, suggesting that the latter may cause the former (Nicolae et al. 2010). Availability of genome-wide eQTL data in the LG/J and SM/J strains will greatly aid in the identification of specific genes underlying these QTLs.
Our study has several important limitations. First, because we have used a cross between two inbred strains, we are studying the alleles that segregate between them and not the total universe of alleles that segregate among other laboratory or wild mice. We did observe significant overlap in the QTLs we identified in our population with QTLs identified in other populations of mice (Supplementary Table 1), which is consistent with the idea that laboratory mice are segregating a relatively limited number of alleles (Yang et al. 2007). Additionally, we considered only body weight, but not body size or composition, and we examined only one developmental time point and one diet condition. Other studies have provided evidence that different genetic loci may affect body weight at different developmental stages, in different sexes, and on different diets (Cheverud et al. 2010; Lawson et al. 2011). Despite these limitations, the QTLs we identified showed significant overlap with QTLs identified by other researchers at a variety of ages, ranging from 21 to 252 days (Supplementary Table 1), as well as with the results of human GWAS (Willer et al. 2009; Croteau-Chonka et al. 2010). Finally, we did not consider parent-of-origin effects, which are known to be important for body weight in these strains (Cheverud et al. 2010).
In conclusion, we have mapped a large number of body weight QTLs using a novel multiple-QTL mapping procedure and forward selection model in an AIL. This has allowed us to determine which QTLs contribute significantly to variation in body weight given the existence of other QTLs in the model. Some of the QTLs we identified correspond to regions identified by other researchers, yet in the majority of cases, we have narrowed the confidence intervals quite significantly compared to previous studies. In our study we have observed a relatively simple genetic architecture, where a significant fraction of phenotypic variation can be explained by a small number of loci; this is in contrast to efforts to identify similar loci in humans and reflects a strength of our approach. The use of a forward model selection procedure allowed us to identify an additional locus compared to a single-QTL analysis. Furthermore, the combination of high-resolution mapping and sequence data offers a powerful approach and permitted identification of several candidate genes that may underlie differences in body weight. In summary, AILs allow GWAS to be performed in a situation where all alleles are common and where uniform environmental conditions can be maintained, which limits the interactions between genes and environment. These advantages allowed us to map QTLs with a modest sample size and identify small regions that warrant further molecular evaluation.
We thank Dr. James Cheverud and Dr. Heather Lawson for generously providing the F33 mice used to create the F34 and for the sequence data in the LG/J and SM/J mice. This work was supported by a grant from the Schweppe Foundation and by NIH grants MH079103, DA07255, DA024845, and DA021336.
Electronic supplementary material: The online version of this article (doi:10.1007/s00335-011-9349-z) contains supplementary material, which is available to authorized users.
Clarissa C. Parker, Department of Human Genetics, University of Chicago, 920 E 58th St., CLSC-507D, Chicago, IL 60637, USA.
Riyan Cheng, Department of Human Genetics, University of Chicago, 920 E 58th St., CLSC-507D, Chicago, IL 60637, USA.
Greta Sokoloff, Department of Human Genetics, University of Chicago, 920 E 58th St., CLSC-507D, Chicago, IL 60637, USA.
Jackie E. Lim, Departments of Pharmacology and Cancer Biology, Duke University School of Medicine, Durham, NC 27710, USA.
Andrew D. Skol, Department of Medicine, Section for Genetic Medicine, University of Chicago, Chicago, IL 60637, USA.
Mark Abney, Department of Human Genetics, University of Chicago, 920 E 58th St., CLSC-507D, Chicago, IL 60637, USA.
Abraham A. Palmer, Department of Human Genetics, University of Chicago, 920 E 58th St., CLSC-507D, Chicago, IL 60637, USA. Departments of Psychiatry and Behavioral Neuroscience, University of Chicago, Chicago, IL 60637, USA.