The success of the ‘positional cloning’ paradigm, which uses family-based linkage designs to identify genes for Mendelian traits, has yet to be paralleled for complex traits, where solid evidence for genetic variants has been extremely difficult to establish. Likely reasons for these shortcomings are (1) insufficient sample sizes and (2) heterogeneity from population to population with respect to either environmental or genetic factors. In this study, we aim to locate QTL underlying the observed variation in stature and BMI and to overcome these hurdles by (1) maximizing the sample size by combining the primary genome-wide genotype data of four independent family-based studies and (2) dissecting genetic and environmental heterogeneity by also performing sex- and population-specific linkage analyses. Using this approach, we have performed, to our knowledge, the largest family-based genome-wide screen for stature and BMI that is not based on meta-analysis, and we report two loci linked to variation in BMI and six loci linked to variation in stature, most of which add evidence to previously reported loci for these traits.
The genetic background of the continuous variation observed for stature and BMI may be either oligo- or polygenic in nature. There is some supporting evidence for major genes underlying stature and BMI in humans,35
although solid proof is still elusive, perhaps barring some exceptions such as the HMGA2 gene for stature30
as well as a variant upstream of the INSIG2 gene36, 37
and the FTO gene for BMI. These recent GWA results have demonstrated that such QTL explain no more than 1% of the trait variance.30, 38
Mapping QTL of such small effects by linkage analysis is likely to require unrealistic sample sizes, while GWA studies may succeed. Therefore, it is not surprising that in our linkage study we do not detect the HMGA2 region associated with stature or the FTO region associated with BMI.
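To give a sense of scale for the association side of this argument, the following is a minimal sketch of a standard normal-approximation sample-size calculation for a single-variant, 1-df association test in unrelated individuals. It is not part of the analyses in this study. The function name `association_sample_size`, the significance threshold of 5e-8 and the 80% power target are illustrative assumptions; only the 1% variance figure comes from the text above. The corresponding linkage calculation depends on family structure and is not attempted here.

```python
from math import ceil
from statistics import NormalDist


def association_sample_size(variance_explained, alpha=5e-8, power=0.80):
    """Approximate number of unrelated individuals needed to detect a QTL.

    variance_explained: fraction q of trait variance due to the QTL (0 < q < 1).
    alpha: two-sided significance threshold (5e-8 is an assumed
           genome-wide threshold, not a value taken from the study).
    power: desired probability of detection (0.80 is an assumption).
    """
    if not 0.0 < variance_explained < 1.0:
        raise ValueError("variance_explained must lie strictly between 0 and 1")

    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1.0 - alpha / 2.0)
    z_power = std_normal.inv_cdf(power)

    # Non-centrality parameter required for the requested alpha and power
    # (normal approximation to the 1-df chi-square test).
    ncp_needed = (z_alpha + z_power) ** 2

    # For a regression-based test, NCP is approximately N * q / (1 - q),
    # so solve for N and round up.
    q = variance_explained
    return ceil(ncp_needed * (1.0 - q) / q)


if __name__ == "__main__":
    # A QTL explaining 1% of the trait variance, as quoted in the text.
    print(association_sample_size(0.01))
```

Under these assumptions, a QTL explaining 1% of the trait variance requires roughly 4,000 unrelated individuals, and the requirement grows in inverse proportion to the variance explained.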
The existence of major genes is critical for successful genetic mapping, at least in outbred populations.39
We believe that our results support Fisher's infinitesimal model, in which the genetic background of stature and BMI in the human population is controlled by a large number of genes, each having a minute effect on the phenotype. In our opinion, this would at least to some extent explain the relatively modest statistical evidence for linkage observed in this study as well as the lack of consistent findings in other studies (see http://www.genomeutwin.org/stature_gene_map.htm
for an overview). Another factor that may have resulted in false-negative findings in this study and in previous comparable studies is that many traditional linkage-based genome-wide screens contain a relatively low proportion of inheritance information due to relatively sparse genetic maps (>5 cM) and missing founder genotypes. Regenotyping these family samples with a high-density single-nucleotide polymorphism map to increase inheritance information has been shown to be a successful strategy in simulation40
and empirical studies.41
Family-based linkage studies are based on examining patterns of allele sharing within individual families and then summing up these results across the study sample. Therefore, they are less sensitive to allelic heterogeneity than association studies and might identify genes for a trait where differing variants contribute to the trait variance in different populations. In such circumstances even a dense GWA study might miss the signal, especially if the direction of effect differs between allelic variants. Thus, although there are obvious technological advantages in GWA studies, there is still room for traditional linkage studies4
in the identification of biologically important genes for traits such as stature and BMI, where allelic heterogeneity across populations is quite probable. Another important issue in study design arises from the fact that family-based linkage mapping and association mapping in unrelated individuals are optimal under very different genetic models, and therefore it is unwise to invest solely in one or the other, as we do not know a priori the genetic architecture of the trait we are interested in. Simplistically speaking, linkage studies are geared toward relatively rare alleles with large effects within families (that may be of little effect in the population), whereas association studies are designed to detect common variants that have smaller effects. In the case of rare monogenic disease, multiple rare variants at linked loci (allelic heterogeneity) seem to be the rule, not the exception.42
For common polygenic disease and quantitative traits this question is still unanswered – there are examples for both common43
and rare alleles,44
and theoretical and empirical studies suggest a role for both rare and common variants.45
One must also remember that GWA can be performed in family samples, although utilizing unrelated samples is more straightforward and powerful. However, due to the other beneficial qualities inherent to family samples, investigators already in possession of family samples should use them for GWA studies, as the loss of statistical power is relatively small.46
Considering these and other examples, it is clear that genome-wide linkage mapping in families and GWA mapping in population cohorts should be considered complementary, not alternative, strategies in mapping polygenic traits.
Body mass index is a derived variable that depends on both height and weight and is used mainly as a means of classifying people as underweight, normal or obese. Although BMI does not provide specific information on physiological intermediate phenotypes, such as basal metabolic rate, its widespread clinical use, ease of measurement, high reproducibility and significant heritability across populations make it an important phenotype for genetic analyses. Interest in this phenotype is exemplified by the Obesity Gene Map (http://obesitygene.pbrc.edu
), which is currently reporting 169 linkages and 183 associations to BMI from various studies, across all human chromosomes. Although there are probably several true signals among those reported, it is plausible to assume that not all findings are due to the polygenic background of BMI, and that many instead reflect the analysis of a variable that signals different biological backgrounds in different ascertainment schemes, populations and cohorts. The modest LOD scores we observed for BMI despite the large study sample likely reflect the heterogeneity of our study populations, and may suggest that there are relatively few common loci with strong effects on BMI across these populations. From the results of our analysis, we conclude that there is marked locus heterogeneity between males and females as well as between the African-American and European-American cohorts. This is evident from the fact that some loci are linked only in sex-specific (BMI 7q35 and stature loci on 12q and 18q) and/or population-specific (BMI loci on 7q and 11q and stature loci on 11q and 15q25) analyses. The relatively large sample size of even the subgroups reduces, but cannot eliminate, the probability that the observed linkages in these stratified analyses are spurious.
For some loci, combining a large number of families resulted in increased statistical significance; for example, the African-American and European-American families provide consistent evidence for linkage at the stature loci on 18q and 19q13.
It is well established that, in genome-wide screening for trait loci, combining the primary data from independent genome-wide screens is superior to meta-analytic approaches in terms of power to detect loci and in reducing sources of variation.47
However, combining data across multiple cohorts may also increase genetic and environmental heterogeneity and thus hamper successful locus identification.48
Our results show the benefit of both (1) combining data to maximize the sample size and (2) minimizing heterogeneity by analyzing subgroups in which within-group variation can be reduced. Our results, however, suggest that the latter may be the more fruitful approach in genetic mapping. This approach is analogous to utilizing special populations such as population isolates to reduce genetic and environmental heterogeneity in gene mapping efforts.