Although genome-wide association studies (GWAS) have identified thousands of novel associations between loci and hundreds of phenotypes, concerns have been raised that these loci appear to explain a relatively small proportion of the estimated heritability, the fraction of phenotypic variation in a population that is due to genetic variation. This has led to considerable speculation about the genetic basis of complex human phenotypes and the “missing heritability”, i.e. the fraction of heritability not accounted for by the associations discovered to date. Proposed explanations for missing heritability include many presently unidentified common variants with small effect sizes, rare variants not captured by current genotyping platforms, structural variants, epistatic interactions, gene-environment interactions, parent-of-origin effects, and inflated heritability estimates. Studies that examine the sources of missing heritability can help researchers to evaluate the prospects of future studies focusing on common versus rare variation and thereby devise effective strategies to discover the remaining sequence variants that affect disease risk and other aspects of phenotypic variation in humans.
The narrow-sense heritability of a phenotype ($h^2$) is the fraction of phenotypic variance that can be described by an additive model over the set of SNPs that are functionally related to the phenotype (i.e. the causal SNPs). It is commonly estimated by comparing the phenotypic correlation of monozygotic (MZ) twins with that of dizygotic (DZ) twins. The difference between $h^2$ and the fraction of phenotypic variance accounted for by variants discovered by means of GWAS ($h^2_{GWAS}$) is the so-called missing heritability. Recently, Yang et al. developed a method to estimate the variance explained by all SNPs on a genotyping platform, including those that are not genome-wide significant ($h^2_g$), representing the limit of $h^2_{GWAS}$ for infinite sample size.
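As an illustration of the twin comparison and of the definition of missing heritability, the sketch below uses Falconer's formula, $h^2 = 2(r_{MZ} - r_{DZ})$, which is one common way to turn MZ and DZ phenotypic correlations into a heritability estimate. Choosing that formula is an assumption made here for illustration; the text does not say which twin-based estimator the cited studies used. The function names and all numeric values are hypothetical.

```python
def falconer_h2(r_mz: float, r_dz: float) -> float:
    """Twin-based heritability via Falconer's formula: 2 * (r_MZ - r_DZ).

    Assumes purely additive genetic effects and an equally shared
    environment for MZ and DZ pairs; dominance, epistasis, or extra
    shared environment among MZ twins would inflate the result.
    """
    return 2.0 * (r_mz - r_dz)


def missing_heritability(h2: float, h2_gwas: float) -> float:
    """Heritability not accounted for by GWAS-discovered variants."""
    return h2 - h2_gwas


# Hypothetical values for illustration only (not results from the study).
r_mz, r_dz = 0.80, 0.45   # phenotypic correlations in MZ and DZ twin pairs
h2_gwas = 0.10            # variance explained by genome-wide significant SNPs

h2_twin = falconer_h2(r_mz, r_dz)
print(f"twin-based h2:        {h2_twin:.2f}")
print(f"missing heritability: {missing_heritability(h2_twin, h2_gwas):.2f}")
```

Because the formula attributes the whole MZ/DZ difference to additive genetic effects, any non-additive or environmental contribution to that difference is counted as heritability.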
There are two major challenges in comparing $h^2$ and $h^2_g$ to quantify missing heritability. First, there is the potential for inflation of $h^2$ estimates based on closely related individuals such as MZ/DZ twins. It is well known that epistatic interactions can inflate heritability estimates in studies of related individuals. Recent work from Zuk et al. has examined this in detail. Other factors that could also lead to inflated estimates of $h^2$ using closely related pairs of individuals include dominance and shared environment. Second, there is a tradeoff between inflation and sampling variance when estimating $h^2_g$. The recent variance component approach described by Yang et al. results in inflated estimates of $h^2_g$ in the presence of related individuals. However, removing related individuals reduces the sample size, resulting in a larger standard error around the estimate. Both of these issues can adversely affect estimates of missing heritability.
Here, we analyze the heritability of 23 complex phenotypes in an Icelandic cohort of 38,167 individuals, leveraging both a population-wide genealogical database and genotype data from over 300,000 SNPs that have been long-range phased across and between chromosomes (i.e. where not only the phase, but also the parental origin of alleles has been determined). Importantly, we develop an approach that allows $h^2$ to be estimated on the basis of both closely and distantly related pairs of individuals. We find, for all of the quantitative phenotypes, that our estimates of $h^2$ are smaller than those from the literature that were based on MZ/DZ twins. Our results indicate that previous estimates were inflated by the impact of epistasis or shared environment.
We further introduce a new variance components method that provides simultaneous estimates of $h^2$ and $h^2_g$. This method has two principal advantages. First, by adequately taking account of both closely and distantly related pairs of individuals, it minimizes the standard error of the estimates, whilst avoiding the upward bias that can result from calculations based on closely related pairs. Second, it produces both estimates of heritability for the same population sample, ensuring that $h^2$ and $h^2_g$ are directly comparable.
For most of the 23 phenotypes examined here, our results show that $h^2_g$ accounts for more than half of $h^2$. As GWAS have not identified many SNPs with large effect sizes (i.e. $h^2_{GWAS}$ is small), and $h^2_g$ is greater than $h^2_{GWAS}$ by a considerable margin, it follows that there must be many associated sequence variants that remain to be discovered, i.e. these phenotypes are highly polygenic. Currently, only common variants are well captured by the genotyping arrays used in most GWAS. As the difference between $h^2$ and $h^2_g$ is likely due to common and rare variants not captured by the genotyping array, it may be assumed that a fair number of association signals remain to be identified through more comprehensive approaches, such as whole-genome sequencing. However, our estimates of $h^2$ show that GWAS genotyping arrays capture a greater proportion of $h^2$ than indicated by previous twin-based estimates of $h^2$.
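The three quantities discussed above can be read as a partition of narrow-sense heritability into variance explained by genome-wide significant SNPs, variance tagged by the array but not yet attributed to significant SNPs, and variance not captured by the array. The sketch below works through that arithmetic. The function name, dictionary keys, and input values are hypothetical and are not results from this study.

```python
def partition_heritability(h2: float, h2_g: float, h2_gwas: float) -> dict:
    """Split h2 into three non-overlapping components.

    Assumes the ordering 0 <= h2_gwas <= h2_g <= h2 <= 1. Real estimates
    carry sampling error and need not respect this ordering exactly.
    """
    if not (0.0 <= h2_gwas <= h2_g <= h2 <= 1.0):
        raise ValueError("expected 0 <= h2_gwas <= h2_g <= h2 <= 1")
    return {
        # explained by genome-wide significant SNPs
        "gwas_significant": h2_gwas,
        # tagged by array SNPs but not yet genome-wide significant
        "on_array_undiscovered": h2_g - h2_gwas,
        # due to variants the genotyping array does not capture
        "not_captured_by_array": h2 - h2_g,
    }


# Hypothetical values for illustration only.
parts = partition_heritability(h2=0.50, h2_g=0.30, h2_gwas=0.05)
for name, value in parts.items():
    # Report each component and its share of total narrow-sense heritability.
    print(f"{name}: {value:.2f} ({value / 0.50:.0%} of h2)")
```

With these made-up inputs, the array captures 60% of $h^2$ in total, most of it in SNPs that have not reached genome-wide significance, which is the pattern the paragraph above describes as high polygenicity.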