Obesity has become a major worldwide challenge to public health, due to the Western ‘obesogenic’ environment interacting with a strong genetic contribution [1]. Recent extensive genome-wide association studies (GWAS) have identified numerous single nucleotide polymorphisms (SNPs) associated with obesity, but these loci together account for only a small fraction of the known heritable component [1]. Thus, the “common disease, common variant” paradigm is increasingly under challenge [2]. We report a highly penetrant form of obesity, initially observed in 31 subjects who were heterozygous for deletions of at least 593kb at 16p11.2 and whose ascertainment included cognitive deficits. Nineteen similar deletions were identified from GWAS data in 16,053 individuals from 8 European cohorts. Such deletions were absent from healthy non-obese controls and accounted for 0.7% of our morbid obesity cases (body mass index, BMI ≥ 40 kg·m⁻² or BMI standard deviation score ≥ 4; p = 6.4×10⁻⁸, OR = 43.0), demonstrating the potential importance in common disease of rare variants with strong effects. This highlights a promising strategy for identifying missing heritability in obesity and other complex traits: cohorts with extreme phenotypes are likely to be enriched for rare variants, thereby improving power for their discovery. Subsequent analysis of the loci so identified may well reveal additional rare variants that further contribute to the missing heritability, as recently reported for SIM1 [3]. Thus, the most productive approach may be to combine the “power of the extreme” [4] in small, well-phenotyped cohorts, with targeted follow-up in GWAS and population cohorts.
The extent to which copy number variants (CNVs) might contribute to the missing heritability of common disorders is currently much debated [2]. Since the majority of common simple CNVs are well-tagged by SNPs, it has recently been suggested that common CNVs are unlikely to contribute substantially to the missing heritability [5]. However, rare variants or recurring CNVs that have arisen on multiple independent occasions are unlikely to be captured by SNP tagging, and their identification will require alternative approaches.
We have previously hypothesised that cohorts with extreme phenotypes that include obesity may be enriched for rare but very potent risk variants [4,6]. Here we have investigated 312 subjects, from three centres in the UK and France, presenting with congenital malformations and/or developmental delay in addition to obesity as previously defined [6,7] (see Methods). Known syndromes (e.g. Prader-Willi, fragile X) were excluded. A combination of array comparative genomic hybridisation (aCGH), genotyping arrays, quantitative PCR (qPCR) and multiplex ligation-dependent probe amplification (MLPA) was used to identify and confirm the presence of a heterozygous deletion on 16p11.2 in 9 individuals (2.9%). Such deletions, estimated to be a total of 740kb in size (one copy of a segmental duplication plus 593kb of unique sequences, Figure 1a), have previously been associated to varying extents with autism, schizophrenia and developmental delay [8-11]; however, the observed frequency of deletions in our cohort is appreciably higher than the frequencies reported in the previous studies (<1%), which did not include obesity as an inclusion criterion.
A parallel, independent survey of aCGH and SNP-CGH data from 8 cytogenetic centres in France, Switzerland and Estonia, covering 3,947 patients with developmental delay and/or malformations but this time without selection for obesity, revealed 22 unrelated cases with similar deletions (0.6%). This frequency is consistent with the previous studies [8-11], but is significantly lower than in the above cohort, which included only obese subjects (p = 2.2×10⁻⁴, Fisher’s exact test).
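The comparison of carrier frequencies between the two cohorts (9 of 312 versus 22 of 3,947) can be illustrated with a small, self-contained sketch of a two-sided Fisher's exact test. This is not the authors' analysis code; the function name `fisher_exact_two_sided` and the convention of summing all tables no more probable than the observed one are assumptions of this sketch.

```python
from fractions import Fraction
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x with all margins fixed.
        return Fraction(comb(row1, x) * comb(row2, col1 - x), total_tables)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)   # smallest feasible top-left count
    highest = min(row1, col1)      # largest feasible top-left count
    p_value = sum(
        (p for p in map(prob, range(lowest, highest + 1)) if p <= p_observed),
        Fraction(0),
    )
    return float(p_value)


# Rows: cohort selected for obesity, cohort not selected for obesity.
# Columns: deletion carriers, non-carriers (counts taken from the text).
p = fisher_exact_two_sided(9, 312 - 9, 22, 3947 - 22)
print(f"two-sided p = {p:.2g}")
```

Exact rational arithmetic is used so that the comparison against the observed table's probability is not affected by floating-point rounding. The value printed may differ slightly from the reported p = 2.2×10⁻⁴ if a different two-sided convention was used in the original analysis.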
Analysis of the available clinical data for these 22 new carriers indicated that, in addition to the ascertained cognitive deficits or behavioural abnormalities (including hyperphagia, specifically identified in at least 9 cases; see Supplementary Table S1), a 16p11.2 deletion gave rise to a strongly expressed obesity phenotype in adults, with a more variable phenotype in childhood. All 4 teenagers and adults carrying a deletion were obese, while child carriers were also frequently either obese (4/15) or overweight (2/15), a tendency that has previously been noted [11]; the very young (under 2 years) were of normal weight. This age-dependent penetrance was observed for all instances of deletions where phenotypic data were available, whether from this study or from previously published reports [10-15], and regardless of ascertainment (Figure 2; see Supplementary Tables S2 and S3).
Taken together, the data from these parallel studies suggest a possible direct association of deletions at 16p11.2 with obesity, distinct from their cognitive phenotype. Also identified in these cohorts were instances of the reciprocal duplication, which has also been implicated in neurodevelopmental disorders, but with a variable phenotype and lower penetrance [9,10,12]. The frequency of the duplication in the two cohorts (12/4183, 0.3%) was consistent with previous reports for patients with cognitive deficits (0.3–0.7%) [10,12]. Carriers of the duplication were not obese and had no reported hyperphagia.
To further investigate the association of 16p11.2 deletions with obesity, and to estimate the extent to which it is observed independently of ascertainment for neurodevelopmental symptoms, we carried out algorithmic and statistical analyses of genome-wide SNP genotyping data (see Table 1) from Swiss (CoLaus [16]), Finnish (NFBC66 [17]) and Estonian (EGPUT [18]) general population cohorts (11,856 subjects in total), from child obesity and adult morbid obesity case-control cohorts [6,19,20] (1,224 and 1,548 subjects respectively), from an extreme early-onset obesity cohort (SCOOP, 931 subjects) and from 141 patients undergoing bariatric weight-loss surgery (see online Methods). In total, we identified 17 instances of deletions (and 4 duplications) with no significant gender bias (Table 1). In addition, we identified 2 further unrelated carriers of a deletion from amongst 353 members of 149 families with sibling pairs discordant for obesity (SOS Sib Pair Study [21]). Where DNA was available for further analysis (15/19 samples), the presence of a deletion was validated using MLPA (Figure 1b) or qPCR; the remaining deletions were validated by applying a second independent algorithm to the data. With the exception of a single individual who is apparently diabetic (fasting glucose > 7 mmol/L), all adult carriers of such deletions were obese, the majority being morbidly obese; similarly, each of the 7 child/adolescent carriers had a BMI in the top 0.1% of the population range for their age and gender. None of the individuals ascertained on the basis of their obesity had any reported developmental delay or cognitive deficit; four subjects were reported as having hyperphagia.
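As a quick arithmetic check, the cohort sizes and deletion counts given in this paragraph can be tallied against the 19 deletions in 16053 individuals quoted in the opening summary. The dictionary labels below are shorthand chosen for this sketch, not names used in the study.

```python
# Cohort sizes (number of subjects) as stated in the text.
cohort_sizes = {
    "general population (CoLaus, NFBC66, EGPUT)": 11_856,
    "child obesity case-control": 1_224,
    "adult morbid obesity case-control": 1_548,
    "extreme early-onset obesity (SCOOP)": 931,
    "bariatric surgery patients": 141,
    "discordant sib pair families (SOS)": 353,
}

# Deletions found in the first five groups, plus those in the sib pair families.
deletions_in_cohorts = 17
deletions_in_sib_pairs = 2

total_subjects = sum(cohort_sizes.values())
total_deletions = deletions_in_cohorts + deletions_in_sib_pairs

print(total_subjects, total_deletions)  # 16053 19
```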
To obtain sufficient statistical power for robust conclusions, we combined data from the population and obesity cohorts in an overall case-control association analysis (the samples from sib pair families were excluded to avoid complications due to their relatedness). In comparison with lean/normal weight subjects (see Table 1 and Methods), 16p11.2 deletions were associated with obesity (p = 5.7×10⁻⁷, Fisher’s exact test; odds ratio = 29.8, 95% confidence limits = 4.0, 225) and morbid obesity (p = 6.4×10⁻⁸; OR = 43.0, 95% confidence limits = 5.6, 329) at or near genome-wide levels of significance. Expanding the control group to include all non-obese individuals increased the significance to p = 4.1×10⁻⁹ (obese) and p = 6.1×10⁻¹⁰ (morbidly obese).
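For readers unfamiliar with how an odds ratio and its confidence limits are obtained from a case-control table, the following is a minimal sketch using the Woolf (log-scale) interval. The text does not state which interval method was used, so that choice is an assumption here, as is the function name `odds_ratio_ci`. The counts in the example are hypothetical placeholders and are not the values from Table 1.

```python
from math import exp, log, sqrt


def odds_ratio_ci(case_carriers, case_noncarriers,
                  control_carriers, control_noncarriers, z=1.96):
    """Odds ratio with a Woolf (log-scale) confidence interval.

    z = 1.96 gives an approximate 95% interval. If any cell is zero,
    0.5 is added to every cell (Haldane-Anscombe correction) so that
    the ratio and its standard error remain finite.
    """
    a, b, c, d = (float(x) for x in (case_carriers, case_noncarriers,
                                     control_carriers, control_noncarriers))
    if min(a, b, c, d) == 0:
        a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5
    odds_ratio = (a * d) / (b * c)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# HYPOTHETICAL counts for illustration only (NOT the study's Table 1):
# 10 carriers among 1,000 cases, 1 carrier among 3,000 controls.
or_, lo, hi = odds_ratio_ci(10, 990, 1, 2999)
print(f"OR = {or_:.1f}, 95% CI = [{lo:.1f}, {hi:.1f}]")
```

When one cell holds only a handful of carriers, the standard error of the log odds ratio is dominated by that cell, which is why intervals for rare variants tend to be very wide even when the p value is small.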
Previous reports have indicated that these deletions are frequently not inherited from either parent but arise de novo, possibly by non-allelic homologous recombination between the >99% sequence identical segmental duplications flanking the deleted region [11,14]. Therefore, where possible we investigated the parents of carriers of deletions, identifying 11 cases of maternal transmission and 4 of paternal transmission. The available data showed that all first-degree relatives carrying a deletion were also obese (Supplementary Table S1). In 10 instances the deletion was apparently de novo (see Figure 1b). Extrapolation to our full dataset indicates that ~0.4% of all morbidly obese cases are due to an inherited 16p11.2 deletion. The frequency of de novo events is consistent with the previous report where ascertainment was for developmental delay and/or congenital anomalies [11]; by contrast, deletions are reported to be almost exclusively de novo in autistic subjects [8-10].
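The text does not spell out how the ~0.4% figure was extrapolated. One plausible reading, offered here only as an assumption, is that the fraction of informative carriers with a transmitted deletion is applied to the 0.7% carrier frequency among morbid obesity cases given in the opening summary. The variable names are illustrative.

```python
# Transmission counts from the text.
maternal, paternal, de_novo = 11, 4, 10

# Carrier frequency among morbid obesity cases (0.7%), from the summary.
carrier_freq_morbid_obesity = 0.007

# ASSUMPTION: the extrapolation scales the carrier frequency by the
# inherited fraction observed among carriers with parental data.
inherited_fraction = (maternal + paternal) / (maternal + paternal + de_novo)
inherited_share = inherited_fraction * carrier_freq_morbid_obesity

print(f"{inherited_fraction:.0%} inherited; "
      f"{inherited_share:.2%} of morbid obesity cases")
# 60% inherited; 0.42% of morbid obesity cases
```

Under this reading, 15 of 25 deletions are inherited, and 0.42% rounds to the ~0.4% quoted above.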
Although they may be heterogeneous in nature, these deletions are highly likely to be the causal variants, representing the second most frequent genetic cause of obesity after point mutations in MC4R [22,23]. Their repeated de novo occurrence is likely to result in lack of linkage disequilibrium with any other flanking variant – no consistent haplotype has been identified by analysis of the available surrounding genotypes. To assess the effect of a deletion on the expression of nearby genes (e.g. the obesity GWAS-associated SH2B1 locus 800kb away [24]), we analysed available transcript data for subcutaneous adipose tissue samples from the discordant sibling cohort. Comparisons of the 2 subjects carrying a deletion with their corresponding non-obese siblings, and with other obese and non-obese subjects (Supplementary Figure S4 and Supplementary Tables S4 and S5), showed that many though not all transcripts from within the deletion had markedly reduced abundance (0.4–0.7 fold). In contrast, no clear evidence was found for consistent cis effects of the deletion on the abundance of mRNAs encoded by genes flanking the deletion. In addition, global analysis of this dataset has not identified any trans expression quantitative trait loci either within or near the deletion.
Thus, while we cannot completely exclude the possibility that a 16p11.2 deletion affects the expression of nearby genes (for instance, its impact may be different in other tissues), the above expression analysis strongly indicates that the observed phenotypes are likely to be due to haploinsufficiency of one or more of the ~30 genes within the deleted region. Indeed, rather than being due to a single haploinsufficiency, the phenotype may well result from the deletion of multiple genes that impact on pathways central to the development of obesity (see Supplementary Table S5). Functional network analysis of the deleted genes has led to the suggestion of a similar multi-gene effect for the cognitive phenotype [8]. The extent to which there is overlap between the genes involved in the obesity and cognitive phenotypes remains to be elucidated.
There is a strong correlation between developmental and cognitive disabilities and the prevalence of obesity: patients with autism or learning disabilities have a greatly increased risk of obesity [25], and the severely obese exhibit significant cognitive impairment [26]. Possible explanations include a direct causal relationship between obesity and developmental delay; the involvement of the same or related regulatory pathways; or different outcomes of the same set of behavioural disorders with complex pleiotropic effects and variable ages of onset and expressivities. The higher frequency of 16p11.2 deletions in the cohort ascertained for both phenotypes (2.9%), compared to cohorts ascertained for either phenotype alone (0.4%, 0.6% respectively), confirms their impact on both obesity and developmental delay, adding to the evidence that these two phenotypes may be fundamentally interrelated.