We present a 19-year-old male with a growth disorder that remained undefined despite extensive evaluation. Whole exome sequencing demonstrated a novel homozygous frameshift mutation in CUL7, one of the causative genes of 3M syndrome. We discuss the utility of exome sequencing in diagnosing rare disorders.
3M Syndrome; Dwarfism; Genomics
Previous meta-analysis of genome-wide association (GWA) studies has identified 180 loci that influence adult height. However, each GWA locus typically comprises a set of contiguous genes, only one of which presumably modulates height. We reasoned that many of the causative genes within these loci influence height because they are expressed in and function in the growth plate, a cartilaginous structure that causes bone elongation and thus determines stature. Therefore, we used expression microarray studies of mouse and rat growth plate, human disease databases and a mouse knockout phenotype database to identify genes within the GWAS loci that are likely required for normal growth plate function. Each of these approaches identified significantly more genes within the GWA height loci than at random genomic locations (P < 0.0001 each), supporting the validity of the approach. The combined analysis strongly implicates 78 genes in growth plate function, including multiple genes that participate in PTHrP-IHH, BMP and CNP signaling, and many genes that have not previously been implicated in the growth plate. Thus, this analysis reveals a large number of novel genes that regulate human growth plate chondrogenesis and thereby contribute to the normal variations in human adult height. The analytic approach developed for this study may be applied to GWA studies for other common polygenic traits and diseases, thus providing a new general strategy to identify causative genes within GWA loci and to translate genetic associations into mechanistic biological insights.
Before the advent of genome-wide association studies (GWASs), hundreds of candidate genes for obesity-susceptibility had been identified through a variety of approaches. We examined whether those obesity candidate genes are enriched for associations with body mass index (BMI) compared with non-candidate genes by using data from a large-scale GWAS. A thorough literature search identified 547 candidate genes for obesity-susceptibility based on evidence from animal studies, Mendelian syndromes, linkage studies, genetic association studies and expression studies. Genomic regions were defined to include the genes ±10 kb of flanking sequence around candidate and non-candidate genes. We used summary statistics publicly available from the discovery stage of the genome-wide meta-analysis for BMI performed by the genetic investigation of anthropometric traits consortium in 123 564 individuals. Hypergeometric, rank tail-strength and gene-set enrichment analysis tests were used to test for the enrichment of association in candidate compared with non-candidate genes. The hypergeometric test of enrichment was not significant at the 5% P-value quantile (P = 0.35), but was nominally significant at the 25% quantile (P = 0.015). The rank tail-strength and gene-set enrichment tests were nominally significant for the full set of genes and borderline significant for the subset without SNPs at P < 10^−7. Taken together, the observed evidence for enrichment suggests that the candidate gene approach retains some value. However, the degree of enrichment is small despite the extensive number of candidate genes and the large sample size. Studies that focus on candidate genes have only slightly increased chances of detecting associations, and are likely to miss many true effects in non-candidate genes, at least for obesity-related traits.
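The hypergeometric enrichment test mentioned above can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the size of the gene universe, the number of genes in the top P-value quantile, and the observed overlap are hypothetical placeholders. Only the count of 547 candidate genes comes from the abstract.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """Return P(X >= k) for X ~ Hypergeometric(N, K, n).

    N: total genes, K: candidate genes, n: genes in the top quantile,
    k: candidate genes observed in the top quantile.
    """
    upper = min(K, n)
    # Exact integer arithmetic; math.comb returns 0 for impossible draws.
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


if __name__ == "__main__":
    N_GENES = 17000      # hypothetical size of the gene universe
    N_CANDIDATES = 547   # candidate genes (from the abstract)
    N_TOP = 850          # hypothetical: genes in the top 5% P-value quantile
    OBSERVED = 30        # hypothetical: candidates found in that quantile
    p = hypergeom_sf(OBSERVED, N_GENES, N_CANDIDATES, N_TOP)
    print(f"one-sided enrichment P = {p:.3g}")
```

A one-sided upper tail is used here because the question is whether candidates are over-represented among the most strongly associated genes.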
Melanocortin receptor accessory proteins (MRAPs) modulate signaling of melanocortin receptors in vitro. To investigate the physiological role of brain-expressed Melanocortin 2 Receptor Accessory Protein 2 (MRAP2), we characterized mice with whole body and brain-specific targeted deletion of Mrap2, both of which develop severe obesity at a young age. Mrap2 interacts directly with Melanocortin 4 Receptor (Mc4r), a protein previously implicated in mammalian obesity, and it enhances Mc4r-mediated generation of the second messenger cyclic AMP, suggesting that alterations in Mc4r signaling may be one mechanism underlying the association between Mrap2 disruption and obesity. In a study of humans with severe, early-onset obesity, we found four rare, potentially pathogenic genetic variants in MRAP2, suggesting that the gene may also contribute to body weight regulation in humans.
To investigate LIN28B gene variants in children with idiopathic central precocious puberty (CPP).
Patients and Methods
We studied 178 Brazilian children with CPP (171 girls, 16.8% familial cases). A large multiethnic group (1599 subjects; MEC cohort) was used as control. DNA analysis and biochemical in vitro studies were performed.
A heterozygous LIN28B variant, p.H199R, was identified in a girl who developed CPP at 5.2 years. This variant was absent in 310 Brazilian control individuals, but it was found at the same allele frequency in women from the MEC cohort, independently of the age of menarche. Functional studies revealed that when ectopically expressed in cells the mutant protein was capable of binding pre-let-7 miRNA and inhibiting let-7 expression to the same extent as wild-type Lin28B protein. Other rare LIN28B variants (p.P173P, c.198+32_33delCT, g.9575731A>C and c.-11C>T) were identified in CPP patients and controls. Therefore, no functional mutation was identified.
In vitro studies revealed that the rare LIN28B p.H199R variant identified in a girl with CPP does not affect the Lin28B function in the regulation of let-7 expression. Although LIN28B SNPs were associated with normal pubertal timing, rare variations in this gene do not seem to be commonly involved in the molecular pathogenesis of CPP.
LIN28B; central precocious puberty; let-7; microRNA; early and late menarche
We used resequencing and genotyping in African Americans with sickle cell anemia (SCA) to characterize associations with fetal hemoglobin (HbF) levels at the BCL11A, HBS1L-MYB and β-globin loci. Fine-mapping of HbF association signals at these loci confirmed seven SNPs with independent effects and increased the explained heritable variation in HbF levels from 38.6% to 49.5%. We also identified rare missense variants that causally implicate MYB in HbF production.
Summary: Meta-analysis across genome-wide association studies is a common approach for discovering genetic associations. However, in some meta-analysis efforts, individual-level data cannot be broadly shared by study investigators due to privacy and Institutional Review Board concerns. In such cases, researchers cannot confirm that each study represents a unique group of people, leading to potentially inflated test statistics and false positives. To resolve this problem, we created a software tool, Gencrypt, which utilizes a security protocol known as one-way cryptographic hashes to allow overlapping participants to be identified without sharing individual-level data.
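The idea of detecting overlapping participants through one-way hashes can be illustrated with a short sketch. It is not Gencrypt's actual implementation: the choice of SHA-256, the shared salt, the 0/1/2 genotype coding, and the function names are all assumptions made for illustration, and exact-match hashing as shown here would not tolerate genotyping errors or missing calls.

```python
import hashlib


def participant_digest(genotypes, salt):
    """Hash one participant's genotypes at an agreed SNP panel (hypothetical scheme)."""
    payload = salt + "|" + ",".join(str(g) for g in genotypes)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def shared_participants(digests_a, digests_b):
    """Return digests present in both studies; raw genotypes are never exchanged."""
    return set(digests_a) & set(digests_b)


if __name__ == "__main__":
    salt = "agreed-panel-v1"  # hypothetical value shared by both studies
    # Hypothetical genotypes coded as 0/1/2 copies of the reference allele.
    study_a = [[0, 1, 2, 1, 0, 2], [2, 2, 1, 0, 0, 1]]
    study_b = [[2, 2, 1, 0, 0, 1], [1, 1, 1, 2, 0, 0]]
    digests_a = [participant_digest(g, salt) for g in study_a]
    digests_b = [participant_digest(g, salt) for g in study_b]
    print(len(shared_participants(digests_a, digests_b)), "overlapping participant(s)")
```

Each study computes digests locally and shares only the digest lists, so a match reveals that two records are identical without disclosing the genotypes themselves.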
Availability: Gencrypt is freely available under the GNU general public license v3 at http://www.broadinstitute.org/software/gencrypt/
Supplementary data are available at Bioinformatics online.
Strong signatures of positive selection at newly arising genetic variants are well-documented in humans [1–8], but this form of selection may not be widespread in recent human evolution [9]. Because many human traits are highly polygenic and partly determined by common, ancient genetic variation, an alternative model for rapid genetic adaptation has been proposed: weak selection acting on many pre-existing (standing) genetic variants, or polygenic adaptation [10–12]. By studying height, a classic polygenic trait, we demonstrate the first human signature of widespread selection on standing variation. We show that frequencies of alleles associated with increased height, both at known loci and genome-wide, are systematically elevated in Northern Europeans compared with Southern Europeans (p<4.3×10^−4). This pattern mirrors intra-European height differences and is not confounded by ancestry or other ascertainment biases. The systematic frequency differences are consistent with the presence of widespread weak selection (selection coefficients ~10^−3 to 10^−5 per allele) rather than genetic drift alone (p<10^−15).
Human Genomics; Population Genetics; Europeans; Height; Selection
Single nucleotide polymorphisms (SNPs) near 7 loci have been associated with liver function tests or with liver steatosis by magnetic resonance spectroscopy. In this study we aim to test whether these SNPs influence the risk of histologically-confirmed nonalcoholic fatty liver disease (NAFLD). We tested the association of histologic NAFLD with SNPs at 7 loci in 592 cases of European ancestry from the Nonalcoholic Steatohepatitis Clinical Research Network and 1405 ancestry-matched controls. The G allele of rs738409 in PNPLA3 was associated with increased odds of histologic NAFLD (odds ratio [OR] = 3.26, 95% confidence intervals [CI] = 2.11-7.21; P = 3.6 × 10^−43). In a case-only analysis, the G allele of rs738409 in PNPLA3 was associated with a decreased risk of zone 3 centered steatosis (OR = 0.46, 95% CI = 0.36-0.58; P = 5.15 × 10^−11). We did not observe any association of this variant with body mass index, triglyceride levels, high- and low-density lipoprotein levels, or diabetes (P > 0.05). None of the variants at the other 6 loci were associated with NAFLD.
Genetic variation at PNPLA3 confers a markedly increased risk of increasingly severe histological features of NAFLD, without a strong effect on metabolic syndrome component traits.
Puberty is an important developmental stage during which reproductive capacity is attained. The timing of puberty varies greatly among healthy individuals in the general population and is influenced by both genetic and environmental factors. Although genetic variation is known to influence the normal spectrum of pubertal timing, the specific genes involved remain largely unknown. Genetic analyses have identified a number of genes responsible for rare disorders of pubertal timing such as hypogonadotropic hypogonadism and Kallmann syndrome. Recently, the first loci with common variation reproducibly associated with population variation in the timing of puberty were identified at 6q21 in or near LIN28B and at 9q31.2. However, these two loci explain only a small fraction of the genetic contribution to population variation in pubertal timing, suggesting the need to continue to consider other loci and other types of variants. Here we provide an update of the genes implicated in disorders of puberty, discuss genes and pathways that may be involved in the timing of normal puberty, and suggest additional avenues of investigation to identify genetic regulators of puberty in the general population.
puberty; pubertal timing; genetics; hypogonadotropic hypogonadism; Kallmann syndrome; genetic regulation
Obesity is not uniformly associated with the development of metabolic sequelae. Specific patterns of body fat distribution, in particular fatty liver, may preferentially predispose at-risk individuals to disease. Here we characterize the metabolic correlates of fat in the liver in a large community-based sample with and without respect to visceral fat. Fatty liver was measured by multi-detector computed tomography of the abdomen in 2589 individuals from the community-based Framingham Heart Study (FHS). Logistic and linear regression were used to determine the associations of fatty liver with cardio-metabolic risk factors adjusted for covariates with and without adjustment for other fat depots (body mass index [BMI], waist circumference [WC], and visceral adipose tissue [VAT]). The prevalence of fatty liver was 17%. Compared to participants without fatty liver, individuals with fatty liver had a higher adjusted odds ratio (OR) of diabetes (DM; OR 2.98; 95% confidence interval [CI], 2.12–4.21), metabolic syndrome (MetS; OR, 5.22; 95% CI, 4.15–6.57), hypertension (HTN; OR 2.73; 95% CI, 2.16–3.44), impaired fasting glucose (IFG; OR 2.95; 95% CI, 2.32–3.75), insulin resistance (IR; OR, 6.16; 95% CI, 4.90 – 7.76), higher triglycerides (TG) and systolic and diastolic blood pressure (SBP, DBP) and lower high density lipoprotein (HDL) and adiponectin levels (p<0.001 for all). After adjustment for other fat depots, fatty liver remained associated with DM, HTN, IFG, MetS, HDL, TG and adiponectin levels (all p<.001), whereas associations with SBP and DBP were attenuated (p >0.05).
Fatty liver is a prevalent condition and is characterized by dysglycemia and dyslipidemia independent of VAT and other obesity measures. This work begins to dissect the specific links between fat depots and metabolic disease.
NAFLD; lipid; glucose; fat depot
Previous genetic studies have suggested a history of sub-Saharan African gene flow into some West Eurasian populations after the initial dispersal out of Africa that occurred at least 45,000 years ago. However, there has been no accurate characterization of the proportion of mixture, or of its date. We analyze genome-wide polymorphism data from about 40 West Eurasian groups to show that almost all Southern Europeans have inherited 1%–3% African ancestry with an average mixture date of around 55 generations ago, consistent with North African gene flow at the end of the Roman Empire and subsequent Arab migrations. Levantine groups harbor 4%–15% African ancestry with an average mixture date of about 32 generations ago, consistent with close political, economic, and cultural links with Egypt in the late middle ages. We also detect 3%–5% sub-Saharan African ancestry in all eight of the diverse Jewish populations that we analyzed. For the Jewish admixture, we obtain an average estimated date of about 72 generations. This may reflect descent of these groups from a common ancestral population that already had some African ancestry prior to the Jewish Diasporas.
Southern Europeans and Middle Eastern populations are known to have inherited a small percentage of their genetic material from recent sub-Saharan African migrations, but there has been no estimate of the exact proportion of this gene flow, or of its date. Here, we apply genomic methods to show that the proportion of African ancestry in many Southern European groups is 1%–3%, in Middle Eastern groups is 4%–15%, and in Jewish groups is 3%–5%. To estimate the dates when the mixture occurred, we develop a novel method that estimates the size of chromosomal segments of distinct ancestry in individuals of mixed ancestry. We verify using computer simulations that the method produces useful estimates of population mixture dates up to 300 generations in the past. By applying the method to West Eurasians, we show that the dates in Southern Europeans are consistent with events during the Roman Empire and subsequent Arab migrations. The dates in the Jewish groups are older, consistent with events in classical or biblical times that may have occurred in the shared history of Jewish populations.
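The link between segment size and mixture date can be made concrete with a rough calculation. This is a simplified sketch, not the dating method developed in the study: it assumes a single pulse of admixture, a small admixture proportion, and about one crossover per Morgan per generation, under which segments of the minority ancestry have an expected length of roughly 100/g centimorgans after g generations. The function name is hypothetical; the dates are the averages reported in the abstract above.

```python
def expected_segment_cm(generations):
    """Approximate mean ancestry-segment length in cM under a single-pulse model (assumption)."""
    if generations <= 0:
        raise ValueError("generations must be positive")
    return 100.0 / generations


if __name__ == "__main__":
    # Average mixture dates reported in the abstract, in generations.
    reported_dates = {"Southern Europeans": 55, "Levantine groups": 32, "Jewish groups": 72}
    for group, g in reported_dates.items():
        print(f"{group}: about {expected_segment_cm(g):.1f} cM after {g} generations")
```

Older mixture leaves shorter segments, which is why segment size carries information about the date.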
Recently, genome-wide association studies (GWAS) have linked the human LIN28B locus to height and timing of menarche [1-5]. LIN28B and its homolog LIN28 (hereafter, LIN28A) are functionally redundant RNA-binding proteins that block let-7 microRNA (miRNA) biogenesis [6-9]. lin-28 and let-7 were discovered in C. elegans as heterochronic regulators of larval and vulval development, but recently have been implicated in cancer, stem cell aging, and pluripotency [10-13]. The let-7 targets Myc, Kras, Igf2bp1 and Hmga2 are known regulators of mammalian body size and metabolism [14-18]. To explore the Lin28/let-7 pathway in vivo, we engineered transgenic mice to express Lin28a and observed increased body size, crown-rump length, and a delayed onset of puberty. While investigating metabolic and endocrine mechanisms of overgrowth, we observed increased glucose metabolism and insulin sensitivity in these transgenic mice. We report a mouse that models the human phenotypes associated with genetic variation in the Lin28/let-7 pathway.
Background & Aims
Fatty liver is the hepatic manifestation of obesity, but community-based assessment of fatty liver among unselected subjects is limited. We sought to determine the feasibility of and optimal protocol for quantifying fat content in liver in the Framingham Heart Study using multi-detector computed tomography (MDCT) scanning.
Participants (n=100, 49% women, mean age 59.4 years, mean BMI 27.8 kg/m²) were drawn from the Framingham Heart Study Cohort. Two readers measured the attenuation of liver, spleen, paraspinal muscle, and an external standard from MDCT scans using multiple slices in chest and abdominal scans.
The mean measurement variation was larger within a single axial CT slice than between multiple axial CT slices for liver and spleen whereas it was similar for paraspinal muscles. Measurement variation in liver, spleen, and paraspinal muscles was smaller in the abdomen than in the chest. Three vs. six measures of attenuation in liver and two vs. three measures in spleen gave reproducible measurements of tissue attenuation (intra-class correlation coefficient (ICCC) of 1 in the abdomen). Intra- and inter-reader reproducibility (ICCC) of the liver-to-spleen ratio was 0.98 and 0.99, of the liver-to-phantom ratio was 0.99 and 0.99, and of the liver-to-muscle ratio was 0.93 and 0.86, respectively.
One cross-sectional slice is adequate to capture the majority of variance of fat content in liver per individual. Abdominal as compared to chest scan measures of fat content in liver are more precise. Measurement of fat content in liver on MDCT scans is feasible and reproducible.
Fatty Liver; reproducibility; CT scan; metabolic syndrome; measurement
Sex hormones, in particular the androgens, are important for the growth of the prostate gland and have been implicated in prostate cancer carcinogenesis, yet the determinants of endogenous steroid hormone levels remain poorly understood. Twin studies suggest a heritable component for circulating concentrations of sex hormones, although epidemiological evidence linking steroid hormone gene variants to prostate cancer is limited. Here we report on findings from a comprehensive study of genetic variation at the CYP19A1 locus in relation to prostate cancer risk and to circulating steroid hormone concentrations in men by the Breast and Prostate Cancer Cohort Consortium (BPC3), a large collaborative prospective study. The BPC3 systematically characterised variation in CYP19A1 by targeted resequencing and dense genotyping; selected haplotype-tagging single nucleotide polymorphisms (htSNPs) that efficiently predict common variants in U.S. and European whites, Latinos, Japanese Americans, and Native Hawaiians; and genotyped these htSNPs in 8,166 prostate cancer cases and 9,079 study-, age-, and ethnicity-matched controls. CYP19A1 htSNPs, two common missense variants and common haplotypes were not significantly associated with risk of prostate cancer. However, several htSNPs in linkage disequilibrium blocks 3 and 4 were significantly associated with a 5–10% difference in estradiol concentrations in men (association per copy of the two-SNP haplotype rs749292–rs727479 (A–A) versus noncarriers; P=1 × 10^−5), and with inverse, although less marked, changes in free testosterone concentrations. These results suggest that although germline variation in CYP19A1 characterised by the htSNPs produces measurable differences in sex hormone concentrations in men, it does not substantially influence risk for prostate cancer.
prostate; cancer; CYP19A1; estradiol; testosterone
To investigate the genetic architecture of severe obesity, we performed a genome-wide association study of 775 cases and 3197 unascertained controls at ∼550 000 markers across the autosomal genome. We found convincing association to the previously described locus including the FTO gene. We also found evidence of association at a further six of 12 other loci previously reported to influence body mass index (BMI) in the general population and at one of three loci previously associated with severe childhood and adult obesity, and found that cases have a higher proportion of risk-conferring alleles than controls. We found no evidence of homozygosity at any locus due to identity-by-descent associating with phenotype, which would be indicative of rare, penetrant alleles, nor was there excess genome-wide homozygosity in cases relative to controls. Our results suggest that variants influencing BMI also contribute to severe obesity, a condition at the extreme of the phenotypic spectrum rather than a distinct condition.
Genome-wide association studies have successfully identified numerous loci at which common variants influence disease risk or quantitative traits. Despite these successes, the variants identified by these studies have generally explained only a small fraction of the heritable component of disease risk, and have not pinpointed with certainty the causal variant(s) at the associated loci. Furthermore, the mechanisms of action by which associated loci influence disease or quantitative phenotypes are often unclear, because we do not know through which gene(s) the associated variants exert their effects or because these gene(s) are of unknown function or have no clear connection to known disease biology. Thus, the initial set of genome-wide association studies serve as a starting point for future genetic and functional studies. We outline possible next steps that may help accelerate progress from genetic studies to the biological knowledge that can guide the development of predictive, preventive, or therapeutic measures.
High-throughput genotyping generates vast amounts of data for analysis; results can be difficult to summarize succinctly. A single project may involve genotyping many genes with multiple variants per gene and analyzing each variant in relation to numerous phenotypes, using several genetic models and population subgroups. Hundreds of statistical tests may be performed for a single SNP, thereby complicating interpretation of results and inhibiting identification of patterns of association.
To facilitate visual display and summary of large numbers of association tests of genetic loci with multiple phenotypes, we developed a Phenotype-Genotype Association (PGA) grid display. A database-backed web server was used to create PGA grids from phenotypic and genotypic data (sample sizes, means and standard errors, P-value for association). HTML pages were generated using Tcl scripts on an AOLserver platform, using an Oracle database, and the ArsDigita Community System web toolkit. The grids are interactive and permit display of summary data for individual cells by a mouse click (i.e. least squares means for a given SNP and phenotype, specified genetic model and study sample). PGA grids can be used to visually summarize results of individual SNP associations, gene-environment associations, or haplotype associations.
The PGA grid, which permits interactive exploration of large numbers of association test results, can serve as an easily adapted common and useful display format for large-scale genetic studies. Doing so would reduce the problem of publication bias, and would simplify the task of summarizing large-scale association studies.
Rare mutations in MEF2A have been proposed as a cause of coronary artery disease (CAD) and myocardial infarction (MI). In this issue of the JCI, Pennacchio and colleagues report sequencing MEF2A in 300 patients with premature CAD and in controls. Only 1 CAD patient was found to carry a missense mutation not found in controls. The specific 21-bp deletion in MEF2A previously proposed as causal for CAD and/or MI was observed in unaffected individuals and did not segregate with CAD in families. These results do not support the hypothesis that mutations in MEF2A are a cause of CAD and/or MI but do illustrate general principles regarding the difficulty of connecting genetic variation to common diseases.
The onset of puberty is first detected as an increase in pulsatile secretion of gonadotropin-releasing hormone (GnRH). Early activation of the hypothalamic–pituitary–gonadal axis results in central precocious puberty. The timing of pubertal development is driven in part by genetic factors, but only a few, rare molecular defects associated with central precocious puberty have been identified.
We performed whole-exome sequencing in 40 members of 15 families with central precocious puberty. Candidate variants were confirmed with Sanger sequencing. We also performed quantitative real-time polymerase-chain-reaction assays to determine levels of messenger RNA (mRNA) in the hypothalami of mice at different ages.
We identified four novel heterozygous mutations in MKRN3, the gene encoding makorin RING-finger protein 3, in 5 of the 15 families; both sexes were affected. The mutations included three frameshift mutations, predicted to encode truncated proteins, and one missense mutation, predicted to disrupt protein function. MKRN3 is a paternally expressed, imprinted gene located in the Prader–Willi syndrome critical region (chromosome 15q11–q13). All affected persons inherited the mutations from their fathers, a finding that indicates perfect segregation with the mode of inheritance expected for an imprinted gene. Levels of Mkrn3 mRNA were high in the arcuate nucleus of prepubertal mice, decreased immediately before puberty, and remained low after puberty.
Deficiency of MKRN3 causes central precocious puberty in humans. (Funded by the National Institutes of Health and others.)
We formed the GEnetics of Nephropathy–an International Effort (GENIE) consortium to examine previously reported genetic associations with diabetic nephropathy (DN) in type 1 diabetes. GENIE consists of 6,366 similarly ascertained participants of European ancestry with type 1 diabetes, with and without DN, from the All Ireland-Warren 3-Genetics of Kidneys in Diabetes U.K. and Republic of Ireland (U.K.-R.O.I.) collection and the Finnish Diabetic Nephropathy Study (FinnDiane), combined with reanalyzed data from the Genetics of Kidneys in Diabetes U.S. Study (U.S. GoKinD). We found little evidence for the association of the EPO promoter polymorphism, rs161740, with the combined phenotype of proliferative retinopathy and end-stage renal disease in U.K.-R.O.I. (odds ratio [OR] 1.14, P = 0.19) or FinnDiane (OR 1.06, P = 0.60). However, a fixed-effects meta-analysis that included the previously reported cohorts retained a genome-wide significant association with that phenotype (OR 1.31, P = 2 × 10^−9). An expanded investigation of the ELMO1 locus and genetic regions reported to be associated with DN in the U.S. GoKinD yielded only nominal statistical significance for these loci. Finally, top candidates identified in a recent meta-analysis failed to reach genome-wide significance. In conclusion, we were unable to replicate most of the previously reported genetic associations for DN, and significance for the EPO promoter association was attenuated.
We present an approximate conditional and joint association analysis that can use summary-level statistics from a meta-analysis of genome-wide association studies (GWAS) and estimated linkage disequilibrium (LD) from a reference sample with individual-level genotype data. Using this method, we analyzed meta-analysis summary data from the GIANT Consortium for height and body mass index (BMI), with the LD structure estimated from genotype data in two independent cohorts. We identified 36 loci with multiple associated variants for height (38 leading and 49 additional SNPs, 87 in total) via a genome-wide SNP selection procedure. The 49 new SNPs explain approximately 1.3% of variance, nearly doubling the heritability explained at the 36 loci. We did not find any locus showing multiple associated SNPs for BMI. The method we present is computationally fast and is also applicable to case-control data, which we demonstrate in an example from meta-analysis of type 2 diabetes by the DIAGRAM Consortium.
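The core idea of recovering joint effects from marginal summary statistics plus LD can be shown in the simplest possible case. The sketch below is not the published method, which handles many SNPs, differing sample sizes, and unstandardized data. It assumes two SNPs with standardized genotypes and a standardized phenotype, so that the marginal effects equal the LD correlation matrix times the joint effects, and inverting the 2×2 matrix gives the joint estimates. The function name and the numbers in the example are hypothetical.

```python
def joint_effects_two_snps(b1, b2, r):
    """Joint effects of two SNPs from marginal effects b1, b2 and LD correlation r.

    Assumes standardized genotypes and phenotype, so that
    [b1, b2] = [[1, r], [r, 1]] @ [beta1, beta2].
    """
    if abs(r) >= 1.0:
        raise ValueError("SNPs in perfect LD cannot be separated")
    denom = 1.0 - r * r
    beta1 = (b1 - r * b2) / denom
    beta2 = (b2 - r * b1) / denom
    return beta1, beta2


if __name__ == "__main__":
    # Hypothetical case: only SNP 1 is causal (joint effect 0.1), and r = 0.5,
    # so SNP 2 shows a marginal effect of 0.05 purely through LD.
    beta1, beta2 = joint_effects_two_snps(0.10, 0.05, 0.5)
    print(f"joint effects: {beta1:.3f}, {beta2:.3f}")
```

In this toy case the joint analysis attributes the whole signal to the first SNP. A locus with multiple associated variants is one where more than one joint effect remains clearly non-zero.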
We examined the association of common variants at the NPPA-NPPB locus with circulating concentrations of the natriuretic peptides, which have blood pressure–lowering properties. We genotyped SNPs at the NPPA-NPPB locus in 14,743 individuals of European ancestry, and identified associations of plasma atrial natriuretic peptide with rs5068 (P = 8 × 10^−70), rs198358 (P = 8 × 10^−30) and rs632793 (P = 2 × 10^−10), and of plasma B-type natriuretic peptide with rs5068 (P = 3 × 10^−12), rs198358 (P = 1 × 10^−25) and rs632793 (P = 2 × 10^−68). In 29,717 individuals, the alleles of rs5068 and rs198358 that showed association with increased circulating natriuretic peptide concentrations were also found to be associated with lower systolic (P = 2 × 10^−6 and 6 × 10^−5, respectively) and diastolic blood pressure (P = 1 × 10^−6 and 5 × 10^−5), as well as reduced odds of hypertension (OR = 0.85, 95% CI = 0.79–0.92, P = 4 × 10^−5; OR = 0.90, 95% CI = 0.85–0.95, P = 2 × 10^−4, respectively). Common genetic variants at the NPPA-NPPB locus found to be associated with circulating natriuretic peptide concentrations contribute to interindividual variation in blood pressure and hypertension.
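The reported odds ratios, confidence intervals, and P values for hypertension can be cross-checked against one another with a standard Wald approximation. The sketch below is an illustration and not part of the study's analysis: it assumes the interval is symmetric on the log-odds scale and that the test statistic is approximately normal. The function name is hypothetical.

```python
import math


def z_and_p_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Recover the Wald z statistic and two-sided P value from an OR and its 95% CI."""
    # Standard error of log(OR), from the width of the CI on the log scale.
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = math.log(odds_ratio) / se
    # Two-sided normal tail probability.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


if __name__ == "__main__":
    # Values as reported in the abstract for the two SNPs and hypertension.
    for label, (odds, lo, hi) in {"OR 0.85": (0.85, 0.79, 0.92),
                                  "OR 0.90": (0.90, 0.85, 0.95)}.items():
        z, p = z_and_p_from_or_ci(odds, lo, hi)
        print(f"{label}: z = {z:.2f}, two-sided P = {p:.1e}")
```

Because the published limits are rounded to two decimals, the recovered P values agree with the reported ones only to the order of magnitude.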