PMCC

1.  Quantifying and correcting for the winner's curse in quantitative-trait association studies 
Genetic epidemiology  2011;35(3):133-138.
Quantitative traits (QT) are an important focus of human genetic studies both because of interest in the traits themselves, and because of their role as risk factors for many human diseases. For large-scale QT association studies including genome-wide association studies (GWAS), investigators usually focus on genetic loci showing significant evidence for SNP-QT association, and genetic effect size tends to be overestimated as a consequence of the winner’s curse. In this paper, we study the impact of the winner’s curse on QT association studies in which the genetic effect size is parameterized as the slope in a linear regression model. We demonstrate by analytical calculation that the overestimation in the regression slope estimate decreases as power increases. To reduce the ascertainment bias, we propose a three-parameter maximum likelihood method and then simplify this to a one-parameter method by excluding nuisance parameters. We show that both methods reduce the bias when power to detect association is low or moderate, and that the one-parameter model generally results in smaller variance in the estimate.
doi:10.1002/gepi.20551
PMCID: PMC3500533  PMID: 21284035
quantitative trait; winner’s curse; ascertainment bias; genome-wide association study; linear regression; maximum likelihood
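The claim that overestimation shrinks as power grows can be illustrated with a short calculation. This is a minimal sketch, not the paper's derivation: it assumes the standardized slope estimate Z = (estimated slope)/(standard error) is normally distributed with mean mu and unit variance, and that a result is reported only when |Z| exceeds the two-sided critical value c. Under those assumptions, E[Z | |Z| > c] = mu + (phi(c - mu) - phi(c + mu)) / power(mu). The function names and the 5e-8 threshold are illustrative choices.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, sd 1


def critical_value(alpha):
    """Two-sided critical value c with P(|Z| > c) = alpha under the null."""
    return STD_NORMAL.inv_cdf(1 - alpha / 2)


def power(mu, alpha):
    """Probability that |Z| > c when Z ~ N(mu, 1)."""
    c = critical_value(alpha)
    return 1 - STD_NORMAL.cdf(c - mu) + STD_NORMAL.cdf(-c - mu)


def relative_bias(mu, alpha):
    """Relative overestimation of the effect among significant results.

    mu is the true slope divided by its standard error (assumed known here).
    """
    c = critical_value(alpha)
    # Mean of a normal variable truncated to the region |Z| > c.
    expected_z = mu + (STD_NORMAL.pdf(c - mu) - STD_NORMAL.pdf(c + mu)) / power(mu, alpha)
    return expected_z / mu - 1


if __name__ == "__main__":
    for mu in (1.0, 3.0, 5.0, 8.0):
        print(f"mu={mu:.1f}  power={power(mu, 5e-8):.3g}  "
              f"relative bias={relative_bias(mu, 5e-8):.3g}")
```

Running the script prints power and relative bias side by side; under the stated assumptions the bias is large when power is low and approaches zero as power approaches one.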
2.  Effects of 34 Risk Loci for Type 2 Diabetes or Hyperglycemia on Lipoprotein Subclasses and Their Composition in 6,580 Nondiabetic Finnish Men 
Diabetes  2011;60(5):1608-1616.
OBJECTIVE
We investigated the effects of 34 genetic risk variants for hyperglycemia/type 2 diabetes on lipoprotein subclasses and particle composition in a large population-based cohort.
RESEARCH DESIGN AND METHODS
The study included 6,580 nondiabetic Finnish men from the population-based Metabolic Syndrome in Men (METSIM) study (aged 57 ± 7 years; BMI 26.8 ± 3.7 kg/m²). Genotyping of 34 single nucleotide polymorphisms (SNPs) for hyperglycemia/type 2 diabetes was performed. Proton nuclear magnetic resonance spectroscopy was used to measure particle concentrations of 14 lipoprotein subclasses and their composition in native serum samples.
RESULTS
The glucose-increasing allele of rs780094 in GCKR was significantly associated with low concentrations of VLDL particles (independently of their size) and small LDL and was nominally associated with low concentrations of intermediate-density lipoprotein, all LDL subclasses, and high concentrations of very large and large HDL particles. The glucose-increasing allele of rs174550 in FADS1 was significantly associated with high concentrations of very large and large HDL particles and nominally associated with low concentrations of all VLDL particles. SNPs rs10923931 in NOTCH2 and rs757210 in HNF1B genes showed nominal or significant associations with several lipoprotein traits. The genetic risk score of 34 SNPs was not associated with any of the lipoprotein subclasses.
CONCLUSIONS
Four of the 34 risk loci for type 2 diabetes or hyperglycemia (GCKR, FADS1, NOTCH2, and HNF1B) were significantly associated with lipoprotein traits. A GCKR variant predominantly affected the concentration of VLDL, and the FADS1 variant affected very large and large HDL particles. Only a limited number of risk loci for hyperglycemia/type 2 diabetes significantly affect lipoprotein metabolism.
doi:10.2337/db10-1655
PMCID: PMC3292337  PMID: 21421807
3.  Common Variants Show Predicted Polygenic Effects on Height in the Tails of the Distribution, Except in Extremely Short Individuals 
PLoS Genetics  2011;7(12):e1002439.
Common genetic variants have been shown to explain a fraction of the inherited variation for many common diseases and quantitative traits, including height, a classic polygenic trait. The extent to which common variation determines the phenotype of highly heritable traits such as height is uncertain, as is the extent to which common variation is relevant to individuals with more extreme phenotypes. To address these questions, we studied 1,214 individuals from the top and bottom extremes of the height distribution (tallest and shortest ∼1.5%), drawn from ∼78,000 individuals from the HUNT and FINRISK cohorts. We found that common variants still influence height at the extremes of the distribution: common variants (49/141) were nominally associated with height in the expected direction more often than is expected by chance (p < 5×10⁻²⁸), and the odds ratios in the extreme samples were consistent with the effects estimated previously in population-based data. To examine more closely whether the common variants have the expected effects, we calculated a weighted allele score (WAS), which is a weighted prediction of height for each individual based on the previously estimated effect sizes of the common variants in the overall population. The average WAS is consistent with expectation in the tall individuals, but was not as extreme as expected in the shortest individuals (p < 0.006), indicating that some of the short stature is explained by factors other than common genetic variation. The discrepancy was more pronounced (p < 10⁻⁶) in the most extreme individuals (height < 0.25 percentile). The results at the extreme short tails are consistent with a large number of models incorporating either rare genetic non-additive or rare non-genetic factors that decrease height. We conclude that common genetic variants are associated with height at the extremes as well as across the population, but that additional factors become more prominent at the shorter extreme.
Author Summary
Although there are many loci in the human genome that have been discovered to be significantly associated with height, it is unclear if these loci have similar effects in extremely tall and short individuals. Here, we examine hundreds of extremely tall and short individuals in two population-based cohorts to see if these known height determining loci are as predictive as expected in these individuals. We found that these loci are generally as predictive of height as expected in these individuals but that they begin to be less predictive in the most extremely short individuals. We showed that this result is consistent with models that not only include the common variants but also multiple low frequency genetic variants that substantially decrease height. However, this result is also consistent with non-additive genetic effects or rare non-genetic factors that substantially decrease height. This finding suggests the possibility of a major role of low frequency variants, particularly in individuals with extreme phenotypes, and has implications on whole-genome or whole-exome sequencing efforts to discover rare genetic variation associated with complex traits.
doi:10.1371/journal.pgen.1002439
PMCID: PMC3248463  PMID: 22242009
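The weighted allele score described in this abstract can be sketched as a weighted sum. This is a minimal illustration that assumes each variant contributes its count of height-increasing alleles (0, 1, or 2) multiplied by a previously estimated per-allele effect; the function name and the example numbers are hypothetical and are not taken from the paper.

```python
def weighted_allele_score(allele_counts, effect_sizes):
    """Sum of per-variant allele counts weighted by per-allele effect sizes.

    allele_counts: number of height-increasing alleles (0, 1, or 2) per variant.
    effect_sizes:  previously estimated per-allele effects, in the same order.
    """
    if len(allele_counts) != len(effect_sizes):
        raise ValueError("allele_counts and effect_sizes must have equal length")
    return sum(count * effect for count, effect in zip(allele_counts, effect_sizes))


if __name__ == "__main__":
    # Hypothetical individual genotyped at three variants.
    counts = [0, 1, 2]
    effects = [0.5, 0.2, 0.1]  # illustrative per-allele effects
    print(weighted_allele_score(counts, effects))
```

Comparing the mean score in an extreme group against its expected value is then a matter of averaging this quantity over individuals.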
4.  Global epigenomic analysis of primary human pancreatic islets provides insights into type 2 diabetes susceptibility loci 
Cell metabolism  2010;12(5):443-455.
Summary
Identifying cis-regulatory elements is important to understand how human pancreatic islets modulate gene expression in physiologic or pathophysiologic (e.g., diabetic) conditions. We conducted genome-wide analysis of DNase I hypersensitive sites, histone H3 lysine methylation modifications (K4me1, K4me3, K79me2), and CCCTC factor (CTCF) binding in human islets. This identified ~18,000 putative promoters (several hundred unannotated and islet-active). Surprisingly, active promoter modifications were absent at genes encoding islet-specific hormones, suggesting a distinct regulatory mechanism. Of 34,039 distal (non-promoter) regulatory elements, 47% are islet-unique and 22% are CTCF-bound. In the 18 type 2 diabetes (T2D)-associated loci, we identified 118 putative regulatory elements and confirmed enhancer activity for 12/33 tested. Among 6 regulatory elements harboring T2D-associated variants, 2 exhibit significant allele-specific differences in activity. These findings present a global snapshot of the human islet epigenome and should provide functional context for non-coding variants emerging from genetic studies of T2D and other islet disorders.
doi:10.1016/j.cmet.2010.09.012
PMCID: PMC3026436  PMID: 21035756
5.  Fine Mapping of Five Loci Associated with Low-Density Lipoprotein Cholesterol Detects Variants That Double the Explained Heritability 
PLoS Genetics  2011;7(7):e1002198.
Complex trait genome-wide association studies (GWAS) provide an efficient strategy for evaluating large numbers of common variants in large numbers of individuals and for identifying trait-associated variants. Nevertheless, GWAS often leave much of the trait heritability unexplained. We hypothesized that some of this unexplained heritability might be due to common and rare variants that reside in GWAS identified loci but lack appropriate proxies in modern genotyping arrays. To assess this hypothesis, we re-examined 7 genes (APOE, APOC1, APOC2, SORT1, LDLR, APOB, and PCSK9) in 5 loci associated with low-density lipoprotein cholesterol (LDL-C) in multiple GWAS. For each gene, we first catalogued genetic variation by re-sequencing 256 Sardinian individuals with extreme LDL-C values. Next, we genotyped variants identified by us and by the 1000 Genomes Project (totaling 3,277 SNPs) in 5,524 volunteers. We found that in one locus (PCSK9) the GWAS signal could be explained by a previously described low-frequency variant and that in three loci (PCSK9, APOE, and LDLR) there were additional variants independently associated with LDL-C, including a novel and rare LDLR variant that seems specific to Sardinians. Overall, this more detailed assessment of SNP variation in these loci increased estimates of the heritability of LDL-C accounted for by these genes from 3.1% to 6.5%. All association signals and the heritability estimates were successfully confirmed in a sample of ∼10,000 Finnish and Norwegian individuals. Our results thus suggest that focusing on variants accessible via GWAS can lead to clear underestimates of the trait heritability explained by a set of loci. Further, our results suggest that, as prelude to large-scale sequencing efforts, targeted re-sequencing efforts paired with large-scale genotyping will increase estimates of complex trait heritability explained by known loci.
Author Summary
Despite the striking success of genome-wide association studies in identifying genetic loci associated with common complex traits and diseases, much of the heritable risk for these traits and diseases remains unexplained. A higher resolution investigation of the genome through sequencing studies is expected to clarify the sources of this missing heritability. As a preview of what we might learn in these more detailed assessments of genetic variation, we used sequencing to identify potentially interesting variants in seven genes associated with low-density lipoprotein cholesterol (LDL-C) in 256 Sardinian individuals with extreme LDL-C levels, followed by large scale genotyping in 5,524 individuals, to examine newly discovered and previously described variants. We found that a combination of common and rare variants in these loci contributes to variation in LDL-C levels, and also that the initial estimate of the heritability explained by these loci doubled. Importantly, our results include a Sardinian-specific rare variant, highlighting the need for sequencing studies in isolated populations. Our results provide insights about what extensive whole-genome sequencing efforts are likely to reveal for the understanding of the genetic architecture of complex traits.
doi:10.1371/journal.pgen.1002198
PMCID: PMC3145627  PMID: 21829380
6.  Detailed Physiologic Characterization Reveals Diverse Mechanisms for Novel Genetic Loci Regulating Glucose and Insulin Metabolism in Humans 
Diabetes  2010;59(5):1266-1275.
OBJECTIVE
Recent genome-wide association studies have revealed loci associated with glucose and insulin-related traits. We aimed to characterize 19 such loci using detailed measures of insulin processing, secretion, and sensitivity to help elucidate their role in regulation of glucose control, insulin secretion and/or action.
RESEARCH DESIGN AND METHODS
We investigated associations of loci identified by the Meta-Analyses of Glucose and Insulin-related traits Consortium (MAGIC) with circulating proinsulin, measures of insulin secretion and sensitivity from oral glucose tolerance tests (OGTTs), euglycemic clamps, insulin suppression tests, or frequently sampled intravenous glucose tolerance tests in nondiabetic humans (n = 29,084).
RESULTS
The glucose-raising allele in MADD was associated with abnormal insulin processing (a dramatic effect on higher proinsulin levels, but no association with insulinogenic index) at extremely persuasive levels of statistical significance (P = 2.1 × 10⁻⁷¹). Defects in insulin processing and insulin secretion were seen in glucose-raising allele carriers at TCF7L2, SLC30A8, GIPR, and C2CD4B. Abnormalities in early insulin secretion were suggested in glucose-raising allele carriers at MTNR1B, GCK, FADS1, DGKB, and PROX1 (lower insulinogenic index; no association with proinsulin or insulin sensitivity). Two loci previously associated with fasting insulin (GCKR and IGF1) were associated with OGTT-derived insulin sensitivity indices in a consistent direction.
CONCLUSIONS
Genetic loci identified through their effect on hyperglycemia and/or hyperinsulinemia demonstrate considerable heterogeneity in associations with measures of insulin processing, secretion, and sensitivity. Our findings emphasize the importance of detailed physiological characterization of such loci for improved understanding of pathways associated with alterations in glucose homeostasis and eventually type 2 diabetes.
doi:10.2337/db09-1568
PMCID: PMC2857908  PMID: 20185807
7.  Genome-wide association studies in diverse populations 
Nature reviews. Genetics  2010;11(5):356-366.
Genome-wide association (GWA) studies have identified a large number of single-nucleotide polymorphisms (SNPs) associated with disease phenotypes. As most GWA studies have been performed primarily in populations of European descent, this review examines the issues involved in extending consideration of GWA studies to diverse worldwide populations. Although challenges exist with such issues as imputation, admixture, and replication, investigation of diverse populations in GWA studies has significant potential to advance the project of mapping the genetic determinants of complex diseases for the human population as a whole.
doi:10.1038/nrg2760
PMCID: PMC3079573  PMID: 20395969
8.  Transferability of Type 2 Diabetes Implicated Loci in Multi-Ethnic Cohorts from Southeast Asia 
PLoS Genetics  2011;7(4):e1001363.
Recent large genome-wide association studies (GWAS) have identified multiple loci which harbor genetic variants associated with type 2 diabetes mellitus (T2D), many of which encode proteins not previously suspected to be involved in the pathogenesis of T2D. Most GWAS for T2D have focused on populations of European descent, and GWAS conducted in other populations with different ancestry offer a unique opportunity to study the genetic architecture of T2D. We performed genome-wide association scans for T2D in 3,955 Chinese (2,010 cases, 1,945 controls), 2,034 Malays (794 cases, 1,240 controls), and 2,146 Asian Indians (977 cases, 1,169 controls). In addition to the search for novel variants implicated in T2D, these multi-ethnic cohorts serve to assess the transferability and relevance of the previous findings from European descent populations in the three major ethnic populations of Asia, comprising half of the world's population. Of the SNPs associated with T2D in previous GWAS, only variants at CDKAL1 and HHEX/IDE/KIF11 showed the strongest association with T2D in the meta-analysis including all three ethnic groups. However, consistent direction of effect was observed for many of the other SNPs in our study and in those carried out in European populations. Close examination of the associations at both the CDKAL1 and HHEX/IDE/KIF11 loci provided some evidence of locus and allelic heterogeneity in relation to the associations with T2D. We also detected variation in linkage disequilibrium between populations for most of these loci that have been previously identified. These factors, combined with limited statistical power, may contribute to the failure to detect associations across populations of diverse ethnicity. These findings highlight the value of surveying diverse racial/ethnic groups, both for fine-mapping efforts to identify the causal variants and for the search for variants that may be population-specific.
Author Summary
Type 2 diabetes mellitus (T2D) is a chronic disease which can lead to complications such as heart disease, stroke, hypertension, blindness due to diabetic retinopathy, amputations from peripheral vascular diseases, and kidney disease from diabetic nephropathy. The increasing prevalence and complications of T2D are likely to increase the health and economic burden of individuals, families, health systems, and countries. Our study carried out in three major Asian ethnic groups (Chinese, Malays, and Indians) in Singapore suggests that the findings of studies carried out in populations of European ancestry (which represents most studies to date) may be relevant to populations in Asia. However, our study also raises the possibility that different genes, and within the genes different variants, may confer susceptibility to T2D in these populations. These findings are particularly relevant in Asia, where the greatest growth of T2D is expected in the coming years, and emphasize the importance of studying diverse populations when trying to localize the regions of the genome associated with T2D. In addition, we may need to consider novel methods for combining data across populations.
doi:10.1371/journal.pgen.1001363
PMCID: PMC3072366  PMID: 21490949
9.  META-ANALYSIS OF GENETIC ASSOCIATION STUDIES AND ADJUSTMENT FOR MULTIPLE TESTING OF CORRELATED SNPS AND TRAITS 
Genetic epidemiology  2010;34(7):739-746.
Meta-analysis has become a key component of well-designed genetic association studies due to the boost in statistical power achieved by combining results across multiple samples of individuals and the need to validate observed associations in independent studies. Meta-analyses of genetic association studies based on multiple SNPs and traits are subject to the same multiple testing issues as single-sample studies, but it is often difficult to adjust accurately for the multiple tests. Procedures such as Bonferroni may control the type I error rate but will generally provide an overly harsh correction if SNPs or traits are correlated. Depending on study design, availability of individual-level data, and computational requirements, permutation testing may not be feasible in a meta-analysis framework. In this paper we present methods for adjusting for multiple correlated tests under several study designs commonly employed in meta-analyses of genetic association tests. Our methods are applicable to both prospective meta-analyses in which several samples of individuals are analyzed with the intent to combine results, and retrospective meta-analyses, in which results from published studies are combined, including situations in which 1) individual-level data are unavailable, and 2) different sets of SNPs are genotyped in different studies due to random missingness or two-stage design. We show through simulation that our methods accurately control the rate of type I error and achieve improved power over multiple testing adjustments that do not account for correlation between SNPs or traits.
doi:10.1002/gepi.20538
PMCID: PMC3070606  PMID: 20878715
meta-analysis; association study; multiple testing; SNPs
10.  A Variance-Component Framework for Pedigree Analysis of Continuous and Categorical Outcomes 
Statistics in biosciences  2009;1(2):181-198.
SUMMARY
Variance-component methods are popular and flexible analytic tools for elucidating the genetic mechanisms of complex quantitative traits from pedigree data. However, variance-component methods typically assume that the trait of interest follows a multivariate normal distribution within a pedigree. Studies have shown that violation of this normality assumption can lead to biased parameter estimates and inflations in type-I error. This limits the application of variance-component methods to more general trait outcomes, whether continuous or categorical in nature. In this paper, we develop and apply a general variance-component framework for pedigree analysis of continuous and categorical outcomes. We develop appropriate models using generalized-linear mixed model theory and fit such models using approximate maximum-likelihood procedures. Using our proposed method, we demonstrate that one can perform variance-component pedigree analysis on outcomes that follow any exponential-family distribution. Additionally, we also show how one can modify the method to perform pedigree analysis of ordinal outcomes. We also discuss extensions of our variance-component framework to accommodate pedigrees ascertained based on trait outcome. We demonstrate the feasibility of our method using both simulated data and data from a genetic study of ovarian insufficiency.
doi:10.1007/s12561-009-9010-5
PMCID: PMC2860148  PMID: 20436936
Variance component model; linkage analysis; generalized linear mixed model
11.  Association of 18 Confirmed Susceptibility Loci for Type 2 Diabetes With Indices of Insulin Release, Proinsulin Conversion, and Insulin Sensitivity in 5,327 Nondiabetic Finnish Men 
Diabetes  2009;58(9):2129-2136.
OBJECTIVE
We investigated the effects of 18 confirmed type 2 diabetes risk single nucleotide polymorphisms (SNPs) on insulin sensitivity, insulin secretion, and conversion of proinsulin to insulin.
RESEARCH DESIGN AND METHODS
A total of 5,327 nondiabetic men (age 58 ± 7 years, BMI 27.0 ± 3.8 kg/m²) from a large population-based cohort were included. Oral glucose tolerance tests and genotyping of SNPs in or near PPARG, KCNJ11, TCF7L2, SLC30A8, HHEX, LOC387761, CDKN2B, IGF2BP2, CDKAL1, HNF1B, WFS1, JAZF1, CDC123, TSPAN8, THADA, ADAMTS9, NOTCH2, KCNQ1, and MTNR1B were performed. HNF1B rs757210 was excluded because of failure to achieve Hardy-Weinberg equilibrium.
RESULTS
Six SNPs (TCF7L2, SLC30A8, HHEX, CDKN2B, CDKAL1, and MTNR1B) were significantly (P < 6.9 × 10⁻⁴) and two SNPs (KCNJ11 and IGF2BP2) were nominally (P < 0.05) associated with early-phase insulin release (InsAUC0–30/GluAUC0–30), adjusted for age, BMI, and insulin sensitivity (Matsuda ISI). Combined effects of these eight SNPs reached a 32% reduction in InsAUC0–30/GluAUC0–30 in carriers of ≥11 vs. ≤3 weighted risk alleles. Four SNPs (SLC30A8, HHEX, CDKAL1, and TCF7L2) were significantly or nominally associated with indexes of proinsulin conversion. Three SNPs (KCNJ11, HHEX, and TSPAN8) were nominally associated with Matsuda ISI (adjusted for age and BMI). The effect of HHEX on Matsuda ISI became significant after additional adjustment for InsAUC0–30/GluAUC0–30. Nine SNPs did not show any associations with examined traits.
CONCLUSIONS
Eight type 2 diabetes–related loci were significantly or nominally associated with impaired early-phase insulin release. Effects of SLC30A8, HHEX, CDKAL1, and TCF7L2 on insulin release could be partially explained by impaired proinsulin conversion. HHEX might influence both insulin release and insulin sensitivity.
doi:10.2337/db09-0117
PMCID: PMC2731523  PMID: 19502414
12.  Genotype-Based Matching to Correct for Population Stratification in Large-Scale Case-Control Genetic Association Studies 
Genetic epidemiology  2009;33(6):508-517.
Genome-wide association studies are helping to dissect the etiology of complex diseases. Although case-control association tests are generally more powerful than family-based association tests, population stratification can lead to spurious disease-marker association or mask a true association. Several methods have been proposed to match cases and controls prior to genotyping, using family information or epidemiological data, or using genotype data for a modest number of genetic markers. Here, we describe a genetic similarity score matching (GSM) method for efficient matched analysis of cases and controls in a genome-wide or large-scale candidate gene association study. GSM comprises three steps: 1) calculating similarity scores for pairs of individuals using the genotype data; 2) matching sets of cases and controls based on the similarity scores so that matched cases and controls have similar genetic background; and 3) using conditional logistic regression to perform association tests. Through computer simulation we show that GSM correctly controls false positive rates and improves power to detect true disease-predisposing variants. We compare GSM to genomic control using computer simulations, and find improved power using GSM. We suggest that initial matching of cases and controls prior to genotyping combined with careful re-matching after genotyping is a method of choice for genome-wide association studies.
doi:10.1002/gepi.20403
PMCID: PMC2732762  PMID: 19170134
population stratification; genome-wide association; genetic similarity
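The first two GSM steps (pairwise similarity, then matching) can be sketched as follows. This is an illustration under stated assumptions and not the paper's algorithm: the similarity score here is a simple mean identity-by-state sharing over markers, and the matching is a greedy one-to-one pairing; the paper's actual score and matching procedure may differ. Step 3 (conditional logistic regression on the matched sets) is not shown. All names and genotypes are hypothetical.

```python
def ibs_similarity(genotypes_a, genotypes_b):
    """Mean identity-by-state sharing between two individuals.

    Genotypes are coded as minor-allele counts (0, 1, or 2) per marker;
    each marker contributes 1 - |a - b| / 2, so identical genotypes score 1.
    """
    shared = [1 - abs(a - b) / 2 for a, b in zip(genotypes_a, genotypes_b)]
    return sum(shared) / len(shared)


def greedy_match(cases, controls):
    """Pair each case with a distinct control, most similar pairs first.

    cases, controls: dicts mapping an individual ID to a genotype list.
    Returns a list of (case_id, control_id) tuples.
    """
    candidates = sorted(
        ((ibs_similarity(case_g, control_g), case_id, control_id)
         for case_id, case_g in cases.items()
         for control_id, control_g in controls.items()),
        reverse=True,
    )
    matched_cases, used_controls, pairs = set(), set(), []
    for _score, case_id, control_id in candidates:
        if case_id in matched_cases or control_id in used_controls:
            continue  # each individual is used at most once
        pairs.append((case_id, control_id))
        matched_cases.add(case_id)
        used_controls.add(control_id)
    return pairs


if __name__ == "__main__":
    cases = {"case1": [0, 0, 1, 2], "case2": [2, 2, 1, 0]}
    controls = {"ctrlA": [2, 2, 1, 0], "ctrlB": [0, 0, 1, 2], "ctrlC": [1, 1, 1, 1]}
    print(sorted(greedy_match(cases, controls)))
```

Greedy pairing is used only because it is short to write; an optimal matching algorithm would be a reasonable substitute when matched-set quality matters.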
13.  Underlying Genetic Models of Inheritance in Established Type 2 Diabetes Associations 
American Journal of Epidemiology  2009;170(5):537-545.
For most associations of common single nucleotide polymorphisms (SNPs) with common diseases, the genetic model of inheritance is unknown. The authors extended and applied a Bayesian meta-analysis approach to data from 19 studies on 17 replicated associations with type 2 diabetes. For 13 SNPs, the data fitted very well to an additive model of inheritance for the diabetes risk allele; for 4 SNPs, the data were consistent with either an additive model or a dominant model; and for 2 SNPs, the data were consistent with an additive or recessive model. Results were robust to the use of different priors and after exclusion of data for which index SNPs had been examined indirectly through proxy markers. The Bayesian meta-analysis model yielded point estimates for the genetic effects that were very similar to those previously reported based on fixed- or random-effects models, but uncertainty about several of the effects was substantially larger. The authors also examined the extent of between-study heterogeneity in the genetic model and found generally small between-study deviation values for the genetic model parameter. Heterosis could not be excluded for 4 SNPs. Information on the genetic model of robustly replicated association signals derived from genome-wide association studies may be useful for predictive modeling and for designing biologic and functional experiments.
doi:10.1093/aje/kwp145
PMCID: PMC2732984  PMID: 19602701
Bayes theorem; diabetes mellitus, type 2; meta-analysis; models, genetic; polymorphism, genetic; population characteristics
14.  Subsets of Finns with High HDL to Total Cholesterol Ratio Show Evidence for Linkage to Type 2 Diabetes on Chromosome 6q 
Human heredity  2006;63(1):17-25.
Objectives
The purpose of this study was to examine carefully heterogeneity underlying evidence for linkage to type 2 diabetes (T2DM) on chromosome 6q from two sets of FUSION families.
Methods
Ordered subsets analysis (OSA) was performed on two sets of FUSION families. For OSA results showing significant improvement in evidence for linkage, T2DM-related phenotypes were compared between individuals with T2DM within the subset versus the complement.
Results
OSA revealed 105 families with the highest average HDL to total cholesterol ratio (HDL ratio) that had strongly increased evidence for linkage (MLS = 7.91 at 78.0 cM; uncorrected p = 0.00002). Subjects with T2DM within this subset were significantly leaner, had lower fasting glucose, insulin, and C-peptide, and had a more favorable cardiovascular risk profile compared to the complement set of subjects with T2DM. OSA also revealed 33 families with the lowest average fasting insulin that had increased evidence for linkage at a second locus (MLS = 3.45 at 128 cM; uncorrected p = 0.017) coincident with quantitative trait locus linkage analysis results for fasting and 2-hour insulin in subjects without T2DM.
Conclusions
These results suggest two diabetes susceptibility loci on chromosome 6q that may affect subsets of individuals with a milder form of T2DM.
doi:10.1159/000097927
PMCID: PMC2923439  PMID: 17179727
Linkage analysis; Heterogeneity; Type 2 diabetes; HDL cholesterol; Ordered subsets analysis; Chromosome 6q
15.  Common variants in the GDF5-BFZB region are associated with variation in human height 
Nature genetics  2008;40(2):198-203.
Identifying genetic variants that influence human height will further our understanding of skeletal growth and development. A number of rare genetic variants have been convincingly and reproducibly associated with height in Mendelian syndromes, and common variants in HMGA2 were recently found to be associated with variation in height in the general population¹. Here, we report genome-wide association analyses of 6,669 individuals from Finland and Sardinia, using genotyped and imputed markers, and follow-up in an additional 28,801 individuals. We show that common variants in the osteoarthritis-associated² GDF5-BFZB locus are responsible for variation in height (estimated additive effect of 0.44 cm, overall p < 10⁻¹⁵). Our results suggest a link between the genetic basis of height and osteoarthritis, potentially mediated through alterations in bone growth and development.
doi:10.1038/ng.74
PMCID: PMC2914680  PMID: 18193045
16.  LocusZoom: regional visualization of genome-wide association scan results 
Bioinformatics  2010;26(18):2336-2337.
Summary: Genome-wide association studies (GWAS) have revealed hundreds of loci associated with common human genetic diseases and traits. We have developed a web-based plotting tool that provides fast visual display of GWAS results in a publication-ready format. LocusZoom visually displays regional information such as the strength and extent of the association signal relative to genomic position, local linkage disequilibrium (LD) and recombination patterns and the positions of genes in the region.
Availability: LocusZoom can be accessed from a web interface at http://csg.sph.umich.edu/locuszoom. Users may generate a single plot using a web form, or many plots using batch mode. The software utilizes LD information from HapMap Phase II (CEU, YRI and JPT+CHB) or 1000 Genomes (CEU) and gene information from the UCSC browser, and will accept SNP identifiers in dbSNP or 1000 Genomes format. Single plots are generated in ∼20 s. Source code and associated databases are available for download and local installation, and full documentation is available online.
Contact: cristen@umich.edu
doi:10.1093/bioinformatics/btq419
PMCID: PMC2935401  PMID: 20634204
17.  Quantifying and correcting for the winner's curse in genetic association studies 
Genetic epidemiology  2009;33(5):453-462.
Genetic association studies are a powerful tool to detect genetic variants that predispose to human disease. Once an associated variant is identified, investigators are also interested in estimating the effect of the identified variant on disease risk. Estimates of the genetic effect based on new association findings tend to be upwardly biased due to a phenomenon known as the “winner's curse”. Overestimation of genetic effect size in initial studies may cause follow-up studies to be underpowered and so to fail. In this paper, we quantify the impact of the winner's curse on the allele frequency difference and odds ratio estimators for one- and two-stage case-control association studies. We then propose an ascertainment-corrected maximum likelihood method to reduce the bias of these estimators. We show that overestimation of the genetic effect by the uncorrected estimator decreases as the power of the association study increases and that the ascertainment-corrected method reduces absolute bias and mean square error unless power to detect association is high.
doi:10.1002/gepi.20398
PMCID: PMC2706290  PMID: 19140131
winner's curse; ascertainment bias; genome-wide association study; maximum likelihood
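The link between power and overestimation described in this abstract can be illustrated with a normal approximation. The sketch below is not the paper's ascertainment-corrected likelihood method. It assumes a generic effect estimate whose standardized value Z is normal with mean `mu` (true effect divided by its standard error) and unit variance, and that a result is reported only when it passes a two-sided threshold. The function name `power_and_bias` and the default `alpha` of 5e-8 are illustrative choices, not taken from the source.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def power_and_bias(mu, alpha=5e-8):
    """Return (power, expected overestimate in SE units) for a true
    standardized effect `mu`, conditional on two-sided significance at
    `alpha`. Assumes Z ~ N(mu, 1); this is an illustrative approximation.
    """
    c = STD_NORMAL.inv_cdf(1 - alpha / 2)  # critical value for |Z|
    # Probability of landing in either rejection tail.
    power = (1 - STD_NORMAL.cdf(c - mu)) + STD_NORMAL.cdf(-c - mu)
    # E[Z - mu | |Z| > c] from the truncated-normal mean formula.
    excess = (STD_NORMAL.pdf(c - mu) - STD_NORMAL.pdf(-c - mu)) / power
    return power, excess


if __name__ == "__main__":
    for mu in (2.0, 4.0, 6.0, 8.0):
        power, excess = power_and_bias(mu)
        print(f"mu={mu:.1f}  power={power:.4f}  overestimate={excess:.3f} SE")
```

Under these assumptions the conditional overestimate shrinks as `mu`, and therefore power, grows, which matches the qualitative statement in the abstract.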
18.  Evaluation of genome-wide association study results through development of ontology fingerprints 
Bioinformatics  2009;25(10):1314-1320.
Motivation: Genome-wide association (GWA) studies may identify multiple variants that are associated with a disease or trait. To narrow down candidates for further validation, quantitatively assessing how identified genes relate to a phenotype of interest is important.
Results: We describe an approach to characterize genes or biological concepts (phenotypes, pathways, diseases, etc.) by ontology fingerprint: the set of Gene Ontology (GO) terms that are overrepresented among the PubMed abstracts discussing the gene or biological concept together with the enrichment p-value of these terms generated from a hypergeometric enrichment test. We then quantify the relevance of genes to the trait from a GWA study by calculating similarity scores between their ontology fingerprints using enrichment p-values. We validate this approach by correctly identifying corresponding genes for biological pathways with a 90% average area under the ROC curve (AUC). We applied this approach to rank genes identified through a GWA study that are associated with the lipid concentrations in plasma as well as to prioritize genes within a linkage disequilibrium (LD) block. We found that the genes with highest scores were: ABCA1, lipoprotein lipase (LPL) and cholesterol ester transfer protein, plasma for high-density lipoprotein; low-density lipoprotein receptor, APOE and APOB for low-density lipoprotein; and LPL, APOA1 and APOB for triglyceride. In addition, we identified genes relevant to lipid metabolism from the literature even in cases where such knowledge was not reflected in current annotation of these genes. These results demonstrate that ontology fingerprints can be used effectively to prioritize genes from GWA studies for experimental validation.
Contact: zhengw@musc.edu
Supplementary information: Supplementary data are available at Bioinformatics online.
doi:10.1093/bioinformatics/btp158
PMCID: PMC2732313  PMID: 19349285
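The hypergeometric enrichment test named in this abstract can be sketched as an upper-tail probability. The abstract does not give the exact counts or the similarity score used, so the sketch covers only the enrichment p-value for one GO term. The function name, argument names, and the example counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_pvalue(population, annotated, sample, overlap):
    """P(X >= overlap) for a hypergeometric draw.

    population: total number of abstracts considered
    annotated:  abstracts in the population that mention the GO term
    sample:     abstracts that discuss the gene of interest
    overlap:    abstracts in the sample that mention the GO term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the probability mass from the observed overlap to the maximum.
    tail = sum(
        comb(annotated, i) * comb(population - annotated, sample - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical counts, chosen only to show the call.
    p = hypergeom_enrichment_pvalue(
        population=10_000, annotated=200, sample=50, overlap=8
    )
    print(f"enrichment p-value: {p:.3e}")
```

Exact integer arithmetic through `math.comb` avoids a third-party dependency; for very large abstract collections a log-space implementation would likely be faster.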
19.  Underlying genetic models of inheritance in established type 2 diabetes associations 
American journal of epidemiology  2009;170(5):537-545.
For most associations of common polymorphisms with common diseases, the genetic model of inheritance is unknown. We extended and applied a Bayesian meta-analysis approach to data from 19 studies on 17 replicated associations for type 2 diabetes. For 13 polymorphisms, the data fit very well to an additive model, for 4 polymorphisms the data were consistent with either an additive or dominant model, and for 2 polymorphisms with an additive or recessive model of inheritance for the diabetes risk allele. Results were robust to using different priors and after excluding data where index polymorphisms had been examined indirectly through proxy markers. The Bayesian meta-analysis model yielded point estimates for the genetic effects that are very similar to those previously reported based on fixed or random effects models, but uncertainty about several of the effects was substantially larger. We also examined the extent of between-study heterogeneity in the genetic model and found generally small values of the between-study deviation for the genetic model parameter. Heterosis could not be excluded in 4 SNPs. Information on the genetic model of robustly replicated GWA-derived association signals may be useful for predictive modeling, and for designing biological and functional experiments.
doi:10.1093/aje/kwp145
PMCID: PMC2732984  PMID: 19602701
20.  Finding the missing heritability of complex diseases 
Nature  2009;461(7265):747-753.
Genome-wide association studies have identified hundreds of genetic variants associated with complex human diseases and traits, and have provided valuable insights into their genetic architecture. Most variants identified so far confer relatively small increments in risk, and explain only a small proportion of familial clustering, leading many to question how the remaining, ‘missing’ heritability can be explained. Here we examine potential sources of missing heritability and propose research strategies, including and extending beyond current genome-wide association approaches, to illuminate the genetics of complex diseases and enhance its potential to enable effective disease prevention or treatment.
doi:10.1038/nature08494
PMCID: PMC2831613  PMID: 19812666
21.  Comprehensive Association Study of Type 2 Diabetes and Related Quantitative Traits With 222 Candidate Genes 
Diabetes  2008;57(11):3136-3144.
OBJECTIVE—Type 2 diabetes is a common complex disorder with environmental and genetic components. We used a candidate gene–based approach to identify single nucleotide polymorphism (SNP) variants in 222 candidate genes that influence susceptibility to type 2 diabetes.
RESEARCH DESIGN AND METHODS—In a case-control study of 1,161 type 2 diabetic subjects and 1,174 control Finns who are normal glucose tolerant, we genotyped 3,531 tagSNPs and annotation-based SNPs and imputed an additional 7,498 SNPs, providing 99.9% coverage of common HapMap variants in the 222 candidate genes. Selected SNPs were genotyped in an additional 1,211 type 2 diabetic case subjects and 1,259 control subjects who are normal glucose tolerant, also from Finland.
RESULTS—Using SNP- and gene-based analysis methods, we replicated previously reported SNP-type 2 diabetes associations in PPARG, KCNJ11, and SLC2A2; identified significant SNPs in genes with previously reported associations (ENPP1 [rs2021966, P = 0.00026] and NRF1 [rs1882095, P = 0.00096]); and implicated novel genes, including RAPGEF1 (rs4740283, P = 0.00013) and TP53 (rs1042522, Arg72Pro, P = 0.00086), in type 2 diabetes susceptibility.
CONCLUSIONS—Our study provides an effective gene-based approach to association study design and analysis. One or more of the newly implicated genes may contribute to type 2 diabetes pathogenesis. Analysis of additional samples will be necessary to determine their effect on susceptibility.
doi:10.2337/db07-1731
PMCID: PMC2570412  PMID: 18678618
22.  Metabolic and cardiovascular traits: an abundance of recently identified common genetic variants 
Human Molecular Genetics  2008;17(R2):R102-R108.
Genome-wide association studies are providing new insights into the genetic basis of metabolic and cardiovascular traits. In the past 3 years, common variants in ∼50 loci have been strongly associated with metabolic and cardiovascular traits. Several of these loci have implicated genes without a previously known connection with metabolism. Further studies will be required to characterize the full impact of these loci on metabolism. Many of the identified loci include multiple independent variants that influence the same metabolic or cardiovascular trait and a few loci harbor independent variants that each influence distinct traits. The total proportion of trait heritability explained by variants identified so far is still modest (typically <10%). Future studies will build on these successes by identifying additional common and rare variants and by determining the functional impact of the underlying alleles and genes.
doi:10.1093/hmg/ddn275
PMCID: PMC2570060  PMID: 18852197
23.  Tissue-specific alternative splicing of TCF7L2 
Human Molecular Genetics  2009;18(20):3795-3804.
Common variants in the transcription factor 7-like 2 (TCF7L2) gene have been identified as the strongest genetic risk factors for type 2 diabetes (T2D). However, the mechanisms by which these non-coding variants increase risk for T2D are not well-established. We used 13 expression assays to survey mRNA expression of multiple TCF7L2 splicing forms in up to 380 samples from eight types of human tissue (pancreas, pancreatic islets, colon, liver, monocytes, skeletal muscle, subcutaneous adipose tissue and lymphoblastoid cell lines) and observed a tissue-specific pattern of alternative splicing. We tested whether the expression of TCF7L2 splicing forms was associated with single nucleotide polymorphisms (SNPs), rs7903146 and rs12255372, located within introns 3 and 4 of the gene and most strongly associated with T2D. Expression of two splicing forms was lower in pancreatic islets with increasing counts of T2D-associated alleles of the SNPs: a ubiquitous splicing form (P = 0.018 for rs7903146 and P = 0.020 for rs12255372) and a splicing form found in pancreatic islets, pancreas and colon but not in other tissues tested here (P = 0.009 for rs12255372 and P = 0.053 for rs7903146). Expression of this form in glucose-stimulated pancreatic islets correlated with expression of proinsulin (r² = 0.84–0.90, P < 0.00063). In summary, we identified a tissue-specific pattern of alternative splicing of TCF7L2. After adjustment for multiple tests, no association between expression of TCF7L2 in eight types of human tissue samples and T2D-associated genetic variants remained significant. Alternative splicing of TCF7L2 in pancreatic islets warrants future studies. GenBank Accession Numbers: FJ010164–FJ010174.
doi:10.1093/hmg/ddp321
PMCID: PMC2748888  PMID: 19602480
24.  Identification of ten loci associated with height highlights new biological pathways in human growth 
Nature genetics  2008;40(5):584-591.
Height is a classic polygenic trait, reflecting the combined influence of multiple as-yet-undiscovered genetic factors. We carried out a meta-analysis of genome-wide association study data of height from 15,821 individuals at 2.2 million SNPs, and followed up the strongest findings in >10,000 subjects. Ten newly identified and two previously reported loci were strongly associated with variation in height (P values from 4 × 10⁻⁷ to 8 × 10⁻²²). Together, these 12 loci account for ~2% of the population variation in height. Individuals with ≤8 height-increasing alleles and ≥16 height-increasing alleles differ in height by ~3.5 cm. The newly identified loci, along with several additional loci with strongly suggestive associations, encompass both strong biological candidates and unexpected genes, and highlight several pathways (let-7 targets, chromatin remodeling proteins and Hedgehog signaling) as important regulators of human stature. These results expand the picture of the biological regulation of human height and of the genetic architecture of this classical complex trait.
doi:10.1038/ng.125
PMCID: PMC2687076  PMID: 18391950
25.  Meta-Analysis of 23 Type 2 Diabetes Linkage Studies from the International Type 2 Diabetes Linkage Analysis Consortium 
Human Heredity  2007;66(1):35-49.
Background
The International Type 2 Diabetes Linkage Analysis Consortium was formed to localize type 2 diabetes predisposing variants based on 23 autosomal linkage scans.
Methods
We carried out meta-analysis using the genome scan meta-analysis (GSMA) method which divides the genome into bins of ∼30 cM, ranks the best linkage results in each bin for each sample, and then sums the ranks across samples. We repeated the meta-analysis using 2 cM bins, and/or replacing bin ranks with measures of linkage evidence: bin maximum LOD score or bin minimum p value for bins with p value <0.05 (truncated p value). We also carried out computer simulations to assess the empirical type I error rates of these meta-analysis methods.
Results
Our analyses provided modest evidence for type 2 diabetes-predisposing variants on chromosomes 4, 10, and 14 (using LOD scores or truncated p values), or chromosomes 10 and 16 (using ranks). Our simulation results suggested that uneven marker density across studies results in substantial variation in empirical type I error rates for all meta-analysis methods, but that 2 cM bins and scores that make more explicit use of linkage evidence, especially the truncated p values, reduce this problem.
Conclusion
We identified regions modestly linked with type 2 diabetes by summarizing results from 23 autosomal genome scans.
doi:10.1159/000114164
PMCID: PMC2855874  PMID: 18223311
Gene mapping; Genetics; GSMA; Linkage analysis; Meta-analysis; Type 2 diabetes
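The rank-and-sum step of the GSMA method described in the Methods above can be sketched as follows. This is a minimal, unweighted illustration. It assumes each study has already been reduced to one best linkage score per bin, that a higher score means stronger evidence and receives a higher rank, and that tied bins share the average rank. The function names, bin labels, and scores are hypothetical.

```python
def rank_bins(scores):
    """Rank bins within one study: rank 1 is the weakest evidence.

    `scores` maps bin label -> best linkage score in that bin.
    Tied bins receive the average of the ranks they span (an assumption).
    """
    ordered = sorted(scores, key=scores.get)
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        # Extend j over the run of bins tied with position i.
        while j + 1 < len(ordered) and scores[ordered[j + 1]] == scores[ordered[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for bin_label in ordered[i:j + 1]:
            ranks[bin_label] = average_rank
        i = j + 1
    return ranks


def summed_ranks(studies):
    """Sum within-study bin ranks across all studies."""
    totals = {}
    for scores in studies:
        for bin_label, rank in rank_bins(scores).items():
            totals[bin_label] = totals.get(bin_label, 0.0) + rank
    return totals


if __name__ == "__main__":
    # Hypothetical best scores per bin for two linkage scans.
    studies = [
        {"bin1": 0.5, "bin2": 2.1, "bin3": 1.0},
        {"bin1": 0.2, "bin2": 1.5, "bin3": 1.5},
    ]
    print(summed_ranks(studies))
```

Bins with the largest summed rank carry the most consistent evidence across scans. Assessing significance requires a null distribution for the summed ranks, which this sketch does not compute.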
