1.  Phenotype–Genotype Integrator (PheGenI): synthesizing genome-wide association study (GWAS) data with existing genomic resources 
Rapidly accumulating data from genome-wide association studies (GWASs) and other large-scale studies are most useful when synthesized with existing databases. To address this opportunity, we developed the Phenotype–Genotype Integrator (PheGenI), a user-friendly web interface that integrates various National Center for Biotechnology Information (NCBI) genomic databases with association data from the National Human Genome Research Institute GWAS Catalog and supports downloads of search results. Here, we describe the rationale for and development of this resource. Integrating over 66 000 association records with extensive single nucleotide polymorphism (SNP), gene, and expression quantitative trait loci data already available from the NCBI, PheGenI enables deeper investigation and interrogation of SNPs associated with a wide range of traits, facilitating the examination of the relationships between genetic variation and human diseases.
doi:10.1038/ejhg.2013.96
PMCID: PMC3865418  PMID: 23695286
database; data integration; genome sequence; genome-wide association study; phenotype; single nucleotide polymorphism
3.  Prospective Associations of Coronary Heart Disease Loci in African Americans Using the MetaboChip: The PAGE Study 
PLoS ONE  2014;9(12):e113203.
Background
Coronary heart disease (CHD) is a leading cause of morbidity and mortality in African Americans. However, there is a paucity of studies assessing genetic determinants of CHD in African Americans. We examined the association of published variants in CHD loci with incident CHD, attempted to fine map these loci, and characterize novel variants influencing CHD risk in African Americans.
Methods and Results
Up to 8,201 African Americans (including 546 first CHD events) were genotyped using the MetaboChip array in the Atherosclerosis Risk in Communities (ARIC) study and Women's Health Initiative (WHI). We tested associations using Cox proportional hazard models in sex- and study-stratified analyses and combined results using meta-analysis. Among 44 validated CHD loci available in the array, we replicated and fine-mapped the SORT1 locus, and showed the same direction of effects as reported in studies of individuals of European ancestry for SNPs in 22 additional published loci. We also identified a SNP achieving array-wide significance (MYC: rs2070583, allele frequency 0.02, P = 8.1×10−8), but the association did not replicate in an additional 8,059 African Americans (577 events) from the WHI, HealthABC and GeneSTAR studies, or in a meta-analysis of 5 cohort studies of European ancestry (24,024 individuals including 1,570 cases of MI and 2,406 cases of CHD) from the CHARGE Consortium.
Conclusions
Our findings suggest that some CHD loci previously identified in individuals of European ancestry may be relevant to incident CHD in African Americans.
doi:10.1371/journal.pone.0113203
PMCID: PMC4277270  PMID: 25542012
4.  No evidence of interaction between known lipid-associated genetic variants and smoking in the multi-ethnic PAGE population 
Human genetics  2013;132(12):1427-1431.
Genome-wide association studies (GWAS) have identified many variants that influence high-density lipoprotein cholesterol, low-density lipoprotein cholesterol, and/or triglycerides. However, environmental modifiers, such as smoking, of these known genotype–phenotype associations are just recently emerging in the literature. We have tested for interactions between smoking and 49 GWAS-identified variants in over 41,000 racially/ethnically diverse samples with lipid levels from the Population Architecture Using Genomics and Epidemiology (PAGE) study. Despite their biological plausibility, we were unable to detect significant SNP × smoking interactions.
doi:10.1007/s00439-013-1375-3
PMCID: PMC3895337  PMID: 24100633
5.  Imputation of coding variants in African Americans: better performance using data from the exome sequencing project 
Bioinformatics  2013;29(21):2744-2749.
Summary: Although the 1000 Genomes haplotypes are the most commonly used reference panel for imputation, medical sequencing projects are generating large alternate sets of sequenced samples. Imputation in African Americans using 3384 haplotypes from the Exome Sequencing Project, compared with 2184 haplotypes from 1000 Genomes Project, increased effective sample size by 8.3–11.4% for coding variants with minor allele frequency <1%. No loss of imputation quality was observed using a panel built from phenotypic extremes. We recommend using haplotypes from Exome Sequencing Project alone or concatenation of the two panels over quality score-based post-imputation selection or IMPUTE2’s two-panel combination.
Contact: yunli@med.unc.edu
Supplementary information: Supplementary data are available at Bioinformatics online.
doi:10.1093/bioinformatics/btt477
PMCID: PMC3799474  PMID: 23956302
6.  Association of the FTO Obesity Risk Variant rs8050136 With Percentage of Energy Intake From Fat in Multiple Racial/Ethnic Populations 
American Journal of Epidemiology  2013;178(5):780-790.
Common obesity risk variants have been associated with macronutrient intake; however, these associations' generalizability across populations has not been demonstrated. We investigated the associations between 6 obesity risk variants in (or near) the NEGR1, TMEM18, BDNF, FTO, MC4R, and KCTD15 genes and macronutrient intake (carbohydrate, protein, ethanol, and fat) in 3 Population Architecture using Genomics and Epidemiology (PAGE) studies: the Multiethnic Cohort Study (1993–2006) (n = 19,529), the Atherosclerosis Risk in Communities Study (1987–1989) (n = 11,114), and the Epidemiologic Architecture for Genes Linked to Environment (EAGLE) Study, which accesses data from the Third National Health and Nutrition Examination Survey (1991–1994) (n = 6,347). We used linear regression, with adjustment for age, sex, and ethnicity, to estimate the associations between obesity risk genotypes and macronutrient intake. A fixed-effects meta-analysis model showed that the FTO rs8050136 A allele (n = 36,973) was positively associated with percentage of calories derived from fat (βmeta = 0.2244 (standard error, 0.0548); P = 4 × 10−5) and inversely associated with percentage of calories derived from carbohydrate (βmeta = −0.2796 (standard error, 0.0709); P = 8 × 10−5). In the Multiethnic Cohort Study, percentage of calories from fat assessed at baseline was a partial mediator of the rs8050136 effect on body mass index (weight (kg)/height (m)2) obtained at 10 years of follow-up (mediation of effect = 0.0823 kg/m2, 95% confidence interval: 0.0559, 0.1128). Our data provide additional evidence that the association of FTO with obesity is partially mediated by dietary intake.
doi:10.1093/aje/kwt028
PMCID: PMC3755639  PMID: 23820787
energy intake; fat mass and obesity-associated (FTO) gene; obesity; percent calories from fat; race/ethnicity
7.  Post genome-wide association study challenges for lipid traits: describing age as a modifier of gene-lipid associations in the Population Architecture using Genomics and Epidemiology (PAGE) study 
Annals of human genetics  2013;77(5):416-425.
Summary
Numerous common genetic variants that influence plasma high-density lipoprotein cholesterol (HDL-C), low-density lipoprotein cholesterol (LDL-C), and triglyceride (TG) distributions have been identified via genome-wide association studies (GWAS). However, whether or not these associations are age dependent has largely been overlooked. We conducted an association study and meta-analysis in more than 22,000 European Americans between 49 previously identified GWAS variants and the three lipid traits, stratified by age (males: <50 or ≥50 years of age; females: pre- or post-menopausal). For each variant, a test of heterogeneity was performed between the two age strata and significant Phet values were used as evidence of age-specific genetic effects. We identified seven associations in females and eight in males that displayed suggestive heterogeneity by age (Phet<0.05). The association between rs174547 (FADS1) and LDL-C in males displayed the most evidence for heterogeneity between age groups (Phet=1.74E-03, I2=89.8), with a significant association in older males (P=1.39E-06) but not younger males (P=0.99). However, none of the suggestive modifying effects survived adjustment for multiple testing, highlighting the challenges of identifying modifiers of modest SNP-trait associations despite large sample sizes.
doi:10.1111/ahg.12027
PMCID: PMC3796061  PMID: 23808484
PAGE; modifier; age; lipids; genetic association
8.  Phenome-wide association studies demonstrating pleiotropy of genetic variants within FTO with and without adjustment for body mass index 
Frontiers in Genetics  2014;5:250.
Phenome-wide association studies (PheWAS) have demonstrated utility in validating genetic associations derived from traditional genetic studies as well as identifying novel genetic associations. Here we used an electronic health record (EHR)-based PheWAS to explore pleiotropy of genetic variants in the fat mass and obesity associated gene (FTO), some of which have been previously associated with obesity and type 2 diabetes (T2D). We used a population of 10,487 individuals of European ancestry with genome-wide genotyping from the Electronic Medical Records and Genomics (eMERGE) Network and another population of 13,711 individuals of European ancestry from the BioVU DNA biobank at Vanderbilt genotyped using Illumina HumanExome BeadChip. A meta-analysis of the two study populations replicated the well-described associations between FTO variants and obesity (odds ratio [OR] = 1.25, 95% Confidence Interval = 1.11–1.24, p = 2.10 × 10−9) and FTO variants and T2D (OR = 1.14, 95% CI = 1.08–1.21, p = 2.34 × 10−6). The meta-analysis also demonstrated that FTO variant rs8050136 was significantly associated with sleep apnea (OR = 1.14, 95% CI = 1.07–1.22, p = 3.33 × 10−5); however, the association was attenuated after adjustment for body mass index (BMI). Novel phenotype associations with obesity-associated FTO variants included fibrocystic breast disease (rs9941349, OR = 0.81, 95% CI = 0.74–0.91, p = 5.41 × 10−5) and trends toward associations with non-alcoholic liver disease and gram-positive bacterial infections. FTO variants not associated with obesity demonstrated other potential disease associations including non-inflammatory disorders of the cervix and chronic periodontitis. These results suggest that genetic variants in FTO may have pleiotropic associations, some of which are not mediated by obesity.
doi:10.3389/fgene.2014.00250
PMCID: PMC4134007  PMID: 25177340
PheWAS; genetic association; pleiotropy; Exome chip; FTO; BMI
9.  The Influence of Obesity-Related Single Nucleotide Polymorphisms on BMI Across the Life Course 
Diabetes  2013;62(5):1763-1767.
Evidence is limited as to whether heritable risk of obesity varies throughout adulthood. Among >34,000 European Americans, aged 18–100 years, from multiple U.S. studies in the Population Architecture using Genomics and Epidemiology (PAGE) Consortium, we examined evidence for heterogeneity in the associations of five established obesity risk variants (near FTO, GNPDA2, MTCH2, TMEM18, and NEGR1) with BMI across four distinct epochs of adulthood: 1) young adulthood (ages 18–25 years), adulthood (ages 26–49 years), middle-age adulthood (ages 50–69 years), and older adulthood (ages ≥70 years); or 2) by menopausal status in women and stratification by age 50 years in men. Summary-effect estimates from each meta-analysis were compared for heterogeneity across the life epochs. We found heterogeneity in the association of the FTO (rs8050136) variant with BMI across the four adulthood epochs (P = 0.0006), with larger effects in young adults relative to older adults (β [SE] = 1.17 [0.45] vs. 0.09 [0.09] kg/m2, respectively, per A allele) and smaller intermediate effects. We found no evidence for heterogeneity in the association of GNPDA2, MTCH2, TMEM18, and NEGR1 with BMI across adulthood. Genetic predisposition to obesity may have greater effects on body weight in young compared with older adulthood for FTO, suggesting changes by age, generation, or secular trends. Future research should compare and contrast our findings with results using longitudinal data.
doi:10.2337/db12-0863
PMCID: PMC3636619  PMID: 23300277
10.  WikiGWA: an open platform for collecting and using genome-wide association results 
The number of discovered genetic variants from genome-wide association (GWA) studies (GWAS) has been growing rapidly. Centralized efforts such as the National Human Genome Research Institute's GWAS catalog provide regular updates and a convenient interface for quick lookup. However, the catalog entries are manually curated and rely on data from published articles. Other tools such as SNPedia (http://www.snpedia.com) collect published results regarding functional consequences of genetic variations. Here, we propose an approach that allows individual investigators to share their GWA results through an open platform. Unlike the GWAS catalog or SNPedia, wikiGWA collects first-hand GWAS results, and on a much larger scale. Investigators are able not only to post a much larger amount of results, but also to post results from unpublished studies, which could alleviate publication bias and facilitate identification of weak signals. Our interface allows for flexible and fast queries, and the query results are formatted to work seamlessly with the LocusZoom program for visualization and annotation. We here describe wikiGWA, made publicly available at http://www.wikiGWA.org.
doi:10.1038/ejhg.2012.187
PMCID: PMC3598322  PMID: 22929026
genome-wide association; open platform; bioinformatics
11.  Systematic comparison of phenome-wide association study of electronic medical record data and genome-wide association study data 
Nature biotechnology  2013;31(12):1102-1110.
Candidate gene and genome-wide association studies (GWAS) have identified genetic variants that modulate risk for human disease; many of these associations require further study to replicate the results. Here we report the first large-scale application of the phenome-wide association study (PheWAS) paradigm within electronic medical records (EMRs), an unbiased approach to replication and discovery that interrogates relationships between targeted genotypes and multiple phenotypes. We scanned for associations between 3,144 single-nucleotide polymorphisms (previously implicated by GWAS as mediators of human traits) and 1,358 EMR-derived phenotypes in 13,835 individuals of European ancestry. This PheWAS replicated 66% (51/77) of sufficiently powered prior GWAS associations and revealed 63 potentially pleiotropic associations with P < 4.6 × 10−6 (false discovery rate < 0.1); the strongest of these novel associations were replicated in an independent cohort (n = 7,406). These findings validate PheWAS as a tool to allow unbiased interrogation across multiple phenotypes in EMR-based cohorts and to enhance analysis of the genomic basis of human disease.
doi:10.1038/nbt.2749
PMCID: PMC3969265  PMID: 24270849
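The abstract above reports its associations at a false discovery rate below 0.1 without naming the procedure used. As a hedged illustration of how an FDR threshold can be applied to a list of P values, the sketch below implements the Benjamini-Hochberg step-up procedure. Using this procedure here is an assumption, and the function name is hypothetical.

```python
def benjamini_hochberg(p_values, fdr=0.1):
    """Return a list of booleans marking which tests pass the BH threshold.

    Illustrative only: the abstract does not say which FDR method was used.
    """
    m = len(p_values)
    # Indices of the tests, ordered from smallest to largest P value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k with p_(k) <= fdr * k / m.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= fdr * rank / m:
            cutoff_rank = rank
    # Every test ranked at or below the cutoff is declared significant.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected
```

The step-up rule keeps every test ranked at or below the cutoff, even if an individual P value in between exceeded its own rank-specific threshold.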
12.  Pleiotropic Associations of Risk Variants Identified for Other Cancers With Lung Cancer Risk: The PAGE and TRICL Consortia 
Background
Genome-wide association studies have identified hundreds of genetic variants associated with specific cancers. A few of these risk regions have been associated with more than one cancer site; however, a systematic evaluation of the associations between risk variants for other cancers and lung cancer risk has yet to be performed.
Methods
We included 18023 patients with lung cancer and 60543 control subjects from two consortia, Population Architecture using Genomics and Epidemiology (PAGE) and Transdisciplinary Research in Cancer of the Lung (TRICL). We examined 165 single-nucleotide polymorphisms (SNPs) that were previously associated with at least one of 16 non–lung cancer sites. Study-specific logistic regression results underwent meta-analysis, and associations were also examined by race/ethnicity, histological cell type, sex, and smoking status. A Bonferroni-corrected P value of 2.5×10–5 was used to assign statistical significance.
Results
The breast cancer SNP LSP1 rs3817198 was associated with an increased risk of lung cancer (odds ratio [OR] = 1.10; 95% confidence interval [CI] = 1.05 to 1.14; P = 2.8×10–6). This association was strongest for women with adenocarcinoma (P = 1.2×10–4) and not statistically significant in men (P = .14) with this cell type (P het by sex = .10). Two glioma risk variants, TERT rs2853676 and CDKN2BAS1 rs4977756, which are located in regions previously associated with lung cancer, were associated with increased risk of adenocarcinoma (OR = 1.16; 95% CI = 1.10 to 1.22; P = 1.1×10–8) and squamous cell carcinoma (OR = 1.13; CI = 1.07 to 1.19; P = 2.5×10–5), respectively.
Conclusions
Our findings demonstrate a novel pleiotropic association between the breast cancer LSP1 risk region marked by variant rs3817198 and lung cancer risk.
doi:10.1093/jnci/dju061
PMCID: PMC3982896  PMID: 24681604
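The odds ratios, 95% confidence intervals, and P values reported in the abstract above are linked by simple arithmetic on the log-odds scale. The sketch below recovers an approximate standard error, z statistic, and two-sided P value from a published odds ratio and its confidence interval. It assumes a Wald-type interval that is symmetric on the log scale, and the function name is hypothetical.

```python
import math

def z_and_p_from_or(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Approximate SE, z, and two-sided P from an OR and its 95% CI.

    Illustrative only: assumes the interval is exp(log(OR) +/- z_crit * SE).
    """
    log_or = math.log(odds_ratio)
    # The interval width on the log scale equals 2 * z_crit * SE.
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = log_or / se
    # Two-sided normal P value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return se, z, p

# Example call using the rounded values reported for LSP1 rs3817198.
se, z, p = z_and_p_from_or(1.10, 1.05, 1.14)
```

Because published odds ratios and interval bounds are rounded to two decimals, a P value recovered this way only approximates the reported one.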
13.  A survey of informatics approaches to whole-exome and whole-genome clinical reporting in the electronic health record 
Purpose
Genome-scale clinical sequencing is being adopted more broadly in medical practice. The National Institutes of Health developed the Clinical Sequencing Exploratory Research (CSER) program to guide implementation and dissemination of best practices for the integration of sequencing into clinical care. This study describes and compares the state of the art of incorporating whole-exome and whole-genome sequencing results into the electronic health record, including approaches to decision support across the six current CSER sites.
Methods
The CSER Medical Record Working Group collaboratively developed and completed an in-depth survey to assess the communication of genome-scale data into the electronic health record. We summarized commonalities and divergent approaches.
Results
Despite common sequencing platform (Illumina) adoptions, there is a great diversity of approaches to annotation tools and workflow, as well as to report generation. At all sites, reports are human-readable structured documents available as passive decision support in the electronic health record. Active decision support is in early implementation at two sites.
Conclusion
The parallel efforts across CSER sites in the creation of systems for report generation and integration of reports into the electronic health record, as well as the lack of standardized approaches to interfacing with variant databases to create active clinical decision support, create opportunities for cross-site and vendor collaborations.
doi:10.1038/gim.2013.120
PMCID: PMC3951437  PMID: 24071794
clinical decision support; clinical sequencing; decision support rules; electronic health record; electronic medical record; next-generation sequencing
14.  Pleiotropy of Cancer Susceptibility Variants on the Risk of Non-Hodgkin Lymphoma: The PAGE Consortium 
PLoS ONE  2014;9(3):e89791.
Background
Risk of non-Hodgkin lymphoma (NHL) is higher among individuals with a family history or a prior diagnosis of other cancers. Genome-wide association studies (GWAS) have suggested that some genetic susceptibility variants are associated with multiple complex traits (pleiotropy).
Objective
We investigated whether common risk variants identified in cancer GWAS may also increase the risk of developing NHL as the first primary cancer.
Methods
As part of the Population Architecture using Genomics and Epidemiology (PAGE) consortium, 113 cancer risk variants were analyzed in 1,441 NHL cases and 24,183 controls from three studies (BioVU, Multiethnic Cohort Study, Women's Health Initiative) for their association with the risk of overall NHL and common subtypes [diffuse large B-cell lymphoma (DLBCL), follicular lymphoma (FL), chronic lymphocytic leukemia or small lymphocytic lymphoma (CLL/SLL)] using an additive genetic model adjusted for age, sex and ethnicity. Study-specific results for each variant were meta-analyzed across studies.
Results
The analysis of NHL subtype-specific GWAS SNPs and overall NHL suggested a shared genetic susceptibility between FL and DLBCL, particularly involving variants in the major histocompatibility complex region (rs6457327 in 6p21.33: FL OR = 1.29, p = 0.013; DLBCL OR = 1.23, p = 0.013; NHL OR = 1.22, p = 5.9 × 10−5). In the pleiotropy analysis, six risk variants for other cancers were associated with NHL risk, including variants for lung (rs401681 in TERT: OR per C allele = 0.89, p = 3.7 × 10−3; rs4975616 in TERT: OR per A allele = 0.90, p = 0.01; rs3131379 in MSH5: OR per T allele = 1.16, p = 0.03), prostate (rs7679673 in TET2: OR per C allele = 0.89, p = 5.7 × 10−3; rs10993994 in MSMB: OR per T allele = 1.09, p = 0.04), and breast (rs3817198 in LSP1: OR per C allele = 1.12, p = 0.01) cancers, but none of these associations remained significant after multiple test correction.
Conclusion
This study does not support strong pleiotropic effects of non-NHL cancer risk variants in NHL etiology; however, larger studies are warranted.
doi:10.1371/journal.pone.0089791
PMCID: PMC3943855  PMID: 24598796
15.  The NHGRI GWAS Catalog, a curated resource of SNP-trait associations 
Nucleic Acids Research  2013;42(Database issue):D1001-D1006.
The National Human Genome Research Institute (NHGRI) Catalog of Published Genome-Wide Association Studies (GWAS Catalog) provides a publicly available, manually curated collection of published GWAS assaying at least 100 000 single-nucleotide polymorphisms (SNPs) and all SNP-trait associations with P < 1 × 10−5. The Catalog includes 1751 curated publications of 11 912 SNPs. In addition to the SNP-trait association data, the Catalog also publishes a quarterly diagram of all SNP-trait associations mapped to the SNPs’ chromosomal locations. The Catalog can be accessed via a tabular web interface, via a dynamic visualization on the human karyotype, as a downloadable tab-delimited file and as an OWL knowledge base. This article presents a number of recent improvements to the Catalog, including novel ways for users to interact with the Catalog and changes to the curation infrastructure.
doi:10.1093/nar/gkt1229
PMCID: PMC3965119  PMID: 24316577
16.  Genetic variants associated with fasting glucose and insulin concentrations in an ethnically diverse population: results from the Population Architecture using Genomics and Epidemiology (PAGE) study 
BMC Medical Genetics  2013;14:98.
Background
Multiple genome-wide association studies (GWAS) within European populations have implicated common genetic variants associated with insulin and glucose concentrations. In contrast, few studies have been conducted within minority groups, which carry the highest burden of impaired glucose homeostasis and type 2 diabetes in the U.S.
Methods
As part of the Population Architecture using Genomics and Epidemiology (PAGE) Consortium, we investigated the association of up to 10 GWAS-identified single nucleotide polymorphisms (SNPs) in 8 genetic regions with glucose or insulin concentrations in up to 36,579 non-diabetic subjects, including 23,323 European Americans (EA), 7,526 African Americans (AA), 3,140 Hispanics, 1,779 American Indians (AI), and 811 Asians. We estimated the association between each SNP and fasting glucose or log-transformed fasting insulin, followed by meta-analysis to combine results across PAGE sites.
Results
Overall, our results show that 9/9 GWAS SNPs are associated with glucose in EA (p = 0.04 to 9 × 10−15), versus 3/9 in AA (p = 0.03 to 6 × 10−5), 3/4 SNPs in Hispanics, 2/4 SNPs in AI, and 1/2 SNPs in Asians. For insulin we observed a significant association with rs780094/GCKR in EA, Hispanics and AI only.
Conclusions
Generalization of results across multiple racial/ethnic groups helps confirm the relevance of some of these loci for glucose and insulin metabolism. Lack of association in non-EA groups may be due to insufficient power, or to unique patterns of linkage disequilibrium.
doi:10.1186/1471-2350-14-98
PMCID: PMC3849560  PMID: 24063630
17.  Generalization and Dilution of Association Results from European GWAS in Populations of Non-European Ancestry: The PAGE Study 
PLoS Biology  2013;11(9):e1001661.
A multi-ethnic study demonstrates that the extrapolation of genetic disease risk models from European populations to other ethnicities is compromised more strongly by genetic structure than by environmental or global genetic background in differential genetic risk associations across ethnicities.
The vast majority of genome-wide association study (GWAS) findings reported to date are from populations with European Ancestry (EA), and it is not yet clear how broadly the genetic associations described will generalize to populations of diverse ancestry. The Population Architecture Using Genomics and Epidemiology (PAGE) study is a consortium of multi-ancestry, population-based studies formed with the objective of refining our understanding of the genetic architecture of common traits emerging from GWAS. In the present analysis of five common diseases and traits, including body mass index, type 2 diabetes, and lipid levels, we compare direction and magnitude of effects for GWAS-identified variants in multiple non-EA populations against EA findings. We demonstrate that, in all populations analyzed, a significant majority of GWAS-identified variants have allelic associations in the same direction as in EA, with none showing a statistically significant effect in the opposite direction, after adjustment for multiple testing. However, 25% of tagSNPs identified in EA GWAS have significantly different effect sizes in at least one non-EA population, and these differential effects were most frequent in African Americans where all differential effects were diluted toward the null. We demonstrate that differential LD between tagSNPs and functional variants within populations contributes significantly to dilute effect sizes in this population. Although most variants identified from GWAS in EA populations generalize to all non-EA populations assessed, genetic models derived from GWAS findings in EA may generate spurious results in non-EA populations due to differential effect sizes. Regardless of the origin of the differential effects, caution should be exercised in applying any genetic risk prediction model based on tagSNPs outside of the ancestry group in which it was derived. 
Models based directly on functional variation may generalize more robustly, but the identification of functional variants remains challenging.
Author Summary
The number of known associations between human diseases and common genetic variants has grown dramatically in the past decade, most being identified in large-scale genetic studies of people of Western European origin. But because the frequencies of genetic variants can differ substantially between continental populations, it's important to assess how well these associations can be extended to populations with different continental ancestry. Are the correlations between genetic variants, disease endpoints, and risk factors consistent enough for genetic risk models to be reliably applied across different ancestries? Here we describe a systematic analysis of disease outcome and risk-factor–associated variants (tagSNPs) identified in European populations, in which we test whether the effect size of a tagSNP is consistent across six populations with significant non-European ancestry. We demonstrate that although nearly all such tagSNPs have effects in the same direction across all ancestries (i.e., variants associated with higher risk in Europeans will also be associated with higher risk in other populations), roughly a quarter of the variants tested have significantly different magnitude of effect (usually lower) in at least one non-European population. We therefore advise caution in the use of tagSNP-based genetic disease risk models in populations that have a different genetic ancestry from the population in which original associations were first made. We then show that this differential strength of association can be attributed to population-dependent variations in the correlation between tagSNPs and the variant that actually determines risk—the so-called functional variant. Risk models based on functional variants are therefore likely to be more robust than tagSNP-based models.
doi:10.1371/journal.pbio.1001661
PMCID: PMC3775722  PMID: 24068893
18.  Consistent Directions of Effect for Established Type 2 Diabetes Risk Variants Across Populations 
Diabetes  2012;61(6):1642-1647.
Common genetic risk variants for type 2 diabetes (T2D) have primarily been identified in populations of European and Asian ancestry. We tested whether the direction of association with 20 T2D risk variants generalizes across six major racial/ethnic groups in the U.S. as part of the Population Architecture using Genomics and Epidemiology Consortium (16,235 diabetes case and 46,122 control subjects of European American, African American, Hispanic, East Asian, American Indian, and Native Hawaiian ancestry). The percentage of positive (odds ratio [OR] >1 for putative risk allele) associations ranged from 69% in American Indians to 100% in European Americans. Of the nine variants where we observed significant heterogeneity of effect by racial/ethnic group (Pheterogeneity < 0.05), eight were positively associated with risk (OR >1) in at least five groups. The marked directional consistency of association observed for most genetic variants across populations implies a shared functional common variant in each region. Fine-mapping of all loci will be required to reveal markers of risk that are important within and across populations.
doi:10.2337/db11-1296
PMCID: PMC3357304  PMID: 22474029
19.  Investigation of gene-by-sex interactions for lipid traits in diverse populations from the population architecture using genomics and epidemiology study 
BMC Genetics  2013;14:33.
Background
High-density lipoprotein cholesterol (HDL-C), low-density lipoprotein cholesterol (LDL-C), and triglyceride (TG) levels are influenced by both genes and the environment. Genome-wide association studies (GWAS) have identified ~100 common genetic variants associated with HDL-C, LDL-C, and/or TG levels, mostly in populations of European descent, but little is known about the modifiers of these associations. Here, we investigated whether GWAS-identified SNPs for lipid traits exhibited heterogeneity by sex in the Population Architecture using Genomics and Epidemiology (PAGE) study.
Results
A sex-stratified meta-analysis was performed for 49 GWAS-identified SNPs for fasting HDL-C, LDL-C, and ln(TG) levels among adults self-identified as European American (25,013). Heterogeneity by sex was established when phet < 0.001. There was evidence for heterogeneity by sex for two SNPs for ln(TG) in the APOA1/C3/A4/A5/BUD13 gene cluster: rs28927680 (phet = 7.4 × 10−7) and rs3135506 (phet = 4.3 × 10−4), one SNP in PLTP for HDL levels (rs7679; phet = 9.9 × 10−4), and one in HMGCR for LDL levels (rs12654264; phet = 3.1 × 10−5). We replicated heterogeneity by sex in five of seventeen loci previously reported by genome-wide studies (binomial p = 0.0009). We also present results for other racial/ethnic groups in the supplementary materials, to provide a resource for future meta-analyses.
Conclusions
We provide further evidence for sex-specific effects of SNPs in the APOA1/C3/A4/A5/BUD13 gene cluster, PLTP, and HMGCR on fasting triglyceride levels in European Americans from the PAGE study. Our findings emphasize the need for considering context-specific effects when interpreting genetic associations emerging from GWAS, and also highlight the difficulties in replicating interaction effects across studies and across racial/ethnic groups.
doi:10.1186/1471-2156-14-33
PMCID: PMC3669109  PMID: 23634756
Lipids; Genetics; Cardiovascular disease; Heterogeneity; Sex-specific effect; Association study
20.  Associations Between Incident Ischemic Stroke Events and Stroke and Cardiovascular Disease-Related GWAS SNPs in the Population Architecture Using Genomics and Epidemiology (PAGE) Study 
Background
Genome-wide association studies (GWAS) have identified loci associated with ischemic stroke (IS) and cardiovascular disease (CVD) in European-descent individuals, but their replication in different populations has been largely unexplored.
Methods and Results
Nine single-nucleotide polymorphisms (SNPs) selected from GWAS and meta-analyses of stroke and 86 SNPs previously associated with myocardial infarction and CVD risk factors including blood lipids (HDL, LDL, triglycerides), type 2 diabetes and body mass index were investigated for associations with incident IS in European Americans (EA; N = 26,276), African Americans (AA; N = 8,970), and American Indians (AI; N = 3,570) from the Population Architecture using Genomics and Epidemiology Study. Ancestry-specific fixed effects meta-analysis with inverse variance weighting was used to combine study-specific log hazard ratios from Cox proportional hazards models. Two of 9 stroke SNPs (rs783396 and rs1804689) were associated with increased IS hazard in AA; none were significant in this large EA cohort. Of 73 CVD risk factor SNPs tested in EA, two (HDL and triglycerides SNPs) were associated with IS. In AA, SNPs associated with LDL, HDL and BMI were significantly associated with IS (3 of 86 SNPs tested). Out of 58 SNPs tested in AI, one LDL SNP was significantly associated with IS.
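The pooling step named above, fixed-effects meta-analysis with inverse variance weighting of study-specific log hazard ratios, can be sketched as follows. This is a minimal illustration rather than the study's code; the function name and the example inputs are hypothetical.

```python
import math

def fixed_effect_meta(log_hrs, std_errs):
    """Pool study-specific log hazard ratios with inverse-variance weights.

    Returns the pooled log hazard ratio, its standard error, and a
    two-sided p-value from a normal approximation.
    """
    weights = [1.0 / se ** 2 for se in std_errs]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, log_hrs)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return pooled, pooled_se, p_value

# Hypothetical per-study estimates for one SNP within one ancestry group.
pooled, pooled_se, p_value = fixed_effect_meta([0.10, 0.30], [0.10, 0.10])
pooled_hr = math.exp(pooled)  # back-transform to the hazard ratio scale
```

Studies with smaller standard errors receive larger weights, so the pooled estimate is dominated by the most precise cohorts.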
Conclusions
Our analyses, which show a lack of replication despite reasonable power for many stroke SNPs and differing results by ancestry, highlight the need to follow up on GWAS findings and to conduct genetic association studies in diverse populations. We found modest IS associations with BMI and lipid SNPs, though these findings require confirmation.
doi:10.1161/CIRCGENETICS.111.962191
PMCID: PMC3402178  PMID: 22403240
genetics of stroke; risk factors for stroke; genetics of cardiovascular disease; epidemiology
21.  Trans-Ethnic Fine-Mapping of Lipid Loci Identifies Population-Specific Signals and Allelic Heterogeneity That Increases the Trait Variance Explained 
Wu, Ying | Waite, Lindsay L. | Jackson, Anne U. | Sheu, Wayne H-H. | Buyske, Steven | Absher, Devin | Arnett, Donna K. | Boerwinkle, Eric | Bonnycastle, Lori L. | Carty, Cara L. | Cheng, Iona | Cochran, Barbara | Croteau-Chonka, Damien C. | Dumitrescu, Logan | Eaton, Charles B. | Franceschini, Nora | Guo, Xiuqing | Henderson, Brian E. | Hindorff, Lucia A. | Kim, Eric | Kinnunen, Leena | Komulainen, Pirjo | Lee, Wen-Jane | Le Marchand, Loic | Lin, Yi | Lindström, Jaana | Lingaas-Holmen, Oddgeir | Mitchell, Sabrina L. | Narisu, Narisu | Robinson, Jennifer G. | Schumacher, Fred | Stančáková, Alena | Sundvall, Jouko | Sung, Yun-Ju | Swift, Amy J. | Wang, Wen-Chang | Wilkens, Lynne | Wilsgaard, Tom | Young, Alicia M. | Adair, Linda S. | Ballantyne, Christie M. | Bůžková, Petra | Chakravarti, Aravinda | Collins, Francis S. | Duggan, David | Feranil, Alan B. | Ho, Low-Tone | Hung, Yi-Jen | Hunt, Steven C. | Hveem, Kristian | Juang, Jyh-Ming J. | Kesäniemi, Antero Y. | Kuusisto, Johanna | Laakso, Markku | Lakka, Timo A. | Lee, I-Te | Leppert, Mark F. | Matise, Tara C. | Moilanen, Leena | Njølstad, Inger | Peters, Ulrike | Quertermous, Thomas | Rauramaa, Rainer | Rotter, Jerome I. | Saramies, Jouko | Tuomilehto, Jaakko | Uusitupa, Matti | Wang, Tzung-Dau | Boehnke, Michael | Haiman, Christopher A. | Chen, Yii-Der I. | Kooperberg, Charles | Assimes, Themistocles L. | Crawford, Dana C. | Hsiung, Chao A. | North, Kari E. | Mohlke, Karen L.
PLoS Genetics  2013;9(3):e1003379.
Genome-wide association studies (GWAS) have identified ∼100 loci associated with blood lipid levels, but much of the trait heritability remains unexplained, and at most loci the identities of the trait-influencing variants remain unknown. We conducted a trans-ethnic fine-mapping study at 18, 22, and 18 GWAS loci on the Metabochip for their association with triglycerides (TG), high-density lipoprotein cholesterol (HDL-C), and low-density lipoprotein cholesterol (LDL-C), respectively, in individuals of African American (n = 6,832), East Asian (n = 9,449), and European (n = 10,829) ancestry. We aimed to identify the variants with strongest association at each locus, identify additional and population-specific signals, refine association signals, and assess the relative significance of previously described functional variants. Among the 58 loci, 33 exhibited evidence of association at P<1×10−4 in at least one ancestry group. Sequential conditional analyses revealed that ten, nine, and four loci in African Americans, Europeans, and East Asians, respectively, exhibited two or more signals. At these loci, accounting for all signals led to a 1.3- to 1.8-fold increase in the explained phenotypic variance compared to the strongest signals. Distinct signals across ancestry groups were identified at PCSK9 and APOA5. Trans-ethnic analyses narrowed the signals to smaller sets of variants at GCKR, PPP1R3B, ABO, LCAT, and ABCA1. Of 27 variants reported previously to have functional effects, 74% exhibited the strongest association at the respective signal. In conclusion, trans-ethnic high-density genotyping and analysis confirm the presence of allelic heterogeneity, allow the identification of population-specific variants, and limit the number of candidate SNPs for functional studies.
Author Summary
Lipid traits are heritable, but many of the DNA variants that influence lipid levels remain unknown. In a genomic region, more than one variant may affect gene expression or function, and the frequencies of these variants can differ across populations. Genotyping densely spaced variants in individuals with different ancestries may increase the chance of identifying variants that affect gene expression or function. We analyzed high-density genotyped variants for association with TG, HDL-C, and LDL-C in African Americans, East Asians, and Europeans. At several genomic regions, we provide evidence that two or more variants can influence lipid traits; across loci, these additional signals increase the proportion of trait variation that can be explained by genes. At some association signals shared across populations, combining data from individuals of different ancestries narrowed the set of likely functional variants. At PCSK9 and APOA5, the data suggest that different variants influence trait levels in different populations. Variants previously reported to alter gene expression or function frequently exhibited the strongest association at those signals. The multiple signals and population-specific characteristics of the loci described here may be shared by genetic loci for other complex traits.
doi:10.1371/journal.pgen.1003379
PMCID: PMC3605054  PMID: 23555291
22.  Genetic Variation and Reproductive Timing: African American Women from the Population Architecture Using Genomics and Epidemiology (PAGE) Study 
PLoS ONE  2013;8(2):e55258.
Age at menarche (AM) and age at natural menopause (ANM) define the boundaries of the reproductive lifespan in women. Their timing is associated with various diseases, including cancer and cardiovascular disease. Genome-wide association studies have identified several genetic variants associated with either AM or ANM in women of largely European or Asian descent. The extent to which these associations generalize to diverse populations remains unknown. Therefore, we sought to replicate previously reported AM and ANM findings and to identify novel AM and ANM variants using the Metabochip (n = 161,098 SNPs) in 4,159 and 1,860 African American women, respectively, in the Women’s Health Initiative (WHI) and Atherosclerosis Risk in Communities (ARIC) studies, as part of the Population Architecture using Genomics and Epidemiology (PAGE) Study. We replicated or generalized one previously identified variant for AM, rs1361108/CENPW, and two variants for ANM, rs897798/BRSK1 and rs769450/APOE, to our African American cohort. Overall, generalization of the majority of previously identified variants for AM and ANM, including LIN28B and MCM8, was not observed in this African American sample. We identified three novel loci associated with ANM that reached significance after multiple testing correction (LDLR rs189596789, p = 5×10−08; KCNQ1 rs79972789, p = 1.9×10−07; COL4A3BP rs181686584, p = 2.9×10−07). Our most significant AM association was upstream of RSF1, a gene implicated in ovarian and breast cancers (rs11604207, p = 1.6×10−06). While most associations were identified in either AM or ANM, we did identify genes suggestively associated with both: PHACTR1 and ARHGAP42. The lack of generalization, coupled with the potentially novel associations identified here, emphasizes the need for additional genetic discovery efforts for AM and ANM in diverse populations.
doi:10.1371/journal.pone.0055258
PMCID: PMC3570525  PMID: 23424626
23.  Genotype Imputation of Metabochip SNPs Using a Study-Specific Reference Panel of ~4,000 Haplotypes in African Americans From the Women’s Health Initiative 
Genetic Epidemiology  2012;36(2):107-117.
Genetic imputation has become standard practice in modern genetic studies. However, several important issues have not been adequately addressed, including the utility of a study-specific reference, performance in admixed populations, and quality for less common (minor allele frequency [MAF] 0.005–0.05) and rare (MAF < 0.005) variants. These issues only recently became addressable with genome-wide association study (GWAS) follow-up studies using dense genotyping or sequencing in large samples of non-European individuals. In this work, we constructed a study-specific reference panel of 3,924 haplotypes using African Americans in the Women’s Health Initiative (WHI) genotyped on both the Metabochip and the Affymetrix 6.0 GWAS platform. We used this reference panel to impute into 6,459 WHI SNP Health Association Resource (SHARe) study subjects with only GWAS genotypes. Our analysis confirmed the imputation quality metric Rsq (estimated r2, specific to each SNP) as an effective post-imputation filter. We recommend different Rsq thresholds for different MAF categories such that the average (across SNPs) Rsq is above the desired dosage r2 (squared Pearson correlation between imputed and experimental genotypes). With a desired dosage r2 of 80%, 99.9% (97.5%, 83.6%, 52.0%, 20.5%) of SNPs with MAF > 0.05 (0.03–0.05, 0.01–0.03, 0.005–0.01, and 0.001–0.005) passed the post-imputation filter. The average dosage r2 for these SNPs is 94.7%, 92.1%, 89.0%, 83.1%, and 79.7%, respectively. These results suggest that for African Americans, imputation of Metabochip SNPs from GWAS data, including low frequency SNPs with MAF 0.005–0.05, is feasible and worthwhile for power increase in downstream association analysis, provided a sizable reference panel is available.
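The dosage r2 defined above (the squared Pearson correlation between imputed and experimental genotypes) can be computed per SNP as in the sketch below. This is an illustration under stated assumptions, not the authors' pipeline: the function name and the example dosages are hypothetical, and genotypes are assumed to be coded as allele counts between 0 and 2.

```python
def dosage_r2(imputed, observed):
    """Squared Pearson correlation between imputed dosages and typed genotypes.

    Both inputs are per-sample values for a single SNP. Returns NaN when
    either vector has no variance, since the correlation is then undefined.
    """
    n = len(imputed)
    if n != len(observed) or n < 2:
        raise ValueError("need two equal-length vectors with at least 2 samples")
    mean_x = sum(imputed) / n
    mean_y = sum(observed) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(imputed, observed))
    sxx = sum((x - mean_x) ** 2 for x in imputed)
    syy = sum((y - mean_y) ** 2 for y in observed)
    if sxx == 0 or syy == 0:
        return float("nan")
    return (sxy * sxy) / (sxx * syy)

# Hypothetical dosages for five samples at one SNP.
r2 = dosage_r2([0.1, 0.9, 1.8, 0.0, 1.1], [0, 1, 2, 0, 1])
passes = r2 >= 0.80  # the 80% target quoted in the abstract
```

Note that this quantity requires experimental genotypes, whereas Rsq is estimated from the imputation itself; that difference is why the abstract treats Rsq as a filter calibrated against dosage r2.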
doi:10.1002/gepi.21603
PMCID: PMC3410659  PMID: 22851474
genotype imputation; Metabochip; internal reference; African Americans; rare variants
24.  Phenome-Wide Association Study (PheWAS) for Detection of Pleiotropy within the Population Architecture using Genomics and Epidemiology (PAGE) Network 
PLoS Genetics  2013;9(1):e1003087.
Using a phenome-wide association study (PheWAS) approach, we comprehensively tested genetic variants for association with phenotypes available for 70,061 study participants in the Population Architecture using Genomics and Epidemiology (PAGE) network. Our aim was to better characterize the genetic architecture of complex traits and identify novel pleiotropic relationships. This PheWAS drew on five population-based studies representing four major racial/ethnic groups (European Americans (EA), African Americans (AA), Hispanics/Mexican-Americans, and Asian/Pacific Islanders) in PAGE, each site with measurements for multiple traits, associated laboratory measures, and intermediate biomarkers. A total of 83 single nucleotide polymorphisms (SNPs) identified by genome-wide association studies (GWAS) were genotyped across two or more PAGE study sites. Comprehensive tests of association, stratified by race/ethnicity, were performed, encompassing 4,706 phenotypes mapped to 105 phenotype-classes, and association results were compared across study sites. A total of 111 PheWAS results had significant associations for two or more PAGE study sites with consistent direction of effect with a significance threshold of p<0.01 for the same racial/ethnic group, SNP, and phenotype-class. Among results identified for SNPs previously associated with phenotypes such as lipid traits, type 2 diabetes, and body mass index, 52 replicated previously published genotype–phenotype associations, 26 represented phenotypes closely related to previously known genotype–phenotype associations, and 33 represented potentially novel genotype–phenotype associations with pleiotropic effects. The majority of the potentially novel results were for single PheWAS phenotype-classes, for example, for CDKN2A/B rs1333049 (previously associated with type 2 diabetes in EA) a PheWAS association was identified for hemoglobin levels in AA. 
Of note, however, GALNT2 rs2144300 (previously associated with high-density lipoprotein cholesterol levels in EA) had multiple potentially novel PheWAS associations, with hypertension related phenotypes in AA and with serum calcium levels and coronary artery disease phenotypes in EA. PheWAS identifies associations for hypothesis generation and exploration of the genetic architecture of complex traits.
Author Summary
In phenome-wide association studies (PheWAS) all potential genetic variants in a dataset are systematically tested for association with all available phenotypes and traits that have been measured in study participants. By investigating the relationship between genetic variation and a diversity of phenotypes, there is the potential for uncovering novel relationships between single nucleotide polymorphisms (SNPs), phenotypes, and networks of interrelated phenotypes. PheWAS also can expose pleiotropy, provide novel mechanistic insights, and foster hypothesis generation. This approach is complementary to genome-wide association studies (GWAS) that test the association between hundreds of thousands, to over a million, single nucleotide polymorphisms and a single phenotype or limited phenotypic domain. The Population Architecture using Genomics and Epidemiology (PAGE) network has measures for a wide array of phenotypes and traits, including prevalent and incident status for clinical conditions and risk factors, as well as clinical parameters and intermediate biomarkers. We performed tests of association between a series of genome-wide association study (GWAS)–identified SNPs and a comprehensive range of phenotypes from the PAGE network in a high-throughput manner. We replicated a number of previously reported associations, validating the PheWAS approach. We also identified novel genotype–phenotype associations possibly representing pleiotropic effects.
doi:10.1371/journal.pgen.1003087
PMCID: PMC3561060  PMID: 23382687
25.  A Systematic Mapping Approach of 16q12.2/FTO and BMI in More Than 20,000 African Americans Narrows in on the Underlying Functional Variation: Results from the Population Architecture using Genomics and Epidemiology (PAGE) Study 
PLoS Genetics  2013;9(1):e1003171.
Genetic variants in intron 1 of the fat mass– and obesity-associated (FTO) gene have been consistently associated with body mass index (BMI) in Europeans. However, follow-up studies in African Americans (AA) have shown no support for some of the most consistently BMI–associated FTO index single nucleotide polymorphisms (SNPs). This is most likely explained by different race-specific linkage disequilibrium (LD) patterns and lower correlation overall in AA, which provides the opportunity to fine-map this region and narrow in on the functional variant. To comprehensively explore the 16q12.2/FTO locus and to search for second independent signals in the broader region, we fine-mapped a 646–kb region, encompassing the large FTO gene and the flanking gene RPGRIP1L, by investigating a total of 3,756 variants (1,529 genotyped and 2,227 imputed variants) in 20,488 AAs across five studies. We observed associations between BMI and variants in the known FTO intron 1 locus: the SNP with the most significant p-value, rs56137030 (8.3×10−6), had not been highlighted in previous studies. While rs56137030 was correlated at r2>0.5 with 103 SNPs in Europeans (including the GWAS index SNPs), this number was reduced to 28 SNPs in AA. Among rs56137030 and the 28 correlated SNPs, six were located within candidate intronic regulatory elements, including rs1421085, for which we predicted allele-specific binding affinity for the transcription factor CUX1, which has recently been implicated in the regulation of FTO. We did not find strong evidence for a second independent signal in the broader region. In summary, this large fine-mapping study in AA has substantially reduced the number of common alleles that are likely to be functional candidates of the known FTO locus. Importantly, our study demonstrated that comprehensive fine-mapping in AA provides a powerful approach to narrow in on the functional candidate(s) underlying the initial GWAS findings in European populations.
Author Summary
Genetic variants within the fat mass– and obesity-associated (FTO) gene are associated with increased risk of obesity. To better understand which specific genetic variant(s) in this genetic region is associated with obesity risk, we attempt to genotype or impute all known genetic variants in the region and test for association with body mass index as a measurement of obesity in over 20,000 African Americans. We identified 29 potential candidate variants, of which one variant (rs1421085) is a particularly interesting candidate for future functional follow-up studies. Our example shows the powerful approach of studying a large African American population, substantially reducing the number of possible functional variants compared with European descent populations.
doi:10.1371/journal.pgen.1003171
PMCID: PMC3547789  PMID: 23341774
