Related Articles

1.  Population Stratification and Patterns of Linkage Disequilibrium 
Genetic epidemiology  2009;33(Suppl 1):S88-S92.
Although the importance of selecting cases and controls from the same population has been recognized for decades, the recent advent of genome-wide association studies has heightened awareness of this issue. Because these studies typically deal with large samples, small differences in allele frequencies between cases and controls can easily reach statistical significance. When, unbeknownst to a researcher, cases and controls have different substructures, the number of false-positive findings is inflated. There have been three recent developments of purely statistical approaches to assessing the ancestral comparability of case and control samples: genomic control, structured association, and multivariate reduction analyses. The widespread use of high-throughput technology has allowed the quick and accurate genotyping of the large number of markers required by these methods.
Group 13 dealt with four population stratification issues: single-nucleotide polymorphism marker selection, association testing, non-standard methods, and linkage disequilibrium calculations in stratified or mixed ethnicity samples. We demonstrated that there are continuous axes of ethnic variation in both datasets of Genetic Analysis Workshop 16. Furthermore, ignoring this structure created p-value inflation for a variety of phenotypes. Including principal components (or multidimensional scaling coordinates) as covariates in a logistic regression can control this inflation. One can weight for local ancestry estimation and allow the use of related individuals. Problems arise in the presence of extremely high association or unusually strong linkage disequilibrium (e.g., in chromosomal inversions). Our group also reported a method for performing an association test controlling for substructure when genome-wide markers are not available to explicitly compute stratification.
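The p-value inflation and the genomic control approach mentioned above can be illustrated with a minimal sketch. It assumes 1-degree-of-freedom chi-square association statistics and the usual median-based definition of the inflation factor; the function names and the example statistics are hypothetical and are not taken from the workshop data.

```python
from statistics import median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation_factor(chi2_stats):
    """Return lambda: observed median statistic / expected median under the null."""
    return median(chi2_stats) / CHI2_1DF_MEDIAN


def genomic_control_adjust(chi2_stats):
    """Divide each statistic by lambda, floored at 1 so statistics are never inflated."""
    lam = max(1.0, genomic_inflation_factor(chi2_stats))
    return [s / lam for s in chi2_stats]


# Hypothetical 1-df association statistics, for illustration only.
example_stats = [0.10, 0.30, 0.50, 0.91, 1.20, 2.70, 6.50]
lam = genomic_inflation_factor(example_stats)
adjusted = genomic_control_adjust(example_stats)
```

A value of lambda well above 1 suggests that test statistics are inflated across the genome, which is the pattern unmodeled substructure tends to produce.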
doi:10.1002/gepi.20478
PMCID: PMC3133943  PMID: 19924707
genetic association; genome-wide association study; principal components; multidimensional scaling; ethnic substructure
2.  Trans-population Analysis of Genetic Mechanisms of Ethnic Disparities in Neuroblastoma Survival 
Background
Black patients with neuroblastoma have a higher prevalence of high-risk disease and worse outcome than white patients. We sought to investigate the relationship between genetic variation and the disparities in survival observed in neuroblastoma.
Methods
The analytic cohort was composed of 2709 patients. Principal components were used to assign patients to genomic ethnic clusters for survival analyses. Locus-specific ancestry was calculated for use in association analysis. The shorter spans of linkage disequilibrium in African populations may facilitate the fine mapping of causal variants in regions previously implicated by genome-wide association studies conducted primarily in patients of European descent. Thus, we evaluated 13 single nucleotide polymorphisms known to be associated with susceptibility to high-risk neuroblastoma from genome-wide association studies and all variants with highly divergent allele frequencies in reference African and European populations near the known susceptibility loci. All statistical tests were two-sided.
Results
African genomic ancestry was associated with high-risk neuroblastoma (P = .007) and lower event-free survival (P = .04, hazard ratio = 1.4, 95% confidence interval = 1.05 to 1.80). rs1033069 within SPAG16 (sperm associated antigen 16) was determined to have higher risk allele frequency in the African reference population and statistically significant association with high-risk disease in patients of European and African ancestry (P = 6.42×10^-5, false discovery rate < 0.0015) in the overall cohort. Multivariable analysis using an additive model demonstrated that the SPAG16 single nucleotide polymorphism contributes to the observed ethnic disparities in high-risk disease and survival.
Conclusions
Our study demonstrates that common genetic variation influences neuroblastoma phenotype and contributes to the ethnic disparities in survival observed and illustrates the value of trans-population mapping.
doi:10.1093/jnci/djs503
PMCID: PMC3691940  PMID: 23243203
3.  Inflammation, Insulin Resistance, and Diabetes—Mendelian Randomization Using CRP Haplotypes Points Upstream 
PLoS Medicine  2008;5(8):e155.
Background
Raised C-reactive protein (CRP) is a risk factor for type 2 diabetes. According to the Mendelian randomization method, the association is likely to be causal if genetic variants that affect CRP level are associated with markers of diabetes development and diabetes. Our objective was to examine the nature of the association between CRP phenotype and diabetes development using CRP haplotypes as instrumental variables.
Methods and Findings
We genotyped three tagging SNPs (CRP + 2302G > A; CRP + 1444T > C; CRP + 4899T > G) in the CRP gene and measured serum CRP in 5,274 men and women at mean ages 49 and 61 y (Whitehall II Study). Homeostasis model assessment-insulin resistance (HOMA-IR) and hemoglobin A1c (HbA1c) were measured at age 61 y. Diabetes was ascertained by glucose tolerance test and self-report. Common major haplotypes were strongly associated with serum CRP levels, but unrelated to obesity, blood pressure, and socioeconomic position, which may confound the association between CRP and diabetes risk. Serum CRP was associated with these potential confounding factors. After adjustment for age and sex, baseline serum CRP was associated with incident diabetes (hazard ratio = 1.39 [95% confidence interval 1.29–1.51]), HOMA-IR, and HbA1c, but the associations were considerably attenuated on adjustment for potential confounding factors. In contrast, CRP haplotypes were not associated with HOMA-IR or HbA1c (p = 0.52–0.92). The associations of CRP with HOMA-IR and HbA1c were all null when examined using instrumental variables analysis, with genetic variants as the instrument for serum CRP. Instrumental variables estimates differed from the directly observed associations (p = 0.007–0.11). Pooled analysis of CRP haplotypes and diabetes in Whitehall II and Northwick Park Heart Study II produced null findings (p = 0.25–0.88). Analyses based on the Wellcome Trust Case Control Consortium (1,923 diabetes cases, 2,932 controls) using three SNPs in tight linkage disequilibrium with our tagging SNPs also demonstrated null associations.
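The idea behind the instrumental variables analysis above can be sketched with the single-instrument Wald ratio. This is a simpler stand-in, not the haplotype-based analysis the study reports; the function name and the input numbers are hypothetical, and the standard error uses a first-order approximation that ignores uncertainty in the variant-exposure effect.

```python
def wald_ratio(beta_zx, beta_zy, se_zy):
    """Instrumental-variable estimate of the effect of an exposure on an outcome.

    beta_zx: effect of the genetic variant on the exposure (e.g. serum CRP)
    beta_zy: effect of the same variant on the outcome (e.g. HbA1c)
    se_zy:   standard error of beta_zy
    Returns (estimate, approximate standard error).
    """
    if beta_zx == 0:
        raise ValueError("The variant must be associated with the exposure.")
    estimate = beta_zy / beta_zx
    # First-order approximation: treats beta_zx as known without error.
    se = se_zy / abs(beta_zx)
    return estimate, se


# Hypothetical effect sizes, for illustration only.
iv_estimate, iv_se = wald_ratio(beta_zx=0.20, beta_zy=0.01, se_zy=0.02)
z_score = iv_estimate / iv_se
```

With these made-up inputs the ratio is small relative to its standard error, which is the kind of pattern a null causal effect would produce.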
Conclusions
Observed associations between serum CRP and insulin resistance, glycemia, and diabetes are likely to be noncausal. Inflammation may play a causal role via upstream effectors rather than the downstream marker CRP.
Using a Mendelian randomization approach, Eric Brunner and colleagues show that the associations between serum C-reactive protein and insulin resistance, glycemia, and diabetes are likely to be noncausal.
Editors' Summary
Background.
Diabetes—a common, long-term (chronic) disease that causes heart, kidney, nerve, and eye problems and shortens life expectancy—is characterized by high levels of sugar (glucose) in the blood. In people without diabetes, blood sugar levels are controlled by the hormone insulin. Insulin is released by the pancreas after eating and “instructs” insulin-responsive muscle and fat cells to take up the glucose from the bloodstream that is produced by the digestion of food. In the early stages of type 2 diabetes (the commonest type of diabetes), the muscle and fat cells become nonresponsive to insulin (a condition called insulin resistance), and blood sugar levels increase. The pancreas responds by making more insulin—people with insulin resistance have high blood levels of both insulin and glucose. Eventually, however, the insulin-producing cells in the pancreas start to malfunction, insulin secretion decreases, and frank diabetes develops.
Why Was This Study Done?
Globally, about 200 million people have diabetes, but experts believe this number will double by 2030. Ways to prevent or delay the onset of diabetes are, therefore, urgently needed. One major risk factor for insulin resistance and diabetes is being overweight. According to one theory, increased body fat causes mild, chronic tissue inflammation, which leads to insulin resistance. Consistent with this idea, people with higher than normal amounts of the inflammatory protein C-reactive protein (CRP) in their blood have a high risk of developing diabetes. If inflammation does cause diabetes, then drugs that inhibit CRP might prevent diabetes. However, simply measuring CRP and determining whether the people with high levels develop diabetes cannot prove that CRP causes diabetes. Those people with high blood levels of CRP might have other unknown factors in common (confounding factors) that are the real causes of diabetes. In this study, the researchers use “Mendelian randomization” to examine whether increased blood CRP causes diabetes. Some variants of CRP (the gene that encodes CRP) increase the amount of CRP in the blood. Because these variants are inherited randomly, there is no likelihood of confounding factors, and an association between these variants and the development of insulin resistance and diabetes indicates, therefore, that increased CRP levels cause diabetes.
What Did the Researchers Do and Find?
The researchers measured blood CRP levels in more than 5,000 people enrolled in the Whitehall II study, which is investigating factors that affect disease development. They also used the “homeostasis model assessment-insulin resistance” (HOMA-IR) method to estimate insulin sensitivity from blood glucose and insulin measurements, and measured levels of hemoglobin A1c (HbA1c, hemoglobin with sugar attached, which is a measure of long-term blood sugar control) in these people. Finally, they looked at three “single nucleotide polymorphisms” (SNPs, single nucleotide changes in a gene's DNA sequence; combinations of SNPs that are inherited as a block are called haplotypes) in CRP in each study participant. Common haplotypes of CRP were related to blood serum CRP levels and, as previously reported, increased blood CRP levels were associated with diabetes and with HOMA-IR and HbA1c values indicative of insulin resistance and poor blood sugar control, respectively. By contrast, CRP haplotypes were not related to HOMA-IR or HbA1c values. Similarly, pooled analysis of CRP haplotypes and diabetes in Whitehall II and another large study on health determinants (the Northwick Park Heart Study II) showed no association between CRP variants and diabetes risk. Finally, data from the Wellcome Trust Case Control Consortium also showed no association between CRP haplotypes and diabetes risk.
What Do These Findings Mean?
Together, these findings suggest that increased blood CRP levels are not responsible for the development of insulin resistance or diabetes, at least in European populations. It may be that there is a causal relationship between CRP levels and diabetes risk in other ethnic populations—further Mendelian randomization studies are needed to discover whether this is the case. For now, though, these findings suggest that drugs targeted against CRP are unlikely to prevent or delay the onset of diabetes. However, they do not discount the possibility that proteins involved earlier in the inflammatory process might cause diabetes and might thus represent good drug targets for diabetes prevention.
Additional Information.
Please access these Web sites via the online version of this summary at http://dx.doi.org/10.1371/journal.pmed.0050155.
This study is further discussed in a PLoS Medicine Perspective by Bernard Keavney
The MedlinePlus encyclopedia provides information about diabetes and about C-reactive protein (in English and Spanish)
US National Institute of Diabetes and Digestive and Kidney Diseases provides patient information on all aspects of diabetes, including information on insulin resistance (in English and Spanish)
The International Diabetes Federation provides information about diabetes, including information on the global diabetes epidemic
The US Centers for Disease Control and Prevention provides information for the public and professionals on all aspects of diabetes (in English and Spanish)
Wikipedia has a page on Mendelian randomization (note: Wikipedia is a free online encyclopedia that anyone can edit; available in several languages)
doi:10.1371/journal.pmed.0050155
PMCID: PMC2504484  PMID: 18700811
4.  Meta-Analysis of Genome-Wide Association Studies in African Americans Provides Insights into the Genetic Architecture of Type 2 Diabetes 
Ng, Maggie C. Y. | Shriner, Daniel | Chen, Brian H. | Li, Jiang | Chen, Wei-Min | Guo, Xiuqing | Liu, Jiankang | Bielinski, Suzette J. | Yanek, Lisa R. | Nalls, Michael A. | Comeau, Mary E. | Rasmussen-Torvik, Laura J. | Jensen, Richard A. | Evans, Daniel S. | Sun, Yan V. | An, Ping | Patel, Sanjay R. | Lu, Yingchang | Long, Jirong | Armstrong, Loren L. | Wagenknecht, Lynne | Yang, Lingyao | Snively, Beverly M. | Palmer, Nicholette D. | Mudgal, Poorva | Langefeld, Carl D. | Keene, Keith L. | Freedman, Barry I. | Mychaleckyj, Josyf C. | Nayak, Uma | Raffel, Leslie J. | Goodarzi, Mark O. | Chen, Y-D Ida | Taylor, Herman A. | Correa, Adolfo | Sims, Mario | Couper, David | Pankow, James S. | Boerwinkle, Eric | Adeyemo, Adebowale | Doumatey, Ayo | Chen, Guanjie | Mathias, Rasika A. | Vaidya, Dhananjay | Singleton, Andrew B. | Zonderman, Alan B. | Igo, Robert P. | Sedor, John R. | Kabagambe, Edmond K. | Siscovick, David S. | McKnight, Barbara | Rice, Kenneth | Liu, Yongmei | Hsueh, Wen-Chi | Zhao, Wei | Bielak, Lawrence F. | Kraja, Aldi | Province, Michael A. | Bottinger, Erwin P. | Gottesman, Omri | Cai, Qiuyin | Zheng, Wei | Blot, William J. | Lowe, William L. | Pacheco, Jennifer A. | Crawford, Dana C. | Grundberg, Elin | Rich, Stephen S. | Hayes, M. Geoffrey | Shu, Xiao-Ou | Loos, Ruth J. F. | Borecki, Ingrid B. | Peyser, Patricia A. | Cummings, Steven R. | Psaty, Bruce M. | Fornage, Myriam | Iyengar, Sudha K. | Evans, Michele K. | Becker, Diane M. | Kao, W. H. Linda | Wilson, James G. | Rotter, Jerome I. | Sale, Michèle M. | Liu, Simin | Rotimi, Charles N. | Bowden, Donald W.
PLoS Genetics  2014;10(8):e1004517.
Type 2 diabetes (T2D) is more prevalent in African Americans than in Europeans. However, little is known about the genetic risk in African Americans despite the recent identification of more than 70 T2D loci primarily by genome-wide association studies (GWAS) in individuals of European ancestry. In order to investigate the genetic architecture of T2D in African Americans, the MEta-analysis of type 2 DIabetes in African Americans (MEDIA) Consortium examined 17 GWAS on T2D comprising 8,284 cases and 15,543 controls in African Americans in stage 1 analysis. Single nucleotide polymorphisms (SNPs) association analysis was conducted in each study under the additive model after adjustment for age, sex, study site, and principal components. Meta-analysis of approximately 2.6 million genotyped and imputed SNPs in all studies was conducted using an inverse variance-weighted fixed effect model. Replications were performed to follow up 21 loci in up to 6,061 cases and 5,483 controls in African Americans, and 8,130 cases and 38,987 controls of European ancestry. We identified three known loci (TCF7L2, HMGA2 and KCNQ1) and two novel loci (HLA-B and INS-IGF2) at genome-wide significance (4.15×10−94
Author Summary
Despite the higher prevalence of type 2 diabetes (T2D) in African Americans than in Europeans, recent genome-wide association studies (GWAS) were conducted primarily in individuals of European ancestry. In this study, we performed meta-analysis of 17 GWAS in 8,284 cases and 15,543 controls to explore the genetic architecture of T2D in African Americans. Following replication in an additional 6,061 cases and 5,483 controls in African Americans, and 8,130 cases and 38,987 controls of European ancestry, we identified two novel and three previously reported T2D loci reaching genome-wide significance. We also examined 158 loci previously reported to be associated with T2D or regulating glucose homeostasis. While 56% of these loci were shared between African Americans and the other populations, the strongest associations in African Americans are often found in nearby single nucleotide polymorphisms (SNPs) instead of the original SNPs reported in other populations due to differential genetic architecture across populations. Our results highlight the importance of performing genetic studies in non-European populations to fine map the causal genetic variants.
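The inverse variance-weighted fixed effect model named in the abstract pools one SNP's per-study effect estimates, giving more weight to more precise studies. The following is a minimal sketch of that calculation; the function name and the two-study example values are hypothetical and do not come from the MEDIA data.

```python
import math


def fixed_effect_meta(betas, standard_errors):
    """Inverse variance-weighted fixed effect pooling of per-study estimates.

    betas:           per-study effect estimates (e.g. log odds ratios)
    standard_errors: matching per-study standard errors
    Returns (pooled estimate, pooled standard error).
    """
    if len(betas) != len(standard_errors) or not betas:
        raise ValueError("Need one standard error per effect estimate.")
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se


# Two hypothetical studies: weights are 100 and 25.
pooled_beta, pooled_se = fixed_effect_meta([0.10, 0.20], [0.1, 0.2])
```

The pooled estimate sits closer to the first study's value because that study has the smaller standard error and therefore the larger weight.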
doi:10.1371/journal.pgen.1004517
PMCID: PMC4125087  PMID: 25102180
Genetic epidemiology  2013;37(8):10.1002/gepi.21764.
Population stratification is of primary interest in genetic studies to infer human evolution history and to avoid spurious findings in association testing. Although it is well studied with high-density single nucleotide polymorphisms (SNPs) in genome-wide association studies (GWASs), next-generation sequencing brings both new opportunities and challenges to uncovering population structures in finer scales. Several recent studies have noticed different confounding effects from variants of different minor allele frequencies (MAFs). In this paper, using a low-coverage sequencing dataset from the 1000 Genomes Project, we compared a popular method, principal component analysis (PCA), with a recently proposed spectral clustering technique, called spectral dimensional reduction (SDR), in detecting and adjusting for population stratification at the level of ethnic subgroups. We investigated the varying performance of adjusting for population stratification with different types and sets of variants when testing on different types of variants. One main conclusion is that principal components based on all variants or common variants were generally most effective in controlling inflations caused by population stratification; in particular, contrary to many speculations on the effectiveness of rare variants, we did not find much added value with the use of only rare variants. In addition, SDR was confirmed to be more robust than PCA, especially when applied to rare variants.
doi:10.1002/gepi.21764
PMCID: PMC3864649  PMID: 24123217
1000 Genomes Project; Association testing; Common variants; Principal component analysis; Rare variants; Spectral analysis
PLoS Medicine  2014;11(9):e1001713.
In this study, Proitsi and colleagues use a Mendelian randomization approach to dissect the causal nature of the association between circulating lipid levels and late onset Alzheimer's Disease (LOAD) and find that genetic predisposition to increased plasma cholesterol and triglyceride lipid levels is not associated with elevated LOAD risk.
Please see later in the article for the Editors' Summary
Background
Although altered lipid metabolism has been extensively implicated in the pathogenesis of Alzheimer disease (AD) through cell biological, epidemiological, and genetic studies, the molecular mechanisms linking cholesterol and AD pathology are still not well understood and contradictory results have been reported. We have used a Mendelian randomization approach to dissect the causal nature of the association between circulating lipid levels and late onset AD (LOAD) and test the hypothesis that genetically raised lipid levels increase the risk of LOAD.
Methods and Findings
We included 3,914 patients with LOAD, 1,675 older individuals without LOAD, and 4,989 individuals from the general population from six genome wide studies drawn from a white population (total n = 10,578). We constructed weighted genotype risk scores (GRSs) for four blood lipid phenotypes (high-density lipoprotein cholesterol [HDL-c], low-density lipoprotein cholesterol [LDL-c], triglycerides, and total cholesterol) using well-established SNPs in 157 loci for blood lipids reported by Willer and colleagues (2013). Both full GRSs using all SNPs associated with each trait at p<5×10^-8 and trait specific scores using SNPs associated exclusively with each trait at p<5×10^-8 were developed. We used logistic regression to investigate whether the GRSs were associated with LOAD in each study and results were combined together by meta-analysis. We found no association between any of the full GRSs and LOAD (meta-analysis results: odds ratio [OR] = 1.005, 95% CI 0.82–1.24, p = 0.962 per 1 unit increase in HDL-c; OR = 0.901, 95% CI 0.65–1.25, p = 0.530 per 1 unit increase in LDL-c; OR = 1.104, 95% CI 0.89–1.37, p = 0.362 per 1 unit increase in triglycerides; and OR = 0.954, 95% CI 0.76–1.21, p = 0.688 per 1 unit increase in total cholesterol). Results for the trait specific scores were similar; however, the trait specific scores explained much smaller phenotypic variance.
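A weighted genotype risk score of the kind described above is commonly computed as the sum, over SNPs, of the risk-allele count multiplied by a per-SNP weight. The sketch below assumes that form; the SNP identifiers, weights, and genotype are hypothetical placeholders rather than values from the study.

```python
def weighted_grs(allele_counts, weights):
    """Weighted genotype risk score for one individual.

    allele_counts: dict mapping SNP id -> number of trait-raising alleles (0, 1, or 2)
    weights:       dict mapping SNP id -> per-allele effect size on the trait
    """
    missing = set(weights) - set(allele_counts)
    if missing:
        raise ValueError(f"Missing genotypes for: {sorted(missing)}")
    return sum(weights[snp] * allele_counts[snp] for snp in weights)


# Hypothetical SNP ids, weights, and genotype, for illustration only.
example_weights = {"snp_a": 0.10, "snp_b": 0.25, "snp_c": 0.05}
example_genotype = {"snp_a": 2, "snp_b": 1, "snp_c": 0}
score = weighted_grs(example_genotype, example_weights)
```

In an analysis like the one described, a score of this kind would then enter a logistic regression as the predictor of case status.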
Conclusions
Genetic predisposition to increased blood cholesterol and triglyceride lipid levels is not associated with elevated LOAD risk. The observed epidemiological associations between abnormal lipid levels and LOAD risk could therefore be attributed to the result of biological pleiotropy or could be secondary to LOAD. Limitations of this study include the small proportion of lipid variance explained by the GRS, biases in case-control ascertainment, and the limitations implicit to Mendelian randomization studies. Future studies should focus on larger LOAD datasets with longitudinal sampled peripheral lipid measures and other markers of lipid metabolism, which have been shown to be altered in LOAD.
Editors' Summary
Background
Currently, about 44 million people worldwide have dementia, a group of brain disorders characterized by an irreversible decline in memory, communication, and other “cognitive” functions. Dementia mainly affects older people and, because people are living longer, experts estimate that more than 135 million people will have dementia by 2050. The commonest form of dementia is Alzheimer disease. In this type of dementia, protein clumps called plaques and neurofibrillary tangles form in the brain and cause its degeneration. The earliest sign of Alzheimer disease is usually increasing forgetfulness. As the disease progresses, affected individuals gradually lose their ability to deal with normal daily activities such as dressing. They may become anxious or aggressive or begin to wander. They may also eventually lose control of their bladder and of other physical functions. At present, there is no cure for Alzheimer disease although some of its symptoms can be managed with drugs. Most people with the disease are initially cared for at home by relatives and other unpaid carers, but many patients end their days in a care home or specialist nursing home.
Why Was This Study Done?
Several lines of evidence suggest that lipid metabolism (how the body handles cholesterol and other fats) is altered in patients whose Alzheimer disease develops after the age of 60 years (late onset Alzheimer disease, LOAD). In particular, epidemiological studies (observational investigations that examine the patterns and causes of disease in populations) have found an association between high amounts of cholesterol in the blood in midlife and the risk of LOAD. However, observational studies cannot prove that abnormal lipid metabolism (dyslipidemia) causes LOAD. People with dyslipidemia may share other characteristics that cause both dyslipidemia and LOAD (confounding) or LOAD might actually cause dyslipidemia (reverse causation). Here, the researchers use “Mendelian randomization” to examine whether lifetime changes in lipid metabolism caused by genes have a causal impact on LOAD risk. In Mendelian randomization, causality is inferred from associations between genetic variants that mimic the effect of a modifiable risk factor and the outcome of interest. Because gene variants are inherited randomly, they are not prone to confounding and are free from reverse causation. So, if dyslipidemia causes LOAD, genetic variants that affect lipid metabolism should be associated with an altered risk of LOAD.
What Did the Researchers Do and Find?
The researchers investigated whether genetic predisposition to raised lipid levels increased the risk of LOAD in 10,578 participants (3,914 patients with LOAD, 1,675 elderly people without LOAD, and 4,989 population controls) using data collected in six genome wide studies looking for gene variants associated with Alzheimer disease. The researchers constructed a genotype risk score (GRS) for each participant using genetic risk markers for four types of blood lipids on the basis of the presence of single nucleotide polymorphisms (SNPs, a type of gene variant) in their DNA. When the researchers used statistical methods to investigate the association between the GRS and LOAD among all the study participants, they found no association between the GRS and LOAD.
What Do These Findings Mean?
These findings suggest that the genetic predisposition to raised blood levels of four types of lipid is not causally associated with LOAD risk. The accuracy of this finding may be affected by several limitations of this study, including the small proportion of lipid variance explained by the GRS and the validity of several assumptions that underlie all Mendelian randomization studies. Moreover, because all the participants in this study were white, these findings may not apply to people of other ethnic backgrounds. Given their findings, the researchers suggest that the observed epidemiological associations between abnormal lipid levels in the blood and LOAD risk could be secondary to variation in lipid levels for reasons other than genetics, or to LOAD, a possibility that can be investigated by studying blood lipid levels and other markers of lipid metabolism over time in large groups of patients with LOAD. Importantly, however, these findings provide new information about the role of lipids in LOAD development that may eventually lead to new therapeutic and public-health interventions for Alzheimer disease.
Additional Information
Please access these websites via the online version of this summary at http://dx.doi.org/10.1371/journal.pmed.1001713.
The UK National Health Service Choices website provides information (including personal stories) about Alzheimer's disease
The UK not-for-profit organization Alzheimer's Society provides information for patients and carers about dementia, including personal experiences of living with Alzheimer's disease
The US not-for-profit organization Alzheimer's Association also provides information for patients and carers about dementia and personal stories about dementia
Alzheimer's Disease International is the international federation of Alzheimer disease associations around the world; it provides links to individual associations, information about dementia, and links to World Alzheimer Reports
MedlinePlus provides links to additional resources about Alzheimer's disease (in English and Spanish)
Wikipedia has a page on Mendelian randomization (note: Wikipedia is a free online encyclopedia that anyone can edit; available in several languages)
doi:10.1371/journal.pmed.1001713
PMCID: PMC4165594  PMID: 25226301
BMC Proceedings  2011;5(Suppl 9):S35.
Because of the low frequency of rare genetic variants in observed data, the statistical power of detecting their associations with target traits is usually low. The collapsing test of collective effect of multiple rare variants is an important and useful strategy to increase the power; in addition, family data may be enriched with causal rare variants and therefore provide extra power. However, when family data are used, both population structure and familial relatedness must be accounted for to avoid inflating the false-positive rate. Using a unified mixed linear model and family data, we compared six methods to detect the association between multiple rare variants and quantitative traits. Through the analysis of 200 replications of the quantitative trait Q2 from the Genetic Analysis Workshop 17 data set simulated for 697 subjects from 8 extended families, and based on quantile-quantile plots under the null and receiver operating characteristic curves, we compared the false-positive rate and power of these methods. We observed that adjusting for pedigree-based kinship gives the best control of the false-positive rate, whereas adjusting for marker-based identity by state performs slightly better in terms of power. An adjustment based on a principal components analysis slightly improves the false-positive rate and power. Taking into account type-1 error, power, and computational efficiency, we find that adjusting for pedigree-based kinship seems to be a good choice for the collective test of association between multiple rare variants and quantitative traits using family data.
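The collapsing strategy mentioned above replaces many rare variants in a gene with a single per-subject indicator. The sketch below shows only that collapsing step under assumed inputs; the 0.01 frequency threshold, the function name, and the genotypes are hypothetical, and the mixed-model adjustment for kinship that the abstract compares is not included.

```python
def collapse_rare_variants(genotypes, minor_allele_freqs, maf_threshold=0.01):
    """Collapse rare variants in one gene into a carrier indicator per subject.

    genotypes:          one row per subject, one minor-allele count per variant
    minor_allele_freqs: minor allele frequency of each variant (same column order)
    maf_threshold:      variants below this frequency count as rare (assumed value)
    Returns a list with 1 if the subject carries any rare minor allele, else 0.
    """
    rare_columns = [j for j, maf in enumerate(minor_allele_freqs) if maf < maf_threshold]
    return [int(any(row[j] > 0 for j in rare_columns)) for row in genotypes]


# Hypothetical gene with three variants; the second one is common and is ignored.
example_mafs = [0.005, 0.20, 0.008]
example_genotypes = [
    [0, 1, 0],  # carries only the common variant
    [1, 0, 0],  # carries the first rare variant
    [0, 2, 1],  # carries the third rare variant
    [0, 0, 0],  # carries nothing
]
carriers = collapse_rare_variants(example_genotypes, example_mafs)
```

The resulting indicator can then be tested against the trait like a single marker, which is where the power gain of collapsing comes from.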
doi:10.1186/1753-6561-5-S9-S35
PMCID: PMC3287871  PMID: 22373066
BMC Proceedings  2011;5(Suppl 9):S66.
Statistical tests on rare variant data may well have type I error rates that differ from their nominal levels. Here, we use the Genetic Analysis Workshop 17 data to estimate type I error rates and powers of three models for identifying rare variants associated with a phenotype: (1) by using the number of minor alleles, age, and smoking status as predictor variables; (2) by using the number of minor alleles, age, smoking status, and the identity of the population of the subject as predictor variables; and (3) by using the number of minor alleles, age, smoking status, and ancestry adjustment using 10 principal component scores. We studied both a quantitative phenotype and a dichotomized phenotype. The model with principal component adjustment has type I error rates that are closer to the nominal level of significance of 0.05 for single-nucleotide polymorphisms (SNPs) in noncausal genes for the selected phenotype than the model directly adjusting for population. The principal component adjustment model type I error rates are also closer to the nominal level of 0.05 for noncausal SNPs located in causal genes for the phenotype. The power for causal SNPs with the principal component adjustment model is comparable to the power of the other methods. The power using the underlying quantitative phenotype is greater than the power using the dichotomized phenotype.
doi:10.1186/1753-6561-5-S9-S66
PMCID: PMC3287905  PMID: 22373457
Human genetics  2010;128(2):165-177.
It is well-known that population substructure may lead to confounding in case-control association studies. Here, we examined genetic structure in a large racially and ethnically diverse sample consisting of 5 ethnic groups of the Multiethnic Cohort study (African Americans, Japanese Americans, Latinos, European Americans and Native Hawaiians) using 2,509 SNPs distributed across the genome. Principal component analysis on 6,213 study participants, 18 Native Americans and 11 HapMap III populations revealed 4 important principal components (PCs): the first two separated Asians, Europeans and Africans, and the third and fourth corresponded to Native American and Native Hawaiian (Polynesian) ancestry, respectively. Individual ethnic composition derived from self-reported parental information matched well to genetic ancestry for Japanese and European Americans. STRUCTURE-estimated individual ancestral proportions for African Americans and Latinos are consistent with previous reports. We quantified the East Asian (mean 27%), European (mean 27%) and Polynesian (mean 46%) ancestral proportions for the first time, to our knowledge, for Native Hawaiians. Simulations based on realistic settings of case-control studies nested in the Multiethnic Cohort found that the effect of population stratification was modest and readily corrected by adjusting for race/ethnicity or by adjusting for top PCs derived from all SNPs or from ancestry informative markers; the power of these approaches was similar when averaged across causal variants simulated based on allele frequencies of the 2,509 genotyped markers. The bias may be large in case-only analysis of gene by gene interactions but it can be corrected by top PCs derived from all SNPs.
doi:10.1007/s00439-010-0841-4
PMCID: PMC3057055  PMID: 20499252
AIMs; African American; Native Hawaiian; Latino; admixture; principal component analysis
Objective
Thrombosis is a serious complication of systemic lupus erythematosus (SLE). Studies that have investigated the genetics of thrombosis in SLE are limited. We undertook this study to assess the association of previously implicated candidate genes, particularly Toll-like receptor (TLR) genes, with pathogenesis of thrombosis.
Methods
We genotyped 3,587 SLE patients from 3 multiethnic populations for 77 single-nucleotide polymorphisms (SNPs) in 10 genes, primarily in TLRs 2, 4, 7, and 9, and we also genotyped 64 ancestry-informative markers (AIMs). We first analyzed association with arterial and venous thrombosis in the combined population via logistic regression, adjusting for top principal components of the AIMs and other covariates. We also subjected an associated SNP, rs893629, to meta-analysis (after stratification by ethnicity and study population) to confirm the association and to test for study population or ethnicity effects.
Results
In the combined analysis, the SNP rs893629 in the KIAA0922/TLR2 region was significantly associated with arterial thrombosis (logistic P = 6.4 × 10^−5, false discovery rate P = 0.0044). Two additional SNPs in TLR2 were also suggestive: rs1816702 (logistic P = 0.002) and rs4235232 (logistic P = 0.009). In the meta-analysis by study population, the odds ratio (OR) for arterial thrombosis with rs893629 was 2.44 (95% confidence interval 1.58–3.76), without evidence for heterogeneity (P = 0.78). By ethnicity, the effect was most significant among African Americans (OR 2.42, P = 3.5 × 10^−4) and European Americans (OR 3.47, P = 0.024).
Conclusion
TLR2 gene variation is associated with thrombosis in SLE, particularly among African Americans and European Americans. There was no evidence of association among Hispanics, and results in Asian Americans were limited due to insufficient sample size. These results may help elucidate the pathogenesis of this important clinical manifestation.
doi:10.1002/art.38520
PMCID: PMC4269184  PMID: 24578102
BMC Proceedings  2011;5(Suppl 9):S103.
Identifying rare variants that are responsible for complex disease has been promoted by advances in sequencing technologies. However, statistical methods that can handle the vast amount of data generated and that can interpret the complicated relationship between disease and these variants have lagged. We apply a zero-inflated Poisson regression model to take into account the excess of zeros caused by the extremely low frequency of the 24,487 exonic variants in the Genetic Analysis Workshop 17 data. We grouped the 697 subjects in the data set as Europeans, Asians, and Africans based on principal components analysis and found the total number of rare variants per gene for each individual. We then analyzed these collapsed variants based on the assumption that rare variants are enriched in a group of people affected by a disease compared to a group of unaffected people. We also tested the hypothesis with quantitative traits Q1, Q2, and Q4. Analyses performed on the combined 697 individuals and on each ethnic group yielded different results. For the combined population analysis, we found that UGT1A1, which was not part of the simulation model, was associated with disease liability and that FLT1, which was a causal locus in the simulation model, was associated with Q1. Of the causal loci in the simulation models, FLT1 and KDR were associated with Q1 and VNN1 was correlated with Q2. No significant genes were associated with Q4. These results show the feasibility and capability of our new statistical model to detect multiple rare variants influencing disease risk.
doi:10.1186/1753-6561-5-S9-S103
PMCID: PMC3287826  PMID: 22373445
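The zero-inflated Poisson (ZIP) idea used in the abstract above can be illustrated with the distribution itself: a count is a structural zero with probability pi and otherwise follows a Poisson distribution with mean lam. The sketch below shows only this probability mass function and a log-likelihood for fixed parameters. It does not reproduce the authors' regression model, and the names `zip_pmf`, `zip_log_likelihood`, `lam`, and `pi` are illustrative assumptions.

```python
import math

def zip_pmf(k, lam, pi):
    """P(Y = k) under a zero-inflated Poisson with mean lam and zero-inflation pi."""
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        # A zero arises either structurally or from the Poisson part.
        return pi + (1 - pi) * poisson
    return (1 - pi) * poisson

def zip_log_likelihood(counts, lam, pi):
    """Log-likelihood of observed counts for fixed lam and pi."""
    return sum(math.log(zip_pmf(k, lam, pi)) for k in counts)

# Hypothetical per-gene rare-variant counts for a handful of subjects.
counts = [0, 0, 0, 1, 0, 2, 0, 0]
ll = zip_log_likelihood(counts, lam=0.8, pi=0.4)
```

With pi set to 0 the function reduces to an ordinary Poisson, which makes it easy to see how much extra mass at zero the inflation term adds.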
PLoS ONE  2011;6(7):e21591.
Understanding the role of genetic variation in human diseases remains an important problem to be solved in genomics. An important component of such variation consists of variations at single sites in DNA, or single nucleotide polymorphisms (SNPs). Typically, the problem of associating particular SNPs to phenotypes has been confounded by hidden factors such as the presence of population structure, family structure or cryptic relatedness in the sample of individuals being analyzed. Such confounding factors lead to a large number of spurious associations and missed associations. Various statistical methods, such as linear mixed-effect models (LMMs) or methods that adjust data based on a principal components analysis (PCA), have been proposed to account for such confounding factors, but these methods either suffer from low power or cease to be tractable for larger numbers of individuals in the sample. Here we present a statistical model for conducting genome-wide association studies (GWAS) that accounts for such confounding factors. The runtime of our method scales quadratically with the number of individuals being studied, with only a modest loss in statistical power as compared to LMM-based and PCA-based methods when testing on synthetic data that was generated from a generalized LMM. Applying our method to both real and synthetic human genotype/phenotype data, we demonstrate the ability of our model to correct for confounding factors while requiring significantly less runtime relative to LMMs. We have implemented methods for fitting these models, which are available at http://www.microsoft.com/science.
doi:10.1371/journal.pone.0021591
PMCID: PMC3134455  PMID: 21765897
Fall, Tove | Hägg, Sara | Mägi, Reedik | Ploner, Alexander | Fischer, Krista | Horikoshi, Momoko | Sarin, Antti-Pekka | Thorleifsson, Gudmar | Ladenvall, Claes | Kals, Mart | Kuningas, Maris | Draisma, Harmen H. M. | Ried, Janina S. | van Zuydam, Natalie R. | Huikari, Ville | Mangino, Massimo | Sonestedt, Emily | Benyamin, Beben | Nelson, Christopher P. | Rivera, Natalia V. | Kristiansson, Kati | Shen, Huei-yi | Havulinna, Aki S. | Dehghan, Abbas | Donnelly, Louise A. | Kaakinen, Marika | Nuotio, Marja-Liisa | Robertson, Neil | de Bruijn, Renée F. A. G. | Ikram, M. Arfan | Amin, Najaf | Balmforth, Anthony J. | Braund, Peter S. | Doney, Alexander S. F. | Döring, Angela | Elliott, Paul | Esko, Tõnu | Franco, Oscar H. | Gretarsdottir, Solveig | Hartikainen, Anna-Liisa | Heikkilä, Kauko | Herzig, Karl-Heinz | Holm, Hilma | Hottenga, Jouke Jan | Hyppönen, Elina | Illig, Thomas | Isaacs, Aaron | Isomaa, Bo | Karssen, Lennart C. | Kettunen, Johannes | Koenig, Wolfgang | Kuulasmaa, Kari | Laatikainen, Tiina | Laitinen, Jaana | Lindgren, Cecilia | Lyssenko, Valeriya | Läärä, Esa | Rayner, Nigel W. | Männistö, Satu | Pouta, Anneli | Rathmann, Wolfgang | Rivadeneira, Fernando | Ruokonen, Aimo | Savolainen, Markku J. | Sijbrands, Eric J. G. | Small, Kerrin S. | Smit, Jan H. | Steinthorsdottir, Valgerdur | Syvänen, Ann-Christine | Taanila, Anja | Tobin, Martin D. | Uitterlinden, Andre G. | Willems, Sara M. | Willemsen, Gonneke | Witteman, Jacqueline | Perola, Markus | Evans, Alun | Ferrières, Jean | Virtamo, Jarmo | Kee, Frank | Tregouet, David-Alexandre | Arveiler, Dominique | Amouyel, Philippe | Ferrario, Marco M. | Brambilla, Paolo | Hall, Alistair S. | Heath, Andrew C. | Madden, Pamela A. F. | Martin, Nicholas G. | Montgomery, Grant W. | Whitfield, John B. | Jula, Antti | Knekt, Paul | Oostra, Ben | van Duijn, Cornelia M. | Penninx, Brenda W. J. H. | Davey Smith, George | Kaprio, Jaakko | Samani, Nilesh J. 
| Gieger, Christian | Peters, Annette | Wichmann, H.-Erich | Boomsma, Dorret I. | de Geus, Eco J. C. | Tuomi, TiinaMaija | Power, Chris | Hammond, Christopher J. | Spector, Tim D. | Lind, Lars | Orho-Melander, Marju | Palmer, Colin Neil Alexander | Morris, Andrew D. | Groop, Leif | Järvelin, Marjo-Riitta | Salomaa, Veikko | Vartiainen, Erkki | Hofman, Albert | Ripatti, Samuli | Metspalu, Andres | Thorsteinsdottir, Unnur | Stefansson, Kari | Pedersen, Nancy L. | McCarthy, Mark I. | Ingelsson, Erik | Prokopenko, Inga
PLoS Medicine  2013;10(6):e1001474.
In this study, Prokopenko and colleagues provide novel evidence for causal relationship between adiposity and heart failure and increased liver enzymes using a Mendelian randomization study design.
Please see later in the article for the Editors' Summary
Background
The association between adiposity and cardiometabolic traits is well known from epidemiological studies. Whilst the causal relationship is clear for some of these traits, for others it is not. We aimed to determine whether adiposity is causally related to various cardiometabolic traits using the Mendelian randomization approach.
Methods and Findings
We used the adiposity-associated variant rs9939609 at the FTO locus as an instrumental variable (IV) for body mass index (BMI) in a Mendelian randomization design. Thirty-six population-based studies of individuals of European descent contributed to the analyses.
Age- and sex-adjusted regression models were fitted to test for association between (i) rs9939609 and BMI (n = 198,502), (ii) rs9939609 and 24 traits, and (iii) BMI and 24 traits. The causal effect of BMI on the outcome measures was quantified by IV estimators. The estimators were compared to the BMI–trait associations derived from the same individuals. In the IV analysis, we demonstrated novel evidence for a causal relationship between adiposity and incident heart failure (hazard ratio, 1.19 per BMI-unit increase; 95% CI, 1.03–1.39) and replicated earlier reports of a causal association with type 2 diabetes, metabolic syndrome, dyslipidemia, and hypertension (odds ratio for IV estimator, 1.1–1.4; all p<0.05). For quantitative traits, our results provide novel evidence for a causal effect of adiposity on the liver enzymes alanine aminotransferase and gamma-glutamyl transferase and confirm previous reports of a causal effect of adiposity on systolic and diastolic blood pressure, fasting insulin, 2-h post-load glucose from the oral glucose tolerance test, C-reactive protein, triglycerides, and high-density lipoprotein cholesterol levels (all p<0.05). The estimated causal effects were in agreement with traditional observational measures in all instances except for type 2 diabetes, where the causal estimate was larger than the observational estimate (p = 0.001).
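The abstract does not state which IV estimator was used. One common choice in Mendelian randomization with a single variant is the ratio (Wald) estimator, which divides the variant-outcome association by the variant-exposure association. The sketch below assumes that estimator and a first-order standard error that ignores uncertainty in the variant-exposure estimate; the function name and all input values are hypothetical.

```python
def wald_ratio(beta_zy, se_zy, beta_zx):
    """Ratio (Wald) IV estimate of the exposure effect on the outcome.

    beta_zy: variant-outcome association (e.g. log odds per allele)
    se_zy:   its standard error
    beta_zx: variant-exposure association (e.g. BMI units per allele)
    """
    if beta_zx == 0:
        raise ValueError("instrument has no effect on the exposure")
    estimate = beta_zy / beta_zx
    # First-order approximation: treats beta_zx as known without error.
    se = abs(se_zy / beta_zx)
    return estimate, se

# Hypothetical inputs: 0.05 log-odds per allele, 0.25 BMI units per allele.
est, se = wald_ratio(beta_zy=0.05, se_zy=0.01, beta_zx=0.25)
```

The result is expressed per unit of exposure (here, per BMI unit), which is what allows it to be compared with the observational BMI–trait association from the same individuals.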
Conclusions
We provide novel evidence for a causal relationship between adiposity and heart failure as well as between adiposity and increased liver enzymes.
Editors' Summary
Cardiovascular disease (CVD)—disease that affects the heart and/or the blood vessels—is a major cause of illness and death worldwide. In the US, for example, coronary heart disease—a CVD in which narrowing of the heart's blood vessels by fatty deposits slows the blood supply to the heart and may eventually cause a heart attack—is the leading cause of death, and stroke—a CVD in which the brain's blood supply is interrupted—is the fourth leading cause of death. Globally, both the incidence of CVD (the number of new cases in a population every year) and its prevalence (the proportion of the population with CVD) are increasing, particularly in low- and middle-income countries. This increasing burden of CVD is occurring in parallel with a global increase in the incidence and prevalence of obesity—having an unhealthy amount of body fat (adiposity)—and of metabolic diseases—conditions such as diabetes in which metabolism (the processes that the body uses to make energy from food) is disrupted, with resulting high blood sugar and damage to the blood vessels.
Why Was This Study Done?
Epidemiological studies—investigations that record the patterns and causes of disease in populations—have reported an association between adiposity (indicated by an increased body mass index [BMI], which is calculated by dividing body weight in kilograms by height in meters squared) and cardiometabolic traits such as coronary heart disease, stroke, heart failure (a condition in which the heart is incapable of pumping sufficient amounts of blood around the body), diabetes, high blood pressure (hypertension), and high blood cholesterol (dyslipidemia). However, observational studies cannot prove that adiposity causes any particular cardiometabolic trait because overweight individuals may share other characteristics (confounding factors) that are the real causes of both obesity and the cardiometabolic disease. Moreover, it is possible that having CVD or a metabolic disease causes obesity (reverse causation). For example, individuals with heart failure cannot do much exercise, so heart failure may cause obesity rather than vice versa. Here, the researchers use “Mendelian randomization” to examine whether adiposity is causally related to various cardiometabolic traits. Because gene variants are inherited randomly, they are not prone to confounding and are free from reverse causation. It is known that a genetic variant (rs9939609) within the genome region that encodes the fat-mass- and obesity-associated gene (FTO) is associated with increased BMI. Thus, an investigation of the associations between rs9939609 and cardiometabolic traits can indicate whether obesity is causally related to these traits.
What Did the Researchers Do and Find?
The researchers analyzed the association between rs9939609 (the “instrumental variable,” or IV) and BMI, between rs9939609 and 24 cardiometabolic traits, and between BMI and the same traits using genetic and health data collected in 36 population-based studies of nearly 200,000 individuals of European descent. They then quantified the strength of the causal association between BMI and the cardiometabolic traits by calculating “IV estimators.” Higher BMI showed a causal relationship with heart failure, metabolic syndrome (a combination of medical disorders that increases the risk of developing CVD), type 2 diabetes, dyslipidemia, hypertension, increased blood levels of liver enzymes (an indicator of liver damage; some metabolic disorders involve liver damage), and several other cardiometabolic traits. All the IV estimators were similar to the BMI–cardiovascular trait associations (observational estimates) derived from the same individuals, with the exception of diabetes, where the causal estimate was higher than the observational estimate, probably because the observational estimate is based on a single BMI measurement, whereas the causal estimate considers lifetime changes in BMI.
What Do These Findings Mean?
Like all Mendelian randomization studies, the reliability of the causal associations reported here depends on several assumptions made by the researchers. Nevertheless, these findings provide support for many previously suspected and biologically plausible causal relationships, such as that between adiposity and hypertension. They also provide new insights into the causal effect of obesity on liver enzyme levels and on heart failure. In the latter case, these findings suggest that a one-unit increase in BMI might increase the incidence of heart failure by 17%. In the US, this corresponds to 113,000 additional cases of heart failure for every unit increase in BMI at the population level. Although additional studies are needed to confirm and extend these findings, these results suggest that global efforts to reduce the burden of obesity will likely also reduce the occurrence of CVD and metabolic disorders.
Additional Information
Please access these websites via the online version of this summary at http://dx.doi.org/10.1371/journal.pmed.1001474.
The American Heart Association provides information on all aspects of cardiovascular disease and tips on keeping the heart healthy, including weight management (in several languages); its website includes personal stories about stroke and heart attacks
The US Centers for Disease Control and Prevention has information on heart disease, stroke, and all aspects of overweight and obesity (in English and Spanish)
The UK National Health Service Choices website provides information about cardiovascular disease and obesity, including a personal story about losing weight
The World Health Organization provides information on obesity (in several languages)
The International Obesity Taskforce provides information about the global obesity epidemic
Wikipedia has a page on Mendelian randomization (note: Wikipedia is a free online encyclopedia that anyone can edit; available in several languages)
MedlinePlus provides links to other sources of information on heart disease, on vascular disease, on obesity, and on metabolic disorders (in English and Spanish)
The International Association for the Study of Obesity provides maps and information about obesity worldwide
The International Diabetes Federation has a web page that describes types, complications, and risk factors of diabetes
doi:10.1371/journal.pmed.1001474
PMCID: PMC3692470  PMID: 23824655
PLoS Medicine  2011;8(10):e1001112.
Using mendelian randomization, Roman Pfister and colleagues demonstrate a potentially causal link between low levels of B-type natriuretic peptide (BNP), a hormone released by damaged hearts, and the development of type 2 diabetes.
Background
Genetic and epidemiological evidence suggests an inverse association between B-type natriuretic peptide (BNP) levels in blood and risk of type 2 diabetes (T2D), but the prospective association of BNP with T2D is uncertain, and it is unclear whether the association is confounded.
Methods and Findings
We analysed the association between levels of the N-terminal fragment of pro-BNP (NT-pro-BNP) in blood and risk of incident T2D in a prospective case-cohort study and genotyped the variant rs198389 within the BNP locus in three T2D case-control studies. We combined our results with existing data in a meta-analysis of 11 case-control studies. Using a Mendelian randomization approach, we compared the observed association between rs198389 and T2D to that expected from the NT-pro-BNP level to T2D association and the NT-pro-BNP difference per C allele of rs198389. In participants of our case-cohort study who were free of T2D and cardiovascular disease at baseline, we observed a 21% (95% CI 3%–36%) decreased risk of incident T2D per one standard deviation (SD) higher log-transformed NT-pro-BNP levels in analysis adjusted for age, sex, body mass index, systolic blood pressure, smoking, family history of T2D, history of hypertension, and levels of triglycerides, high-density lipoprotein cholesterol, and low-density lipoprotein cholesterol. The association between rs198389 and T2D observed in case-control studies (odds ratio = 0.94 per C allele, 95% CI 0.91–0.97) was similar to that expected (0.96, 0.93–0.98) based on the pooled estimate for the log-NT-pro-BNP level to T2D association derived from a meta-analysis of our study and published data (hazard ratio = 0.82 per SD, 0.74–0.90) and the difference in NT-pro-BNP levels (0.22 SD, 0.15–0.29) per C allele of rs198389. No significant associations were observed between the rs198389 genotype and potential confounders.
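The expected per-allele association quoted above can be checked with a short calculation. The sketch assumes the expected value is obtained by scaling the log hazard ratio per SD of NT-pro-BNP by the per-allele difference in SD units, and that the hazard ratio is taken as an approximation of the odds ratio; the abstract does not spell out these steps, so they are assumptions. The inputs are the point estimates reported in the abstract.

```python
import math

def expected_per_allele_ratio(ratio_per_sd, sd_per_allele):
    """Expected per-allele ratio implied by a per-SD ratio and a per-allele SD shift."""
    return math.exp(math.log(ratio_per_sd) * sd_per_allele)

# Reported inputs: hazard ratio 0.82 per SD, 0.22 SD difference per C allele.
expected = expected_per_allele_ratio(0.82, 0.22)
# round(expected, 2) gives 0.96, the expected value reported in the abstract
```

The observed odds ratio of 0.94 per C allele is close to this implied value, which is the comparison the Mendelian randomization argument rests on.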
Conclusions
Our results provide evidence for a potential causal role of the BNP system in the aetiology of T2D. Further studies are needed to investigate the mechanisms underlying this association and possibilities for preventive interventions.
Please see later in the article for the Editors' Summary
Editors' Summary
Background
Worldwide, nearly 250 million people have diabetes, and this number is increasing rapidly. Diabetes is characterized by dangerous amounts of sugar (glucose) in the blood. Blood sugar levels are normally controlled by insulin, a hormone that the pancreas releases after meals (digestion of food produces glucose). In people with type 2 diabetes (the most common form of diabetes), blood sugar control fails because the fat and muscle cells that usually respond to insulin by removing sugar from the blood become insulin resistant. Type 2 diabetes can be controlled with diet and exercise, and with drugs that help the pancreas make more insulin or that make cells more sensitive to insulin. The long-term complications of diabetes, which include kidney failure and an increased risk of cardiovascular problems such as heart disease and stroke, reduce the life expectancy of people with diabetes by about 10 years compared to people without diabetes.
Why Was This Study Done?
Because the causes of type 2 diabetes are poorly understood, it is hard to devise ways to prevent the condition. Recently, B-type natriuretic peptide (BNP, a hormone released by damaged hearts) has been implicated in type 2 diabetes development in cross-sectional studies (investigations in which data are collected at a single time point from a population to look for associations between an illness and potential risk factors). Although these studies suggest that high levels of BNP may protect against type 2 diabetes, they cannot prove a causal link between BNP levels and diabetes because the study participants with low BNP levels may share some other unknown factor (a confounding factor) that is the real cause of both diabetes and altered BNP levels. Here, the researchers use an approach called “Mendelian randomization” to examine whether reduced BNP levels contribute to causing type 2 diabetes. It is known that a common genetic variant (rs198389) within the genome region that encodes BNP is associated with a reduced risk of type 2 diabetes. Because gene variants are inherited randomly, they are not subject to confounding. So, by investigating the association between BNP gene variants that alter NT-pro-BNP (a molecule created when BNP is being produced) levels and the development of type 2 diabetes, the researchers can discover whether BNP is causally involved in this chronic condition.
What Did the Researchers Do and Find?
The researchers analyzed the association between blood levels of NT-pro-BNP at baseline in 440 participants of the EPIC-Norfolk study (a prospective population-based study of lifestyle factors and the risk of chronic diseases) who subsequently developed diabetes and in 740 participants who did not develop diabetes. In this prospective case-cohort study, the risk of developing type 2 diabetes was associated with lower NT-pro-BNP levels. They also genotyped (sequenced) rs198389 in the participants of three case-control studies of type 2 diabetes (studies in which potential risk factors for type 2 diabetes were examined in people with type 2 diabetes and matched controls living in the East of England), and combined these results with those of eight similar published case-control studies. Finally, the researchers showed that the association between rs198389 and type 2 diabetes measured in the case-control studies was similar to the expected association calculated from the association between NT-pro-BNP level and type 2 diabetes obtained from the prospective case-cohort study and the association between rs198389 and BNP levels obtained from the EPIC-Norfolk study and other published studies.
What Do These Findings Mean?
The results of this Mendelian randomization study provide evidence for a causal, protective role of the BNP hormone system in the development of type 2 diabetes. That is, these findings suggest that low levels of BNP are partly responsible for the development of type 2 diabetes. Because the participants in all the individual studies included in this analysis were of European descent, these findings may not be generalizable to other ethnicities. Moreover, they provide no explanation of how alterations in the BNP hormone system might affect the development of type 2 diabetes. Nevertheless, the demonstration of a causal link between the BNP hormone system and type 2 diabetes suggests that BNP may be a potential target for interventions designed to prevent type 2 diabetes, particularly since the feasibility of altering BNP levels with drugs has already been proven in patients with cardiovascular disease.
Additional Information
Please access these websites via the online version of this summary at http://dx.doi.org/10.1371/journal.pmed.1001112.
The International Diabetes Federation provides information about all aspects of diabetes
The US National Diabetes Information Clearinghouse provides detailed information about diabetes for patients, health-care professionals, and the general public (in English and Spanish)
The UK National Health Service Choices website also provides information for patients and carers about type 2 diabetes and includes people's stories about diabetes
MedlinePlus provides links to further resources and advice about diabetes (in English and Spanish)
Wikipedia has pages on BNP and on Mendelian randomization (note: Wikipedia is a free online encyclopedia that anyone can edit; available in several languages)
The charity Healthtalkonline has interviews with people about their experiences of diabetes; the charity Diabetes UK has a further selection of stories from people with diabetes
doi:10.1371/journal.pmed.1001112
PMCID: PMC3201934  PMID: 22039354
PLoS ONE  2010;5(10):e13336.
The population of Costa Rica (CR) represents an admixture of major continental populations. An investigation of the CR population structure would provide an important foundation for mapping genetic variants underlying common diseases and traits. We conducted an analysis of 1,301 women from the Guanacaste region of CR using 27,904 single nucleotide polymorphisms (SNPs) genotyped on a custom Illumina InfiniumII iSelect chip. The program STRUCTURE was used to compare the CR Guanacaste sample with four continental reference samples, including HapMap Europeans (CEU), East Asians (JPT+CHB), West African Yoruba (YRI), as well as Native Americans (NA) from the Illumina iControl database. Our results show that the CR Guanacaste sample comprises a three-way admixture estimated to be 43% European, 38% Native American and 15% West African. An estimated 4% residual Asian ancestry may be within the error range. Results from principal components analysis reveal a correlation between genetic and geographic distance. The magnitude of linkage disequilibrium (LD) measured by the number of tagging SNPs required to cover the same region in the genome in the CR Guanacaste sample appeared to be weaker than that observed in CEU, JPT+CHB and NA reference samples but stronger than that of the HapMap YRI sample. Based on the clustering pattern observed in both STRUCTURE and principal components analysis, two subpopulations were identified that differ by approximately 20% in LD block size averaged over all LD blocks identified by Haploview. We also show in a simulated association study conducted within the two subpopulations, that the failure to account for population stratification (PS) could lead to a noticeable inflation in the false positive rate. However, we further demonstrate that existing PS adjustment approaches can reduce the inflation to an acceptable level for gene discovery.
doi:10.1371/journal.pone.0013336
PMCID: PMC2954167  PMID: 20967209
PLoS Medicine  2014;11(10):e1001751.
In this study, Richards and colleagues undertook a Mendelian randomization study to determine whether vitamin D binding protein (DBP) levels have a causal effect on common calcemic and cardiometabolic diseases. They concluded that DBP has no demonstrable causal effect on any of the diseases or traits investigated here, except Vit D levels.
Please see later in the article for the Editors' Summary
Background
Observational studies have shown that vitamin D binding protein (DBP) levels, a key determinant of 25-hydroxy-vitamin D (25OHD) levels, and 25OHD levels themselves both associate with risk of disease. If 25OHD levels have a causal influence on disease, and DBP lies in this causal pathway, then DBP levels should likewise be causally associated with disease. We undertook a Mendelian randomization study to determine whether DBP levels have causal effects on common calcemic and cardiometabolic disease.
Methods and Findings
We measured DBP and 25OHD levels in 2,254 individuals, followed for up to 10 y, in the Canadian Multicentre Osteoporosis Study (CaMos). Using the single nucleotide polymorphism rs2282679 as an instrumental variable, we applied Mendelian randomization methods to determine the causal effect of DBP on calcemic (osteoporosis and hyperparathyroidism) and cardiometabolic diseases (hypertension, type 2 diabetes, coronary artery disease, and stroke) and related traits, first in CaMos and then in large-scale genome-wide association study consortia. The effect allele was associated with an age- and sex-adjusted decrease in DBP level of 27.4 mg/l (95% CI 24.7, 30.0; n = 2,254). DBP had a strong observational and causal association with 25OHD levels (p = 3.2 × 10^−19). While DBP levels were observationally associated with calcium and body mass index (BMI), these associations were not supported by causal analyses. Despite well-powered sample sizes from consortia, there were no associations of rs2282679 with any other traits and diseases: fasting glucose (0.00 mmol/l [95% CI −0.01, 0.01]; p = 1.00; n = 46,186); fasting insulin (0.01 pmol/l [95% CI −0.00, 0.01]; p = 0.22; n = 46,186); BMI (0.00 kg/m^2 [95% CI −0.01, 0.01]; p = 0.80; n = 127,587); bone mineral density (0.01 g/cm^2 [95% CI −0.01, 0.03]; p = 0.36; n = 32,961); mean arterial pressure (−0.06 mm Hg [95% CI −0.19, 0.07]; p = 0.36; n = 28,775); ischemic stroke (odds ratio [OR] = 1.00 [95% CI 0.97, 1.04]; p = 0.92; n = 12,389/62,004 cases/controls); coronary artery disease (OR = 1.02 [95% CI 0.99, 1.05]; p = 0.31; n = 22,233/64,762); or type 2 diabetes (OR = 1.01 [95% CI 0.97, 1.05]; p = 0.76; n = 9,580/53,810).
Conclusions
DBP has no demonstrable causal effect on any of the diseases or traits investigated here, except 25OHD levels. It remains to be determined whether 25OHD has a causal effect on these outcomes independent of DBP.
Editors' Summary
Background
Vitamin D deficiency is an increasingly common public health concern. According to some estimates, more than a billion people worldwide may be vitamin D deficient. Indeed, many people living in the US and Europe (in particular, elderly people, breastfed infants, people with dark skin, and obese individuals) have serum (circulating) 25-hydroxy-vitamin D (25OHD) levels below 50 nmol/l, the threshold for vitamin D deficiency. Vitamin D helps the body absorb calcium, a mineral that is essential for healthy bones. Consequently, vitamin D deficiency can lead to calcemic diseases such as rickets (a condition that affects bone development in children), osteomalacia (soft bones in adults), and osteoporosis (a condition in which the bones weaken and become susceptible to fracture). We get most of our vitamin D needs from our skin, which makes vitamin D after exposure to sunlight. Vitamin D is also found naturally in oily fish and eggs, and is added to some other foods, including cereals and milk, but some people need to take vitamin D supplements to avoid vitamin D deficiency.
Why Was This Study Done?
Observational studies have reported that low levels of serum 25OHD and serum vitamin D binding protein (DBP, a key determinant of serum 25OHD level) are both associated with the risk of several common diseases and traits. Such studies have implicated vitamin D deficiency in cardiometabolic disease (cardiovascular diseases that affect the heart and/or blood vessels and metabolic diseases that affect the cellular chemical reactions needed to sustain life), in some cancers, and in Alzheimer disease. But observational studies cannot prove that vitamin D deficiency or DBP levels actually cause any of these diseases. So, for example, an observational study might report an association between vitamin D deficiency and type 2 diabetes (a metabolic disease), but the individuals who develop type 2 diabetes might share another unknown characteristic that is actually responsible for disease development (a confounding factor). Alternatively, type 2 diabetes might reduce circulating vitamin D levels (reverse causation). Here, the researchers undertake a Mendelian randomization study to determine whether circulating DBP levels have causal effects on calcemic and cardiometabolic diseases. In Mendelian randomization, causality is inferred from associations between genetic variants that mimic the influence of a modifiable environmental exposure and the outcome of interest. Because gene variants are inherited randomly, they are not prone to confounding and are free from reverse causation. So, if low DBP levels lead to low serum 25OHD levels, and vitamin D levels have a causal effect on common diseases, genetic variants associated with low DBP levels should be associated with the development of common diseases.
What Did the Researchers Do and Find?
The researchers analyzed the association between a genetic variant called single nucleotide polymorphism (SNP) rs2282679, which is known to alter DBP levels, and calcemic and cardiometabolic diseases and related traits in 2,254 participants in the Canadian Multicentre Osteoporosis Study (CaMos). The researchers report that there was a strong association between SNP rs2282679 and both serum DBP and 25OHD levels among the CaMos participants. However, there were no significant associations (associations unlikely to have occurred by chance) between SNP rs2282679 and calcium level, osteoporosis, or several cardiometabolic diseases, including heart attacks and diabetes. Moreover, when the researchers examined publicly available genome-wide association study data collected by several international consortia investigating genetic influences on disease, they found no significant associations between rs2282679 and a wide range of calcemic and cardiometabolic diseases.
What Do These Findings Mean?
In this Mendelian randomization study, DBP level had no demonstrable causal effect on any of the calcemic or cardiometabolic diseases or traits investigated, except 25OHD level. Because most of the participants in CaMos and the international consortia were of European descent, these findings are applicable only to people of European ancestry. Moreover, as with all Mendelian randomization studies, the reliability of these findings depends on several assumptions made by the researchers. Notably, although this study strongly suggests that DBP level does not have a causal influence on several common diseases, it remains to be determined whether 25OHD has a causal effect on any calcemic or cardiometabolic outcomes independent of DBP level.
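The logic of Mendelian randomization described in this summary can be illustrated with a minimal sketch. It uses a simple ratio (Wald) estimator on simulated data; the estimator choice, the function names, and every number in the simulation are assumptions made for illustration and are not taken from the study.

```python
import random
import statistics


def slope(x, y):
    """Least-squares slope from regressing y on x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def wald_ratio(genotype, exposure, outcome):
    """Causal estimate: (variant-outcome slope) / (variant-exposure slope)."""
    return slope(genotype, outcome) / slope(genotype, exposure)


def simulate(n=20000, causal_effect=0.0, seed=1):
    """Hypothetical data: a variant shifts the exposure, and an unmeasured
    confounder shifts both the exposure and the outcome."""
    rng = random.Random(seed)
    genotype, exposure, outcome = [], [], []
    for _ in range(n):
        g = int(rng.random() < 0.3) + int(rng.random() < 0.3)  # allele count 0, 1 or 2
        u = rng.gauss(0, 1)                                     # unmeasured confounder
        x = 0.5 * g + u + rng.gauss(0, 1)                       # exposure
        y = causal_effect * x + u + rng.gauss(0, 1)             # outcome
        genotype.append(g)
        exposure.append(x)
        outcome.append(y)
    return genotype, exposure, outcome


g, x, y = simulate()               # the exposure has no causal effect here
observational = slope(x, y)        # confounded, so clearly different from zero
mendelian = wald_ratio(g, x, y)    # close to the true causal effect of zero
```

In this simulation the observational slope is pulled away from zero by the confounder, while the ratio estimate stays near zero because the variant is independent of the confounder.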
Additional Information
Please access these websites via the online version of this summary at http://dx.doi.org/10.1371/journal.pmed.1001751.
The UK National Health Service Choices website provides information about vitamin D and about how to get vitamin D from sunshine; “Behind the Headlines” articles describe a recent observational study that reported an association between vitamin D deficiency and Alzheimer disease and the media coverage of this study, other health claims made for vitamin D, and a randomized control trial that questioned the role of vitamin D in disease
The US National Institutes of Health Office of Dietary Supplements provides information about vitamin D (in English and Spanish)
The US Centers for Disease Control and Prevention provides information about the vitamin D status of the US population
MedlinePlus has links to further information about vitamin D (in English and Spanish)
Information about the Canadian Multicentre Osteoporosis Study is available
Wikipedia has a page on Mendelian randomization (note: Wikipedia is a free online encyclopedia that anyone can edit; available in several languages)
doi:10.1371/journal.pmed.1001751
PMCID: PMC4211663  PMID: 25350643
Genetic epidemiology  2011;35(Suppl 1):S56-S60.
As part of Genetic Analysis Workshop 17 (GAW17), our group considered the application of novel and standard approaches to the analysis of genotype-phenotype association in next-generation sequencing data. Our group identified a major issue in the analysis of the GAW17 next-generation sequencing data: type I error and false-positive report probability rates higher than those expected based on empirical type I error levels (as high as 90%). Two main causes emerged: population stratification and long-range correlation (gametic phase disequilibrium) between rare variants. Population stratification was expected because of the diverse sample. Correlation between rare variants was attributable to both random causes (e.g., nearly 10,000 of 25,000 markers were private variants, and the sample size was small [n = 697]) and nonrandom causes (more correlation was observed than was expected by random chance). Principal components analysis was used to control for population structure and helped to minimize type I errors, but this was at the expense of identifying fewer causal variants. A novel multiple regression approach showed promise to handle correlation between markers. Further work is needed, first, to identify best practices for the control of type I errors in the analysis of sequencing data and then to explore and compare the many promising new aggregating approaches for identifying markers associated with disease phenotypes.
doi:10.1002/gepi.20650
PMCID: PMC3249221  PMID: 22128060
population structure; correlated markers; next-generation sequencing
PLoS Genetics  2014;10(12):e1004818.
A large fraction of human genes are regulated by genetic variation near the transcribed sequence (cis-eQTL, expression quantitative trait locus), and many cis-eQTLs have implications for human disease. Less is known regarding the effects of genetic variation on expression of distant genes (trans-eQTLs) and their biological mechanisms. In this work, we use genome-wide data on SNPs and array-based expression measures from mononuclear cells obtained from a population-based cohort of 1,799 Bangladeshi individuals to characterize cis- and trans-eQTLs and determine if observed trans-eQTL associations are mediated by expression of transcripts in cis with the SNPs showing trans-association, using Sobel tests of mediation. We observed 434 independent trans-eQTL associations at a false-discovery rate of 0.05, and 189 of these trans-eQTLs were also cis-eQTLs (enrichment P<0.0001). Among these 189 trans-eQTL associations, 39 were significantly attenuated after adjusting for a cis-mediator based on Sobel P<10⁻⁵. We attempted to replicate 21 of these mediation signals in two European cohorts, and while only 7 trans-eQTL associations were present in one or both cohorts, 6 showed evidence of cis-mediation. Analyses of simulated data show that complete mediation will be observed as partial mediation in the presence of mediator measurement error or imperfect LD between measured and causal variants. Our data demonstrate that trans-associations can become significantly stronger or switch directions after adjusting for a potential mediator. Using simulated data, we demonstrate that this phenomenon is expected in the presence of strong cis-trans confounding and when the measured cis-transcript is correlated with the true (unmeasured) mediator. In conclusion, by applying mediation analysis to eQTL data, we show that a substantial fraction of observed trans-eQTL associations can be explained by cis-mediation. Future studies should focus on understanding the mechanisms underlying widespread cis-mediation and their relevance to disease biology, as well as using mediation analysis to improve eQTL discovery.
Author Summary
Expression quantitative trait locus (eQTL) studies have demonstrated that human genes can be regulated by genetic variation residing close to the gene (cis-eQTLs) or in a distant region or on a different chromosome (trans-eQTLs). While cis-eQTL variants are likely to affect transcription factor binding or chromatin structure, our understanding of the mechanisms underlying trans-eQTLs is incomplete. We hypothesize that a substantial fraction of trans-eQTLs influence expression of distant genes through mediation by expression levels of a cis-transcript. In this paper, we use genome-wide SNPs and expression data for 1,799 South Asians to identify cis- and trans-eQTLs and to test our hypothesis using Sobel tests of mediation. Among 189 observed trans-eQTL associations, we provide evidence of cis-mediation for 39, 6 of which show mediation in an independent European cohort. We used simulated data to demonstrate that complete mediation will be observed as partial mediation in the presence of mediator measurement error or imperfect LD between measured and causal variants. We also demonstrate how unobserved confounding variables and incorrect mediator selection can bias mediation estimates. In conclusion, we have identified cis-mediators for many trans-eQTLs and described a mediation analysis approach that can be used to validate, characterize, and enhance discovery of trans-eQTLs.
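The Sobel test of mediation named above can be sketched in a few lines. This is the textbook first-order form of the statistic; whether the authors used exactly this variant is an assumption, and the regression estimates passed in below are hypothetical values chosen only to show the call.

```python
import math


def sobel_test(a, se_a, b, se_b):
    """Sobel z statistic and two-sided p-value for the indirect effect a*b.

    a, se_a: effect of the SNP on the cis-transcript and its standard error.
    b, se_b: effect of the cis-transcript on the trans-gene (adjusted for
             the SNP) and its standard error.
    """
    z = (a * b) / math.sqrt(b ** 2 * se_a ** 2 + a ** 2 * se_b ** 2)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return z, p


# Hypothetical regression estimates, for illustration only.
z, p = sobel_test(a=0.5, se_a=0.1, b=0.4, se_b=0.1)
```

A small p-value indicates that the indirect path (SNP to cis-transcript to trans-gene) is unlikely to be zero, which is the sense in which a trans-association is said to be cis-mediated.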
doi:10.1371/journal.pgen.1004818
PMCID: PMC4256471  PMID: 25474530
BMC Proceedings  2009;3(Suppl 7):S107.
Background
To account for population stratification in association studies, principal-components analysis is often performed on single-nucleotide polymorphisms (SNPs) across the genome. Here, we use Framingham Heart Study (FHS) Genetic Analysis Workshop 16 data to compare the performance of local ancestry adjustment for population stratification based on principal components (PCs) estimated from SNPs in a local chromosomal region with global ancestry adjustment based on PCs estimated from genome-wide SNPs.
Methods
Standardized height residuals from unrelated adults from the FHS Offspring Cohort were averaged from longitudinal data. PCs of SNP genotype data were calculated to represent each individual's ancestry either 1) globally using all SNPs across the genome or 2) locally using SNPs in adjacent 20-Mbp regions within each chromosome. We assessed the extent to which there were differences in association studies of height depending on whether PCs for global, local, or both global and local ancestry were included as covariates.
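As a rough illustration of the "adjacent 20-Mbp regions" used for local PCs, the sketch below groups the SNPs of one chromosome into consecutive windows by base-pair position. The function name and the example positions are hypothetical; only the 20-Mbp window size comes from the text.

```python
from collections import defaultdict


def snps_by_window(positions, window_bp=20_000_000):
    """Map window index -> list of SNP indices whose position falls in that window.

    positions: base-pair positions of the SNPs on a single chromosome.
    """
    windows = defaultdict(list)
    for idx, pos in enumerate(positions):
        windows[pos // window_bp].append(idx)
    return dict(windows)


# Hypothetical positions on one chromosome.
groups = snps_by_window([5, 19_999_999, 20_000_000, 45_000_000])
```

Local PCs would then be computed separately from the SNPs in each group, while global PCs use all SNPs together.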
Results
The correlations between local and global PCs were low (r < 0.12), suggesting variability between local and global ancestry estimates. Genome-wide association tests without any ancestry adjustment demonstrated an inflated type I error rate that decreased with adjustment for local ancestry, global ancestry, or both. A known spurious association was replicated for SNPs within the lactase gene, and this false-positive association was abolished by adjustment with local or global ancestry PCs.
Conclusion
Population stratification is a potential source of bias in this seemingly homogeneous FHS population. However, local and global PCs derived from SNPs appear to provide adequate information about ancestry.
PMCID: PMC2795878  PMID: 20017971
BMC Proceedings  2009;3(Suppl 7):S108.
Population structure occurs when a sample is composed of individuals with different ancestries and can result in excess type I error in genome-wide association studies. Genome-wide principal-component analysis (PCA) has become a popular method for identifying and adjusting for subtle population structure in association studies. Using the Genetic Analysis Workshop 16 (GAW16) NARAC data, we explore two unresolved issues concerning the use of genome-wide PCA to account for population structure in genetic association studies: the choice of single-nucleotide polymorphism (SNP) subset and the choice of adjustment model. We computed PCs for subsets of genome-wide SNPs with varying levels of LD. The first two PCs were similar for all subsets and the first three PCs were associated with case status for all subsets. When the PCs associated with case status were included as covariates in an association model, the reduction in genomic inflation factor was similar for all SNP sets. Several models have been proposed to account for structure using PCs, but it is not yet clear whether the different methods will produce substantively different results for association studies with individuals of European descent. We compared genome-wide association p-values and results for two positive-control SNPs previously associated with rheumatoid arthritis using four PC adjustment methods as well as no adjustment and genomic control. We found that in this sample, adjusting for the continuous PCs or adjusting for discrete clusters identified using the PCs adequately accounts for the case-control population structure, but that a recently proposed randomization test performs poorly.
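The genomic inflation factor and genomic control mentioned above can be sketched as follows. The sketch assumes 1-degree-of-freedom chi-square association statistics and uses the common median-based definition; the simulated statistics and the 20% inflation are hypothetical.

```python
import random
import statistics

CHI2_1DF_MEDIAN = 0.4549364  # median of a chi-square distribution with 1 df


def genomic_inflation(chi2_stats):
    """Genomic inflation factor: observed median statistic / expected median."""
    return statistics.median(chi2_stats) / CHI2_1DF_MEDIAN


rng = random.Random(0)
null_stats = [rng.gauss(0, 1) ** 2 for _ in range(20000)]  # well-calibrated tests
inflated_stats = [1.2 * s for s in null_stats]             # hypothetical 20% inflation

lambda_null = genomic_inflation(null_stats)          # close to 1
lambda_inflated = genomic_inflation(inflated_stats)  # close to 1.2

# Genomic control: rescale every statistic by the estimated inflation factor.
corrected = [s / lambda_inflated for s in inflated_stats]
```

A value near 1 suggests little inflation; values well above 1 point to structure or another systematic bias, which PC adjustment aims to reduce.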
PMCID: PMC2795879  PMID: 20017972
PLoS Medicine  2014;11(12):e1001765.
In this study, Wurtz and colleagues investigated to what extent elevated body mass index (BMI) within the normal weight range has causal influences on the detailed systemic metabolite profile in early adulthood using Mendelian randomization analysis.
Please see later in the article for the Editors' Summary
Background
Increased adiposity is linked with higher risk for cardiometabolic diseases. We aimed to determine to what extent elevated body mass index (BMI) within the normal weight range has causal effects on the detailed systemic metabolite profile in early adulthood.
Methods and Findings
We used Mendelian randomization to estimate causal effects of BMI on 82 metabolic measures in 12,664 adolescents and young adults from four population-based cohorts in Finland (mean age 26 y, range 16–39 y; 51% women; mean ± standard deviation BMI 24 ± 4 kg/m²). Circulating metabolites were quantified by high-throughput nuclear magnetic resonance metabolomics and biochemical assays. In cross-sectional analyses, elevated BMI was adversely associated with cardiometabolic risk markers throughout the systemic metabolite profile, including lipoprotein subclasses, fatty acid composition, amino acids, inflammatory markers, and various hormones (p<0.0005 for 68 measures). Metabolite associations with BMI were generally stronger for men than for women (median 136%, interquartile range 125%–183%). A gene score for predisposition to elevated BMI, composed of 32 established genetic correlates, was used as the instrument to assess causality. Causal effects of elevated BMI closely matched observational estimates (correspondence 87% ± 3%; R² = 0.89), suggesting causative influences of adiposity on the levels of numerous metabolites (p<0.0005 for 24 measures), including lipoprotein lipid subclasses and particle size, branched-chain and aromatic amino acids, and inflammation-related glycoprotein acetyls. Causal analyses of certain metabolites and potential sex differences warrant stronger statistical power. Metabolite changes associated with change in BMI during 6 y of follow-up were examined for 1,488 individuals. Change in BMI was accompanied by widespread metabolite changes, which had an association pattern similar to that of the cross-sectional observations, yet with greater metabolic effects (correspondence 160% ± 2%; R² = 0.92).
Conclusions
Mendelian randomization indicates causal adverse effects of increased adiposity on multiple cardiometabolic risk markers across the metabolite profile in adolescents and young adults within the non-obese weight range. Consistent with the causal influences of adiposity, weight changes were paralleled by extensive metabolic changes, suggesting a broadly modifiable systemic metabolite profile in early adulthood.
Editors' Summary
Background
Adiposity, having excessive body fat, is a growing global threat to public health. Body mass index (BMI, calculated by dividing a person's weight in kilograms by their height in meters squared) is a coarse indicator of excess body weight, but the measure is useful in large population studies. Compared to people with a lean body weight (a BMI of 18.5–24.9 kg/m²), individuals with higher BMI have an elevated risk of developing life-shortening cardiometabolic diseases: cardiovascular diseases that affect the heart and/or the blood vessels (for example, heart failure and stroke) and metabolic diseases that affect the cellular chemical reactions that sustain life (for example, diabetes). People become unhealthily fat by consuming food and drink that contain more energy (calories) than they need for their daily activities. So adiposity can be prevented and reversed by eating less and exercising more.
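The BMI calculation described above amounts to one line of arithmetic. The short sketch below shows it together with a check against the lean range quoted in the summary; the function names and the example weight and height are hypothetical.

```python
def bmi(weight_kg, height_m):
    """Body mass index: weight in kilograms divided by height in meters squared."""
    return weight_kg / height_m ** 2


def in_lean_range(value):
    """True if a BMI value falls in the 18.5-24.9 kg/m^2 range quoted above."""
    return 18.5 <= value <= 24.9


example = bmi(70, 1.75)  # a 70 kg person who is 1.75 m tall
```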
Why Was This Study Done?
Epidemiological studies, which record the patterns of risk factors and disease in populations, suggest that the illness and death associated with excess body weight are partly attributable to abnormalities in how individuals with high adiposity metabolize carbohydrates and fats, leading to higher blood sugar and cholesterol levels. Further, adiposity is also associated with many deviations in the metabolic profile beyond these commonly measured risk factors. However, epidemiological studies cannot prove that adiposity causes specific changes in a person's systemic (overall) metabolic profile because individuals with high BMI may share other characteristics (confounding factors) that are the actual causes of both adiposity and metabolic abnormalities. Moreover, having a change in some aspect of metabolism could also lead to adiposity, rather than vice versa (reverse causation). Importantly, if there is a causal effect of adiposity on cardiometabolic risk factor levels, it might be possible to prevent the progression towards cardiometabolic diseases by weight loss. Here, the researchers use “Mendelian randomization” to examine whether increased BMI within the normal and overweight range causally influences the metabolic risk factors from many biological pathways during early adulthood. Because gene variants are inherited randomly, they are not prone to confounding and are free from reverse causation. Several gene variants are known to lead to modestly increased BMI. Thus, an investigation of the associations between these gene variants and risk factors across the systemic metabolite profile in a population of healthy individuals can indicate whether higher BMI is causally related to known and novel metabolic risk factors and higher cardiometabolic disease risk.
What Did the Researchers Do and Find?
The researchers measured the BMI of 12,664 adolescents and young adults (average BMI 24.7 kg/m²) living in Finland and the blood levels of 82 metabolites in these young individuals at a single time point. Statistical analysis of these data indicated that elevated BMI was adversely associated with numerous cardiometabolic risk factors. For example, elevated BMI was associated with raised levels of low-density lipoprotein, “bad” cholesterol that increases cardiovascular disease risk. Next, the researchers used a gene score for predisposition to increased BMI, composed of 32 gene variants correlated with increased BMI, as an “instrumental variable” to assess whether adiposity causes metabolite abnormalities. The effects on the systemic metabolite profile of a 1-kg/m² increment in BMI due to genetic predisposition closely matched the effects of an observed 1-kg/m² increment in adulthood BMI on the metabolic profile. That is, higher levels of adiposity had causal effects on the levels of numerous blood-based metabolic risk factors, including higher levels of low-density lipoprotein cholesterol and triglyceride-carrying lipoproteins, protein markers of chronic inflammation and adverse liver function, impaired insulin sensitivity, and elevated concentrations of several amino acids that have recently been linked with the risk for developing diabetes. Elevated BMI also causally led to lower levels of certain high-density lipoprotein lipids in the blood, a marker for the risk of future cardiovascular disease. Finally, an examination of the metabolic changes associated with changes in BMI in 1,488 young adults after a period of six years showed that those metabolic measures that were most strongly associated with BMI at a single time point likewise displayed the highest responsiveness to weight change over time.
What Do These Findings Mean?
These findings suggest that increased adiposity has causal adverse effects on multiple cardiometabolic risk markers in non-obese young adults beyond the effects on cholesterol and blood sugar. As with all Mendelian randomization studies, the reliability of the causal association reported here depends on several assumptions made by the researchers. Importantly, the results of both the causal effect analyses and the longitudinal study suggest that there is no threshold below which a BMI increase does not adversely affect the metabolic profile, and that a systemic metabolic profile linked with high cardiometabolic disease risk that becomes established during early adulthood can be reversed. Overall, these findings therefore highlight the importance of weight reduction as a key target for metabolic risk factor control among young adults.
Additional Information
Please access these websites via the online version of this summary at http://dx.doi.org/10.1371/journal.pmed.1001765.
The Computational Medicine Research Team of the University of Oulu has a webpage that provides further information on metabolite profiling by high-throughput NMR metabolomics
The World Health Organization provides information on obesity (in several languages)
The Global Burden of Disease Study website provides the latest details about global obesity trends
The UK National Health Service Choices website provides information about obesity, cardiovascular disease, and type 2 diabetes (including some personal stories)
The American Heart Association provides information on all aspects of cardiovascular disease and diabetes and on keeping healthy; its website includes personal stories about heart attacks, stroke, and diabetes
The US Centers for Disease Control and Prevention has information on all aspects of overweight and obesity and information about heart disease, stroke, and diabetes
MedlinePlus provides links to other sources of information on heart disease, vascular disease, and obesity (in English and Spanish)
Wikipedia has a page on Mendelian randomization (note: Wikipedia is a free online encyclopedia that anyone can edit; available in several languages)
doi:10.1371/journal.pmed.1001765
PMCID: PMC4260795  PMID: 25490400
PLoS Medicine  2007;4(4):e125.
Background
The epidermal growth factor receptor (EGFR) gene is the prototype member of the type I receptor tyrosine kinase (TK) family and plays a pivotal role in cell proliferation and differentiation. There are three well described polymorphisms that are associated with increased protein production in experimental systems: a polymorphic dinucleotide repeat (CA simple sequence repeat 1 [CA-SSR1]) in intron one (lower number of repeats) and two single nucleotide polymorphisms (SNPs) in the promoter region, −216 (G/T or T/T) and −191 (C/A or A/A). The objective of this study was to examine distributions of these three polymorphisms and their relationships to each other and to EGFR gene mutations and allelic imbalance (AI) in non-small cell lung cancers.
Methods and Findings
We examined the frequencies of the three polymorphisms of EGFR in 556 resected lung cancers and corresponding non-malignant lung tissues from 336 East Asians, 213 individuals of Northern European descent, and seven of other ethnicities. We also studied the EGFR gene in 93 corresponding non-malignant lung tissue samples from European-descent patients from Italy and in peripheral blood mononuclear cells from 250 normal healthy US individuals enrolled in epidemiological studies including individuals of European descent, African–Americans, and Mexican–Americans. We sequenced the four exons (18–21) of the TK domain known to harbor activating mutations in tumors and examined the status of the CA-SSR1 alleles (presence of heterozygosity, repeat number of the alleles, and relative amplification of one allele) and allele-specific amplification of mutant tumors as determined by a standardized semiautomated method of microsatellite analysis. Variant forms of SNP −216 (G/T or T/T) and SNP −191 (C/A or A/A) (associated with higher protein production in experimental systems) were less frequent in East Asians than in individuals of other ethnicities (p < 0.001). Both alleles of CA-SSR1 were significantly longer in East Asians than in individuals of other ethnicities (p < 0.001). Expression studies using bronchial epithelial cultures demonstrated a trend towards increased mRNA expression in cultures having the variant SNP −216 G/T or T/T genotypes. Monoallelic amplification of the CA-SSR1 locus was present in 30.6% of the informative cases and occurred more often in individuals of East Asian ethnicity. AI was present in 44.4% (95% confidence interval: 34.1%–54.7%) of mutant tumors compared with 25.9% (20.6%–31.2%) of wild-type tumors (p = 0.002). The shorter allele in tumors with AI in East Asian individuals was selectively amplified (shorter allele dominant) more often in mutant tumors (75.0%, 61.6%–88.4%) than in wild-type tumors (43.5%, 31.8%–55.2%, p = 0.003). 
In addition, there was a strong positive association between AI ratios of CA-SSR1 alleles and AI of mutant alleles.
Conclusions
The three polymorphisms associated with increased EGFR protein production (shorter CA-SSR1 length and variant forms of SNPs −216 and −191) were found to be rare in East Asians as compared to other ethnicities, suggesting that the cells of East Asians may make relatively less intrinsic EGFR protein. Interestingly, especially in tumors from patients of East Asian ethnicity, EGFR mutations were found to favor the shorter allele of CA-SSR1, and selective amplification of the shorter allele of CA-SSR1 occurred frequently in tumors harboring a mutation. These distinct molecular events targeting the same allele would both be predicted to result in greater EGFR protein production and/or activity. Our findings may help to explain some of the ethnic differences observed in mutational frequencies and responses to TK inhibitors.
Masaharu Nomura and colleagues examine the distribution of EGFR polymorphisms in different populations and find differences that might explain different responses to tyrosine kinase inhibitors in lung cancer patients.
Editors' Summary
Background.
Most cases of lung cancer, the leading cause of cancer deaths worldwide, are “non-small cell lung cancer” (NSCLC), which has a very low cure rate. Recently, however, “targeted” therapies have brought new hope to patients with NSCLC. Like all cancers, NSCLC occurs when cells begin to divide uncontrollably because of changes (mutations) in their genetic material. Chemotherapy drugs treat cancer by killing these rapidly dividing cells, but, because some normal tissues are sensitive to these agents, it is hard to kill the cancer completely without causing serious side effects. Targeted therapies specifically attack the changes in cancer cells that allow them to divide uncontrollably, so it might be possible to kill the cancer cells selectively without damaging normal tissues. Epidermal growth factor receptor (EGFR) was one of the first molecules for which a targeted therapy was developed. In normal cells, messenger proteins bind to EGFR and activate its “tyrosine kinase,” an enzyme that sticks phosphate groups on tyrosine (an amino acid) in other proteins. These proteins then tell the cell to divide. Alterations to this signaling system drive the uncontrolled growth of some cancers, including NSCLC.
Why Was This Study Done?
Molecules that inhibit the tyrosine kinase activity of EGFR (for example, gefitinib) dramatically shrink some NSCLCs, particularly those in East Asian patients. Tumors shrunk by tyrosine kinase inhibitors (TKIs) often (but not always) have mutations in EGFR's tyrosine kinase. However, not all tumors with these mutations respond to TKIs, and other genetic changes—for example, amplification (multiple copies) of the EGFR gene—also affect tumor responses to TKIs. It would be useful to know which genetic changes predict these responses when planning treatments for NSCLC and to understand why the frequency of these changes varies between ethnic groups. In this study, the researchers have examined three polymorphisms—differences in DNA sequences that occur between individuals—in the EGFR gene in people with and without NSCLC. In addition, they have looked for associations between these polymorphisms, which are present in every cell of the body, and the EGFR gene mutations and allelic imbalances (genes occur in pairs but amplification or loss of one copy, or allele, often causes allelic imbalance in tumors) that occur in NSCLCs.
What Did the Researchers Do and Find?
The researchers measured how often three EGFR polymorphisms (the length of a repeat sequence called CA-SSR1, and two single nucleotide variations [SNPs])—all of which probably affect how much protein is made from the EGFR gene—occurred in normal tissue and NSCLC tissue from East Asians and individuals of European descent. They also looked for mutations in the EGFR tyrosine kinase and allelic imbalance in the tumors, and then determined which genetic variations and alterations tended to occur together in people with the same ethnicity. Among many associations, the researchers found that shorter alleles of CA-SSR1 and the minor forms of the two SNPs occurred less often in East Asians than in individuals of European descent. They also confirmed that EGFR kinase mutations were more common in NSCLCs in East Asians than in European-descent individuals. Furthermore, mutations occurred more often in tumors with allelic imbalance, and in tumors where there was allelic imbalance and an EGFR mutation, the mutant allele was amplified more often than the wild-type allele.
What Do These Findings Mean?
The researchers use these associations between gene variants and tumor-associated alterations to propose a model to explain the ethnic differences in mutational frequencies and responses to TKIs seen in NSCLC. They suggest that because of the polymorphisms in the EGFR gene commonly seen in East Asians, people from this ethnic group make less EGFR protein than people from other ethnic groups. This would explain why, if a threshold level of EGFR is needed to drive cells towards malignancy, East Asians have a high frequency of amplified EGFR tyrosine kinase mutations in their tumors—mutation followed by amplification would be needed to activate EGFR signaling. This model, though speculative, helps to explain some clinical findings, such as the frequency of EGFR mutations and of TKI sensitivity in NSCLCs in East Asians. Further studies of this type in different ethnic groups and in different tumors, as well as with other genes for which targeted therapies are available, should help oncologists provide personalized cancer therapies for their patients.
Additional Information.
Please access these Web sites via the online version of this summary at http://dx.doi.org/10.1371/journal.pmed.0040125.
US National Cancer Institute information on lung cancer and on cancer treatment for patients and professionals
MedlinePlus encyclopedia entries on NSCLC
Cancer Research UK information for patients about all aspects of lung cancer, including treatment with TKIs
Wikipedia pages on lung cancer, EGFR, and gefitinib (note that Wikipedia is a free online encyclopedia that anyone can edit)
doi:10.1371/journal.pmed.0040125
PMCID: PMC1876407  PMID: 17455987
In genetic association studies, it is necessary to correct for population structure to avoid inference bias. During the past decade, prevailing corrections often only involved adjustments of global ancestry differences between sampled individuals. Nevertheless, population structure may vary across local genomic regions due to the variability of local ancestries associated with natural selection, migration, or random genetic drift. Adjusting for global ancestry alone may be inadequate when local population structure is an important confounding factor. In contrast, adjusting for local ancestry can more effectively prevent false-positives due to local population structure. To more accurately locate disease genes, we recommend adjusting for local ancestries by interrogating local structure. In practice, locus-specific ancestries are usually unknown and cannot be accurately inferred when ancestral population information is not available. For such scenarios, we propose employing local principal components (PC) to represent local ancestries and adjusting for local PCs when testing for genotype–phenotype association. With an acceptable computation burden, the proposed algorithm successfully eliminates the known spurious association between SNPs in the LCT gene and height due to the population structure in European Americans.
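Adjusting an association test for a principal component can be sketched as below. This is an illustration rather than the proposed algorithm: it assumes a linear model, obtains only the first PC (by power iteration on standardized genotypes, chosen to avoid external libraries), and adjusts by residualizing both phenotype and genotype on that PC, which gives the same SNP coefficient as including the PC as a covariate. All data are simulated and every name and parameter is hypothetical; passing only the SNPs of one region yields a local PC.

```python
import math
import random
import statistics


def ols(x, y):
    """Return (intercept, slope) of the least-squares line y ~ x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    b1 = sxy / sxx
    return my - b1 * mx, b1


def residuals(x, y):
    """Residuals of y after regressing it on x (with intercept)."""
    b0, b1 = ols(x, y)
    return [b - (b0 + b1 * a) for a, b in zip(x, y)]


def standardize(column):
    """Center a column to mean 0 and scale it to unit variance."""
    mu, sd = statistics.fmean(column), statistics.pstdev(column)
    return [(v - mu) / sd for v in column]


def first_pc(columns, iterations=20):
    """First-PC scores for standardized SNP columns, by power iteration."""
    m, n = len(columns), len(columns[0])
    w = [1.0 / math.sqrt(m)] * m  # starting SNP loadings
    for _ in range(iterations):
        scores = [sum(w[j] * columns[j][i] for j in range(m)) for i in range(n)]
        w = [sum(c * s for c, s in zip(col, scores)) for col in columns]
        norm = math.sqrt(sum(v * v for v in w))
        w = [v / norm for v in w]
    return [sum(w[j] * columns[j][i] for j in range(m)) for i in range(n)]


# Simulated sample: two subpopulations differing in allele frequency and in
# mean phenotype; no SNP has any causal effect on the phenotype.
rng = random.Random(7)
n, m = 1000, 50
pop = [i % 2 for i in range(n)]
freq = {0: 0.3, 1: 0.6}
geno = [[int(rng.random() < freq[p]) + int(rng.random() < freq[p]) for p in pop]
        for _ in range(m)]
pheno = [2.0 * p + rng.gauss(0, 1) for p in pop]

pc1 = first_pc([standardize(col) for col in geno])
snp = geno[0]
_, unadjusted = ols(snp, pheno)  # spurious association caused by structure
_, adjusted = ols(residuals(pc1, snp), residuals(pc1, pheno))  # PC-adjusted
```

In this simulation the unadjusted slope is clearly non-zero although the SNP is not causal, and adjusting for the first PC removes most of that spurious signal.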
doi:10.1007/978-1-61779-555-8_21
PMCID: PMC3589145  PMID: 22307710
Genome-wide association studies; Local ancestries; Local principal components; Migration; Random genetic drift; Natural selection; Genomic inflation factor; Genomic control; Local ancestry principal components correction; Fine mapping
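The local-PC adjustment described in the abstract above can be illustrated with a minimal sketch. The abstract does not give the algorithm's details, so the window size, the number of PCs, the exclusion of the test SNP from the window, and the use of ordinary least squares are all assumptions made here for illustration; the function names `local_pcs` and `snp_effect_adjusted` are hypothetical.

```python
import numpy as np

def local_pcs(genotypes, center, half_window=50, n_pcs=2):
    """Top principal components of the SNPs in a window around `center`.

    genotypes: (n_individuals, n_snps) array of allele counts (0/1/2).
    The test SNP itself is left out of the window (an assumption made here
    so that the PCs do not simply reproduce the SNP being tested).
    """
    lo = max(0, center - half_window)
    hi = min(genotypes.shape[1], center + half_window + 1)
    cols = [j for j in range(lo, hi) if j != center]
    window = genotypes[:, cols].astype(float)
    window -= window.mean(axis=0)  # center each SNP
    # SVD of the centered matrix: columns of u scaled by s are the PC scores
    u, s, _ = np.linalg.svd(window, full_matrices=False)
    return u[:, :n_pcs] * s[:n_pcs]

def snp_effect_adjusted(phenotype, genotypes, center, **kwargs):
    """OLS coefficient of the test SNP with local PCs as covariates."""
    pcs = local_pcs(genotypes, center, **kwargs)
    n = genotypes.shape[0]
    design = np.column_stack([np.ones(n), genotypes[:, center], pcs])
    coef, *_ = np.linalg.lstsq(design, phenotype, rcond=None)
    return coef[1]
```

In a simulated sample where two subpopulations differ in both allele frequencies and mean phenotype, the unadjusted slope for a non-causal SNP is clearly nonzero, while the slope returned by `snp_effect_adjusted` should shrink toward zero.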
PLoS ONE  2013;8(10):e77720.
Complex human diseases commonly differ in their phenotypic characteristics, e.g., Crohn’s disease (CD) patients are heterogeneous with regard to disease location and disease extent. The genetic susceptibility to Crohn’s disease is widely acknowledged and has been demonstrated by identification of over 100 CD-associated genetic loci. However, relating CD subphenotypes to disease susceptibility loci has proven to be a difficult task. In this paper we discuss the use of cluster analysis on genetic markers to identify genetic-based subgroups while taking into account possible confounding by population stratification. We show that it is highly relevant to consider the confounding nature of population stratification so that detected clusters reflect disease-specific groups rather than population groups. Therefore, we explain the use of principal components (PCs) to correct for population stratification while clustering affected individuals into genetic-based subgroups. The principal components are obtained using 30 ancestry informative markers (AIMs), and the first two PCs are determined to discriminate between continental origins of the affected individuals. Genotypes on 51 CD-associated single nucleotide polymorphisms (SNPs) are used to perform latent class analysis, hierarchical cluster analysis, and Partitioning Around Medoids (PAM) cluster analysis within a sample of affected individuals, with and without the use of principal components to adjust for population stratification. Without correction for population stratification, the clusters appear to be influenced by population structure, whereas with correction the clusters are unrelated to the continental origin of individuals.
doi:10.1371/journal.pone.0077720
PMCID: PMC3798408  PMID: 24147066
PLoS Medicine  2014;11(7):e1001669.
In this study, Granell and colleagues used Mendelian randomization to investigate causal effects of BMI, fat mass, and lean mass on current asthma at age 7½ years in the Avon Longitudinal Study of Parents and Children (ALSPAC) and found that higher BMI increases the risk of asthma in mid-childhood.
Please see later in the article for the Editors' Summary
Background
Observational studies have reported associations between body mass index (BMI) and asthma, but confounding and reverse causality remain plausible explanations. We aim to investigate evidence for a causal effect of BMI on asthma using a Mendelian randomization approach.
Methods and Findings
We used Mendelian randomization to investigate causal effects of BMI, fat mass, and lean mass on current asthma at age 7½ years in the Avon Longitudinal Study of Parents and Children (ALSPAC). A weighted allele score based on 32 independent BMI-related single nucleotide polymorphisms (SNPs) was derived from external data, and associations with BMI, fat mass, lean mass, and asthma were estimated. We derived instrumental variable (IV) estimates of causal risk ratios (RRs). 4,835 children had available data on BMI-associated SNPs, asthma, and BMI. The weighted allele score was strongly associated with BMI, fat mass, and lean mass (all p-values < 0.001) and with childhood asthma (RR 2.56, 95% CI 1.38–4.76 per unit score, p = 0.003). The estimated causal RR for the effect of BMI on asthma was 1.55 (95% CI 1.16–2.07) per kg/m², p = 0.003. This effect appeared stronger for non-atopic (1.90, 95% CI 1.19–3.03) than for atopic asthma (1.37, 95% CI 0.89–2.11) though there was little evidence of heterogeneity (p = 0.31). The estimated causal RRs for the effects of fat mass and lean mass on asthma were 1.41 (95% CI 1.11–1.79) per 0.5 kg and 2.25 (95% CI 1.23–4.11) per kg, respectively. The possibility of genetic pleiotropy could not be discounted completely; however, additional IV analyses using FTO variant rs1558902 and the other BMI-related SNPs separately provided similar causal effects with wider confidence intervals. Loss to follow-up was unlikely to bias the estimated effects.
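One common way to form an IV estimate of a causal risk ratio is the ratio (Wald) estimator, which divides the log RR for the score-outcome association by the score-exposure regression coefficient. The abstract does not state which estimator the authors used, so the sketch below is an assumption, and the function name `wald_ratio_rr` and the input values in the usage line are hypothetical.

```python
import math

def wald_ratio_rr(rr_score_outcome, beta_score_exposure):
    """Ratio (Wald) estimate of the causal RR per unit of exposure.

    rr_score_outcome:    RR of the outcome per unit of the allele score.
    beta_score_exposure: change in the exposure per unit of the allele score.
    Assumes a log-linear effect of the exposure on the outcome.
    """
    if rr_score_outcome <= 0 or beta_score_exposure == 0:
        raise ValueError("RR must be positive and beta must be nonzero")
    return math.exp(math.log(rr_score_outcome) / beta_score_exposure)

# Hypothetical inputs, not taken from the study: an RR of 4.0 per unit score
# and an exposure increase of 2.0 units per unit score.
causal_rr = wald_ratio_rr(4.0, 2.0)
```

The division happens on the log scale because risk ratios combine multiplicatively: if one unit of the score raises the exposure by two units, the per-unit causal RR is the square root of the score-outcome RR.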
Conclusions
Higher BMI increases the risk of asthma in mid-childhood. Higher BMI may have contributed to the increase in asthma risk toward the end of the 20th century.
Editors' Summary
Background
The global burden of asthma, a chronic (long-term) condition caused by inflammation of the airways (the tubes that carry air in and out of the lungs), has been rising steadily over the past few decades. It is estimated that, nowadays, 200–300 million adults and children worldwide are affected by asthma. Although asthma can develop at any age, it is often diagnosed in childhood—asthma is the most common chronic disease in children. In people with asthma, the airways can react very strongly to allergens such as animal fur or to irritants such as cigarette smoke, becoming narrower so that less air can enter the lungs. Exercise, cold air, and infections can also trigger asthma attacks, which can be fatal. The symptoms of asthma include wheezing, coughing, chest tightness, and shortness of breath. Asthma cannot be cured, but drugs can relieve its symptoms and prevent acute asthma attacks.
Why Was This Study Done?
We cannot halt the ongoing rise in global asthma rates without understanding the causes of asthma. Some experts think obesity may be one cause of asthma. Obesity, like asthma, is increasingly common, and observational studies (investigations that ask whether individuals exposed to a suspected risk factor for a condition develop that condition more often than unexposed individuals) in children have reported that body mass index (BMI, an indicator of body fat calculated by dividing a person's weight in kilograms by their height in meters squared) is positively associated with asthma. Observational studies cannot prove that obesity causes asthma because of “confounding.” Overweight children with asthma may share another unknown characteristic (confounder) that actually causes both obesity and asthma. Moreover, children with asthma may be less active than unaffected children, so they become overweight (reverse causality). Here, the researchers use “Mendelian randomization” to assess whether BMI has a causal effect on asthma. In Mendelian randomization, causality is inferred from associations between genetic variants that mimic the effect of a modifiable risk factor and the outcome of interest. Because gene variants are inherited randomly, they are not prone to confounding and are free from reverse causation. So, if a higher BMI leads to asthma, genetic variants associated with increased BMI should be associated with an increased risk of asthma.
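A small simulation can illustrate why a randomly inherited variant sidesteps confounding. This is a sketch under stated assumptions, not the study's analysis: the outcome is continuous rather than binary asthma, every coefficient is a hypothetical value, and `simulate_mr` is a hypothetical name.

```python
import random

def simulate_mr(n=100_000, causal_effect=0.0, seed=1):
    """Compare a naive regression slope with a genetic-instrument estimate.

    All coefficients below are hypothetical values chosen for illustration.
    """
    rng = random.Random(seed)
    g, x, y = [], [], []
    for _ in range(n):
        # Allele count (0, 1, or 2), drawn independently of the confounder
        gi = (rng.random() < 0.3) + (rng.random() < 0.3)
        ui = rng.gauss(0, 1)                             # unmeasured confounder
        xi = 0.5 * gi + ui + rng.gauss(0, 1)             # exposure (e.g., BMI)
        yi = causal_effect * xi + ui + rng.gauss(0, 1)   # outcome
        g.append(gi)
        x.append(xi)
        y.append(yi)

    def cov(a, b):
        ma, mb = sum(a) / n, sum(b) / n
        return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / (n - 1)

    naive = cov(x, y) / cov(x, x)   # observational slope, biased by the confounder
    ratio = cov(g, y) / cov(g, x)   # instrument-based (ratio) estimate
    return naive, ratio
```

With `causal_effect=0.0`, the naive slope is clearly positive because the confounder drives both exposure and outcome, whereas the instrument-based estimate stays close to zero.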
What Did the Researchers Do and Find?
The researchers investigated causal effects of BMI, fat mass, and lean mass on current asthma at age 7½ years in 4,835 children enrolled in the Avon Longitudinal Study of Parents and Children (ALSPAC, a long-term health project that started in 1991). They calculated an allele score for each child based on 32 BMI-related genetic variants, and estimated associations between this score and BMI, fat mass and lean mass (both measured using a special type of X-ray scanner; in children BMI is not a good indicator of “fatness”), and asthma. They report that the allele score was strongly associated with BMI, fat mass, and lean mass, and with childhood asthma. The estimated causal relative risk (risk ratio) for the effect of BMI on asthma was 1.55 per kg/m². That is, the risk of asthma increased by 55% for every extra unit of BMI. The estimated causal relative risks for the effects of fat mass and lean mass on asthma were 1.41 per 0.5 kg and 2.25 per kg, respectively.
What Do These Findings Mean?
These findings suggest that a higher BMI increases the risk of asthma in mid-childhood and that global increases in BMI toward the end of the 20th century may have contributed to the global increase in asthma that occurred at the same time. It is possible that the observed association between BMI and asthma reported in this study is underpinned by “genetic pleiotropy” (a potential limitation of all Mendelian randomization analyses). That is, some of the genetic variants included in the BMI allele score could conceivably also increase the risk of asthma. Nevertheless, these findings suggest that public health interventions designed to reduce obesity may also help to limit the global rise in asthma.
Additional Information
Please access these websites via the online version of this summary at http://dx.doi.org/10.1371/journal.pmed.1001669.
The US Centers for Disease Control and Prevention provides information on asthma and on all aspects of overweight and obesity (in English and Spanish)
The World Health Organization provides information on asthma and on obesity (in several languages)
The UK National Health Service Choices website provides information about asthma, about asthma in children, and about obesity (including real stories)
The Global Asthma Report 2011 is available
The Global Initiative for Asthma released its updated Global Strategy for Asthma Management and Prevention on World Asthma Day 2014
Information about the Avon Longitudinal Study of Parents and Children is available
MedlinePlus provides links to further information on obesity in children, on asthma, and on asthma in children (in English and Spanish)
Wikipedia has a page on Mendelian randomization (note: Wikipedia is a free online encyclopedia that anyone can edit; available in several languages)
doi:10.1371/journal.pmed.1001669
PMCID: PMC4077660  PMID: 24983943
