1.  Data sharing in large research consortia: experiences and recommendations from ENGAGE 
Data sharing is essential for the conduct of cutting-edge research and is increasingly required by funders concerned with maximising the scientific yield from research data collections. International research consortia are encouraged to share data intra-consortia, inter-consortia and with the wider scientific community. Little is reported regarding the factors that hinder or facilitate data sharing in these different situations. This paper provides results from a survey conducted in the European Network for Genetic and Genomic Epidemiology (ENGAGE) that collected information from its participating institutions about their data-sharing experiences. The questionnaire queried about potential hurdles to data sharing, concerns about data sharing, lessons learned and recommendations for future collaborations. Overall, the survey results reveal that data sharing functioned well in ENGAGE and highlight areas that posed the most frequent hurdles for data sharing. Further challenges arise for international data sharing beyond the consortium. These challenges are described and steps to help address these are outlined.
doi:10.1038/ejhg.2013.131
PMCID: PMC3925260  PMID: 23778872
biobanks; data sharing; consortia; genetic research
2.  Conditional and joint multiple-SNP analysis of GWAS summary statistics identifies additional variants influencing complex traits 
Nature genetics  2012;44(4):369-S3.
We present an approximate conditional and joint association analysis that can use summary-level statistics from a meta-analysis of genome-wide association studies (GWAS) and estimated linkage disequilibrium (LD) from a reference sample with individual-level genotype data. Using this method, we analyzed meta-analysis summary data from the GIANT Consortium for height and body mass index (BMI), with the LD structure estimated from genotype data in two independent cohorts. We identified 36 loci with multiple associated variants for height (38 leading and 49 additional SNPs, 87 in total) via a genome-wide SNP selection procedure. The 49 new SNPs explain approximately 1.3% of variance, nearly doubling the heritability explained at the 36 loci. We did not find any locus showing multiple associated SNPs for BMI. The method we present is computationally fast and is also applicable to case-control data, which we demonstrate in an example from meta-analysis of type 2 diabetes by the DIAGRAM Consortium.
doi:10.1038/ng.2213
PMCID: PMC3593158  PMID: 22426310
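As a rough illustration of the idea behind entry 2, the sketch below converts marginal (single-SNP) effects into joint effects for two SNPs using only their LD correlation. It assumes standardized genotypes and phenotype, so that each marginal effect equals the SNP's own joint effect plus the LD-weighted joint effect of the other SNP. The published method also uses allele frequencies, sample sizes and standard errors, which are omitted here. The function name and the example values are hypothetical.

```python
def joint_effects_two_snps(b1_marginal, b2_marginal, r):
    """Recover joint effects of two SNPs from marginal effects and LD.

    Simplifying assumption (not from the abstract): genotypes and phenotype
    are standardized, so marginal = R * joint with R = [[1, r], [r, 1]].
    Inverting the 2x2 LD matrix gives the expressions below.
    """
    if abs(r) >= 1.0:
        raise ValueError("SNPs in perfect LD cannot be separated")
    denom = 1.0 - r * r
    b1_joint = (b1_marginal - r * b2_marginal) / denom
    b2_joint = (b2_marginal - r * b1_marginal) / denom
    return b1_joint, b2_joint


# Hypothetical summary statistics for two SNPs with LD correlation r = 0.5.
b1, b2 = joint_effects_two_snps(0.125, 0.1, 0.5)
print(b1, b2)
```

With more than two SNPs the same step becomes a linear solve against the full LD matrix estimated from the reference sample.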
3.  Ethnic variation in the activity of lipid desaturases and their relationships with cardiovascular risk factors in control women and an at-risk group with previous gestational diabetes mellitus: a cross-sectional study 
Background
Lipid desaturase enzymes mediate the metabolism of fatty acids to long chain polyunsaturated fatty acids and their activities are related to metabolic risk factors for Type 2 diabetes (T2DM) and coronary heart disease (CHD). There are marked ethnic differences in risks of CHD and T2DM but little is known about ethnic differences in desaturase activities.
Methods
Samples from a study of CVD risk in women with previous gestational diabetes were analysed for percentage fatty acids in plasma free fatty acid, triglyceride, cholesterol ester and phospholipid pools for 89 white European, 53 African Caribbean and 56 Asian Indian women. The fatty acid desaturase activities, stearoyl-CoA desaturase (SCD, calculated separately for C16 and C18 fatty acids), delta 6 desaturase (D6D) and delta 5 desaturase (D5D) were estimated from precursor-to-product ratios and their relationships with adiposity, blood pressure, cholesterol, triglycerides, HDL cholesterol and insulin sensitivity explored. Ethnic differences in desaturase activities independent of ethnic variation in risk factor correlates of desaturase activities were then identified.
Results
There was significant ethnic variation in age, BMI, waist circumference, blood pressure, serum triglycerides and HDL cholesterol concentrations and insulin resistance. Desaturase activities showed significant correlations, independent of ethnicity, with BMI, waist circumference, triglycerides and HDL cholesterol. Independent of ethnic variation in BMI, waist circumference, triglycerides and HDL cholesterol, SCD-16 activity, calculated from each of the four lipid pools measured, was 18–35 percent higher in white Europeans than in African Caribbeans or Asian Indians (all p < 0.001). Similar, though less consistent, differences were apparent for SCD-18 activity. Also independently of risk factor variation, but specifically when calculated from the cholesterol ester and phospholipid pools, D6D activity was significantly lower in Asian Indians, and D5D activity higher in African Caribbeans.
Conclusions
Significant ethnic differences exist in desaturase activities, independently of ethnic variation in other risk factors. These characteristics did not accord with higher risk of T2DM among African Caribbeans and Asian Indians nor with lower risk of CHD among African Caribbeans but did accord with the higher risk of CHD in Asian Indians.
doi:10.1186/1476-511X-12-25
PMCID: PMC3605319  PMID: 23496836
Ethnicity; Lipids; Blood pressure; Insulin resistance; Stearoyl-CoA desaturase; Delta 6 desaturase; Delta 5 desaturase; Fatty acids; Desaturase activities; Triglycerides; HDL cholesterol
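The ratio-based activity estimates described in entry 3 could be sketched as follows. The abstract does not list which fatty acids form each ratio, so the pairs below are commonly used product and precursor pairs and should be read as an assumption, as should the product-over-precursor direction. The input percentages are hypothetical.

```python
# ASSUMPTION: commonly used (product, precursor) pairs; the abstract does not
# state the exact fatty acids behind each desaturase index.
DESATURASE_RATIOS = {
    "SCD-16": ("16:1n-7", "16:0"),
    "SCD-18": ("18:1n-9", "18:0"),
    "D6D": ("18:3n-6", "18:2n-6"),
    "D5D": ("20:4n-6", "20:3n-6"),
}


def desaturase_indices(percent_fatty_acids):
    """Estimate desaturase activity indices for one lipid pool.

    percent_fatty_acids maps fatty acid names to their percentage of the
    pool. An index is skipped when either value is missing or the
    precursor percentage is zero.
    """
    indices = {}
    for name, (product, precursor) in DESATURASE_RATIOS.items():
        product_pct = percent_fatty_acids.get(product)
        precursor_pct = percent_fatty_acids.get(precursor)
        if product_pct is None or not precursor_pct:
            continue
        indices[name] = product_pct / precursor_pct
    return indices


# Hypothetical percentages for a single plasma lipid pool.
example_pool = {"16:0": 20.0, "16:1n-7": 2.0, "18:0": 10.0, "18:1n-9": 25.0}
print(desaturase_indices(example_pool))
```

Running the function once per lipid pool (free fatty acid, triglyceride, cholesterol ester, phospholipid) would mirror the per-pool comparison in the abstract.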
4.  Epigenetic silencing of HNF1A associates with changes in the composition of the human plasma N-glycome 
Epigenetics  2012;7(2):164-172.
Protein glycosylation is a ubiquitous modification that affects the structure and function of proteins. Our recent genome wide association study identified transcription factor HNF1A as an important regulator of plasma protein glycosylation. To evaluate the potential impact of epigenetic regulation of HNF1A on protein glycosylation we analyzed CpG methylation in 810 individuals. The association between methylation of four CpG sites and the composition of plasma and IgG glycomes was analyzed. Several statistically significant associations were observed between HNF1A methylation and plasma glycans, while there were no significant associations with IgG glycans. The most consistent association with HNF1A methylation was observed with the increase in the proportion of highly branched glycans in the plasma N-glycome. The hypothesis that inactivation of HNF1A promotes branching of glycans was supported by the analysis of plasma N-glycomes in 61 patients with inactivating mutations in HNF1A, where the increase in plasma glycan branching was also observed. This study represents the first demonstration of epigenetic regulation of plasma glycome composition, suggesting a potential mechanism by which epigenetic deregulation of the glycome may contribute to disease development.
doi:10.4161/epi.7.2.18918
PMCID: PMC3335910  PMID: 22395466
protein glycosylation; plasma glycome; HNF1A; CpG methylation; epigenetics
5.  Transcriptome and genome sequencing uncovers functional variation in humans 
Nature  2013;501(7468):506-511.
Summary
Genome sequencing projects are discovering millions of genetic variants in humans, and interpretation of their functional effects is essential for understanding the genetic basis of variation in human traits. Here we report sequencing and deep analysis of mRNA and miRNA from lymphoblastoid cell lines of 462 individuals from the 1000 Genomes Project – the first uniformly processed RNA-seq data from multiple human populations with high-quality genome sequences. We discovered extremely widespread genetic variation affecting regulation of the majority of genes, with transcript structure and expression level variation being equally common but genetically largely independent. Our characterization of causal regulatory variation sheds light on cellular mechanisms of regulatory and loss-of-function variation, and allowed us to infer putative causal variants for dozens of disease-associated loci. Altogether, this study provides a deep understanding of the cellular mechanisms of transcriptome variation and of the landscape of functional variants in the human genome.
doi:10.1038/nature12531
PMCID: PMC3918453  PMID: 24037378
7.  Genome-wide association study identifies variants in TMPRSS6 associated with hemoglobin levels 
Nature genetics  2009;41(11):1170-1172.
We carried out a genome-wide association study of hemoglobin levels in 16,001 individuals of European and Indian Asian ancestry. The most closely associated SNP (rs855791) results in a nonsynonymous (V736A) change in the serine protease domain of TMPRSS6 and a blood hemoglobin concentration 0.13 (95% CI 0.09–0.17) g/dl lower per copy of allele A (P = 1.6 × 10⁻¹³). Our findings suggest that TMPRSS6, a regulator of hepcidin synthesis and iron handling, is crucial in hemoglobin level maintenance.
doi:10.1038/ng.462
PMCID: PMC3178047  PMID: 19820698
8.  Mendelian Randomization Studies do not Support a Role for Raised Circulating Triglyceride Levels influencing Type 2 Diabetes, Glucose Levels, or Insulin Resistance 
Diabetes  2011;60(3):1008-1018.
Objective
The causal nature of associations between circulating triglycerides, insulin resistance and type 2 diabetes is unclear. We aimed to use Mendelian randomization to test the hypothesis that raised circulating triglyceride levels causally influence the risk of type 2 diabetes, raised normal fasting glucose levels, and hepatic insulin resistance.
Research design and methods
We tested 10 common genetic variants robustly associated with circulating triglyceride levels against type 2 diabetes status in 5637 cases, 6860 controls, and four continuous outcomes (reflecting glycemia and hepatic insulin resistance) in 8271 non-diabetic individuals from four studies.
Results
Individuals carrying greater numbers of triglyceride-raising alleles had increased circulating triglyceride levels (0.59 SD [95% CI: 0.52, 0.65] difference between the 20% of individuals with the most alleles and the 20% with the fewest alleles). There was no evidence that carriers of greater numbers of triglyceride-raising alleles were at increased risk of type 2 diabetes (per weighted allele odds ratio (OR) 0.99 [95% CI: 0.97, 1.01]; P = 0.26). In non-diabetic individuals, there was no evidence that carriers of greater numbers of triglyceride-raising alleles had increased fasting insulin levels (0.00 SD per weighted allele [95% CI: −0.01, 0.02]; P = 0.72) or increased fasting glucose levels (0.00 SD per weighted allele [95% CI: −0.01, 0.01]; P = 0.88). Instrumental variable analyses confirmed that genetically raised circulating triglyceride levels were not associated with increased diabetes risk, fasting glucose or fasting insulin, and, for diabetes, showed a trend towards a protective association (OR per 1 SD increase in log10-triglycerides: 0.61 [95% CI: 0.45, 0.83]; P = 0.002).
Conclusion
Genetically raised circulating triglyceride levels do not increase the risk of type 2 diabetes, or raise fasting glucose or fasting insulin levels in non-diabetic individuals. One explanation for our results is that raised circulating triglycerides are predominantly secondary to the diabetes disease process rather than causal.
doi:10.2337/db10-1317
PMCID: PMC3046819  PMID: 21282362
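For readers unfamiliar with instrumental variable estimates of the kind reported in entry 8, a minimal sketch of the simplest estimator, the Wald ratio, is shown below. The abstract does not say which estimator the authors used, so this is an illustration of the general technique rather than their method. The standard error uses a first-order approximation that ignores uncertainty in the variant-exposure effect, and all input values are hypothetical.

```python
import math


def wald_ratio(beta_outcome, se_outcome, beta_exposure):
    """Causal effect of exposure on outcome per unit of exposure.

    beta_outcome: variant effect on the outcome (e.g. log odds of disease)
    beta_exposure: variant effect on the exposure (e.g. SD of triglycerides)
    The SE is a first-order approximation (assumption: beta_exposure is
    treated as known without error).
    """
    if beta_exposure == 0:
        raise ValueError("variant must be associated with the exposure")
    estimate = beta_outcome / beta_exposure
    se = se_outcome / abs(beta_exposure)
    return estimate, se


def odds_ratio_with_ci(log_or, se, z=1.96):
    """Convert a log odds ratio and its SE to an OR with a 95% interval."""
    return (math.exp(log_or),
            math.exp(log_or - z * se),
            math.exp(log_or + z * se))


# Hypothetical inputs: per-allele log OR of -0.02 (SE 0.01) for disease and a
# per-allele exposure effect of 0.1 SD.
estimate, se = wald_ratio(-0.02, 0.01, 0.1)
print(odds_ratio_with_ci(estimate, se))
```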
9.  Circulating β-carotene levels and Type 2 diabetes: Cause or effect? 
Diabetologia  2009;52(10):2117-2121.
Aims and Hypothesis
Circulating β-carotene levels are inversely associated with type 2 diabetes risk, but the causal direction of this association is not certain. In this study we used a Mendelian Randomization approach to provide evidence for or against the causal role of the anti-oxidant vitamin β-carotene in type 2 diabetes.
Methods
We used a common polymorphism (rs6564851) near the β-carotene 15,15'-Monooxygenase 1 (BCMO1) gene that is strongly associated with circulating β-carotene levels (P = 2 × 10⁻²⁴); each G allele is associated with a 0.27 standard deviation increase in levels. We used data from the InCHIANTI study and the ULSAM study to estimate the association between β-carotene levels and type 2 diabetes. We next used a triangulation approach to estimate the expected effect of rs6564851 on type 2 diabetes risk, and compared this to the observed effect using data from 4549 type 2 diabetes cases and 5579 controls from the DIAGRAM consortium.
Results
A 0.27 standard deviation increase in β-carotene levels is associated with an odds ratio of 0.90 (0.86–0.95) for type 2 diabetes in the InCHIANTI study. This association is similar to that of the ULSAM study, OR 0.90 (0.84–0.97). In contrast, there was no association between rs6564851 and type 2 diabetes (OR 0.98 (0.93–1.04), P = 0.58), and this effect size was smaller than that expected given the known associations between rs6564851 and β-carotene levels and the associations between β-carotene levels and type 2 diabetes.
Conclusion
Our Mendelian Randomization studies are in keeping with randomized controlled trials that suggest β-carotene is not causally protective against type 2 diabetes.
doi:10.1007/s00125-009-1475-8
PMCID: PMC2746424  PMID: 19662379
type 2 diabetes; β-carotene; mendelian randomization
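The triangulation step in entry 9 can be sketched with the figures quoted in the abstract. If a 0.27 SD rise in β-carotene corresponds to an odds ratio of 0.90, and each G allele raises β-carotene by 0.27 SD, the expected per-allele odds ratio under a causal model is 0.90. The function name is hypothetical, and the log-linear scaling is an assumption of this sketch.

```python
import math


def expected_or_per_allele(or_per_unit, unit_sd, allele_effect_sd):
    """Expected disease OR per allele if the exposure were causal.

    or_per_unit: observational OR for a unit_sd increase in exposure
    allele_effect_sd: per-allele change in exposure, in SD
    ASSUMPTION: log odds scale linearly with the exposure.
    """
    return math.exp(math.log(or_per_unit) * allele_effect_sd / unit_sd)


# Values taken from the abstract (InCHIANTI estimate and DIAGRAM result).
expected = expected_or_per_allele(0.90, 0.27, 0.27)
observed_or, observed_ci = 0.98, (0.93, 1.04)
expected_inside_ci = observed_ci[0] <= expected <= observed_ci[1]
print(round(expected, 2), observed_or, expected_inside_ci)
```

The expected value falls outside the observed interval, which matches the abstract's statement that the observed effect was smaller than expected.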
10.  Exploring the unknown: assumptions about allelic architecture and strategies for susceptibility variant discovery 
Genome Medicine  2009;1(7):66.
Identification of common-variant associations for many common disorders has been highly effective, but the loci detected so far typically explain only a small proportion of the genetic predisposition to disease. Extending explained genetic variance is one of the major near-term goals of human genetic research. Next-generation sequencing technologies offer great promise, but optimal strategies for their deployment remain uncertain, not least because we lack a clear view of the characteristics of the variants being sought. Here, I discuss what can and cannot be inferred about complex trait disease architecture from the information currently available and review the implications for future research strategies.
doi:10.1186/gm66
PMCID: PMC2717392  PMID: 19591663
11.  Genome-wide association studies in type 2 diabetes 
Current diabetes reports  2009;9(2):164-171.
Despite numerous candidate gene and linkage studies, the field of type 2 diabetes (T2D) genetics had until recently succeeded in identifying few genuine disease-susceptibility loci. The advent of genome-wide association (GWA) scans has transformed the situation, leading to an expansion in the number of established, robustly replicating T2D loci to almost 20. These novel findings offer unique insights into the pathogenesis of T2D and in the main point towards the etiological importance of disorders of beta-cell development and function. All associated variants have common allele frequencies in the discovery populations, and exert modest to small effects on the risk of disease, characteristics which limit their prognostic and diagnostic potential. However, ongoing studies focussing on the role of copy number variation and targeting low frequency polymorphisms should identify additional T2D-susceptibility loci, some of which may have larger effect sizes and offer better individual prediction of disease risk.
PMCID: PMC2694564  PMID: 19323962
12.  A System for Information Management in BioMedical Studies—SIMBioMS 
Bioinformatics  2009;25(20):2768-2769.
Summary: SIMBioMS is a web-based open source software system for managing data and information in biomedical studies. It provides a solution for the collection, storage, management and retrieval of information about research subjects and biomedical samples, as well as experimental data obtained using a range of high-throughput technologies, including gene expression, genotyping, proteomics and metabonomics. The system can easily be customized and has proven to be successful in several large-scale multi-site collaborative projects. It is compatible with emerging functional genomics data standards and provides data import and export in accepted standard formats. Protocols for transferring data to durable archives at the European Bioinformatics Institute have been implemented.
Availability: The source code, documentation and initialization scripts are available at http://simbioms.org.
Contact: support@simbioms.org; mariak@ebi.ac.uk
doi:10.1093/bioinformatics/btp420
PMCID: PMC2759553  PMID: 19633095
13.  Common variants in WFS1 confer risk of type 2 diabetes 
Nature genetics  2007;39(8):951-953.
We studied genes involved in pancreatic β cell function and survival, identifying associations between SNPs in WFS1 and diabetes risk in UK populations that we replicated in an Ashkenazi population and in additional UK studies. In a pooled analysis comprising 9,533 cases and 11,389 controls, SNPs in WFS1 were strongly associated with diabetes risk. Rare mutations in WFS1 cause Wolfram syndrome; using a gene-centric approach, we show that variation in WFS1 also predisposes to common type 2 diabetes.
doi:10.1038/ng2067
PMCID: PMC2672152  PMID: 17603484
14.  Exploring the Developmental Overnutrition Hypothesis Using Parental–Offspring Associations and FTO as an Instrumental Variable 
PLoS Medicine  2008;5(3):e33.
Background
The developmental overnutrition hypothesis suggests that greater maternal obesity during pregnancy results in increased offspring adiposity in later life. If true, this would result in the obesity epidemic progressing across generations irrespective of environmental or genetic changes. It is therefore important to robustly test this hypothesis.
Methods and Findings
We explored this hypothesis by comparing the associations of maternal and paternal pre-pregnancy body mass index (BMI) with offspring dual energy X-ray absorptiometry (DXA)–determined fat mass measured at 9 to 11 y (4,091 parent–offspring trios) and by using maternal FTO genotype, controlling for offspring FTO genotype, as an instrument for maternal adiposity. Both maternal and paternal BMI were positively associated with offspring fat mass, but the maternal association effect size was larger than that in the paternal association in all models: mean difference in offspring sex- and age-standardised fat mass z-score per 1 standard deviation BMI 0.24 (95% confidence interval [CI]: 0.22 to 0.26) for maternal BMI versus 0.13 (95% CI: 0.11, 0.15) for paternal BMI; p-value for difference in effect < 0.001. The stronger maternal association was robust to sensitivity analyses assuming levels of non-paternity up to 20%. When maternal FTO, controlling for offspring FTO, was used as an instrument for the effect of maternal adiposity, the mean difference in offspring fat mass z-score per 1 standard deviation maternal BMI was −0.08 (95% CI: −0.56 to 0.41), with no strong statistical evidence that this differed from the observational ordinary least squares analyses (p = 0.17).
Conclusions
Neither our parental comparisons nor the use of FTO genotype as an instrumental variable, suggest that greater maternal BMI during offspring development has a marked effect on offspring fat mass at age 9–11 y. Developmental overnutrition related to greater maternal BMI is unlikely to have driven the recent obesity epidemic.
Using parental-offspring associations and the FTO gene as an instrumental variable for maternal adiposity, Debbie Lawlor and colleagues found that greater maternal BMI during offspring development does not appear to have a marked effect on offspring fat mass at age 9-11.
Editors' Summary
Background.
Since the 1970s, the proportion of children and adults who are overweight or obese (people who have an unhealthy amount of body fat) has increased sharply in many countries. In the US, 1 in 3 adults is now obese; in the mid-1970s it was only 1 in 7. Similarly, the proportion of overweight children has risen from 1 in 20 to 1 in 5. An adult is considered to be overweight if their body mass index (BMI)—their weight in kilograms divided by their height in meters squared—is between 25 and 30, and obese if it is more than 30. For children, the healthy BMI depends on their age and gender. Compared to people with a healthy weight (a BMI between 18.5 and 25), overweight or obese individuals have an increased lifetime risk of developing diabetes and other adverse health conditions, sometimes becoming ill while they are still young. People become unhealthily fat when they consume food and drink that contains more energy than they need for their daily activities. It should, therefore, be possible to avoid becoming obese by having a healthy diet and exercising regularly.
Why Was This Study Done?
Some researchers think that “developmental overnutrition” may have caused the recent increase in waistline measurements. In other words, if a mother is overweight during pregnancy, high sugar and fat levels in her body might permanently affect her growing baby's appetite control and metabolism, and so her offspring might be at risk of becoming obese in later life. If this hypothesis is true, each generation will tend to be fatter than the previous one and it will be very hard to halt the obesity epidemic simply by encouraging people to eat less and exercise more. In this study, the researchers have used two approaches to test the developmental overnutrition hypothesis. First, they have asked whether offspring fat mass is more strongly related to maternal BMI than to paternal BMI; it should be if the hypothesis is true. Second, they have asked whether a genetic indicator of maternal fatness—the “A” variant of the FTO gene—is related to offspring fat mass. A statistical association between maternal FTO genotype (genetic make-up) and offspring fat mass would support the developmental overnutrition hypothesis.
What Did the Researchers Do and Find?
In 1991–1992, the Avon Longitudinal Study of Parents and Children (ALSPAC) enrolled about 14,000 pregnant women and now examines their offspring at regular intervals. The researchers first used statistical methods to look for associations between the self-reported prepregnancy BMI of the parents of about 4,000 children and the children's fat mass at ages 9–11 years measured using a technique called dual energy X-ray absorptiometry. Both maternal and paternal BMI were positively associated with offspring fat mass (that is, fatter parents had fatter children) but the effect of maternal BMI was greater than the effect of paternal BMI. When the researchers examined maternal FTO genotypes and offspring fat mass (after allowing for the offspring's FTO genotype, which would directly affect their fat mass), there was no statistical evidence to suggest that differences in offspring fat mass were related to the maternal FTO genotype.
What Do These Findings Mean?
Although the findings from the first approach provide some support for the developmental overnutrition hypothesis, the effect of maternal BMI on offspring fat mass is too weak to explain the recent obesity epidemic. Developmental overnutrition could, however, be responsible for the much slower increase in obesity that began a century ago. The findings from the second approach provide no support for the developmental overnutrition hypothesis, although these results have wide error margins and need confirming in a larger study. The researchers also note that the effects of developmental overnutrition on offspring fat mass, although weak at age 9–11, might become more important at later ages. Nevertheless, for now, it seems unlikely that developmental overnutrition has been a major driver of the recent obesity epidemic. Interventions that aim to improve people's diet and to increase their physical activity levels could therefore slow or even halt the epidemic.
Additional Information.
Please access these Web sites via the online version of this summary at http://dx.doi.org/10.1371/journal.pmed.0050033.
See a related PLoS Medicine Perspective article
The MedlinePlus encyclopedia has a page on obesity (in English and Spanish)
The US Centers for Disease Control and Prevention provides information on all aspects of obesity (in English and Spanish)
The UK National Health Service's health Web site (NHS Direct) provides information about obesity
The International Obesity Taskforce provides information about preventing obesity and on childhood obesity
The UK Foods Standards Agency, the United States Department of Agriculture, and Shaping America's Health all provide useful advice about healthy eating for adults and children
The ALSPAC Web site provides information about the Avon Longitudinal Study of Parents and Children and its results so far
doi:10.1371/journal.pmed.0050033
PMCID: PMC2265763  PMID: 18336062
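The adult BMI definition and cut-offs given in the Editors' Summary of entry 14 translate directly into a short calculation. The handling of the exact boundary values (25 and 30) and the label for values under 18.5 are assumptions of this sketch, since the summary does not specify them. The cut-offs apply to adults only; the summary notes that healthy BMI in children depends on age and gender.

```python
def body_mass_index(weight_kg, height_m):
    """BMI: weight in kilograms divided by height in meters squared."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


def adult_bmi_category(bmi):
    """Classify an adult BMI using the cut-offs quoted in the summary.

    ASSUMPTION: a BMI of exactly 25 counts as overweight and exactly 30 as
    overweight rather than obese ("more than 30" in the summary).
    """
    if bmi < 18.5:
        return "below healthy range"
    if bmi < 25:
        return "healthy"
    if bmi <= 30:
        return "overweight"
    return "obese"


# Hypothetical adult: 70 kg, 1.75 m tall.
bmi = body_mass_index(70, 1.75)
print(round(bmi, 1), adult_bmi_category(bmi))
```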
15.  Combining Information from Common Type 2 Diabetes Risk Polymorphisms Improves Disease Prediction 
PLoS Medicine  2006;3(10):e374.
Background
A limited number of studies have assessed the risk of common diseases when combining information from several predisposing polymorphisms. In most cases, individual polymorphisms only moderately increase risk (~20%), and they are thought to be unhelpful in assessing individuals' risk clinically. The value of analyzing multiple alleles simultaneously is not well studied. This is often because, for any given disease, very few common risk alleles have been confirmed.
Methods and Findings
Three common variants (Lys23 of KCNJ11, Pro12 of PPARG, and the T allele at rs7903146 of TCF7L2) have been shown to predispose to type 2 diabetes mellitus across many large studies. Risk allele frequencies ranged from 0.30 to 0.88 in controls. To assess the combined effect of multiple susceptibility alleles, we genotyped these variants in a large case-control study (3,668 controls versus 2,409 cases). Individual allele odds ratios (ORs) ranged from 1.14 (95% confidence interval [CI], 1.05 to 1.23) to 1.48 (95% CI, 1.36 to 1.60). We found no evidence of gene-gene interaction, and the risks of multiple alleles were consistent with a multiplicative model. Each additional risk allele increased the odds of type 2 diabetes by 1.28 (95% CI, 1.21 to 1.35) times. Participants with all six risk alleles had an OR of 5.71 (95% CI, 1.15 to 28.3) compared to those with no risk alleles. The 8.1% of participants that were double-homozygous for the risk alleles at TCF7L2 and Pro12Ala had an OR of 3.16 (95% CI, 2.22 to 4.50), compared to 4.3% with no TCF7L2 risk alleles and either no or one Glu23Lys or Pro12Ala risk alleles.
Conclusions
Combining information from several known common risk polymorphisms allows the identification of population subgroups with markedly differing risks of developing type 2 diabetes compared to those obtained using single polymorphisms. This approach may have a role in future preventative measures for common, polygenic diseases.
Combining information from several known common risk polymorphisms allows the identification of subgroups of the population with markedly differing risks of developing type 2 diabetes.
Editors' Summary
Background.
Diabetes is an important and increasingly common global health problem; the World Health Organization has estimated that about 170 million people currently have diabetes worldwide. One particular form, type 2 diabetes, develops when cells in the body become unable to respond to a hormone called insulin. Insulin is normally released by the pancreas and controls the ability of body cells to take in glucose (sugar). Therefore, when cells become insensitive to insulin as in people with type 2 diabetes, glucose levels in the body are not well controlled and may become dangerously high in the blood. These high levels can have long-term damaging effects on various organs in the body, particularly the eyes, nerves, heart, and kidneys. There are many different factors that affect whether someone is likely to develop type 2 diabetes. These factors can be broadly grouped into two categories: environmental and genetic. Environmental factors such as obesity, a diet high in sugar, and a sedentary lifestyle are all risk factors for developing type 2 diabetes in later life. Genetically, a number of variants in many different genes may affect the risk of developing the disease. Generally, these gene variants are common in human populations but each gene variant only mildly increases the risk that a person possessing it will get type 2 diabetes.
Why Was This Study Done?
The investigators performing this study wanted to understand how different gene variants combine to affect an individual's risk of getting type 2 diabetes. That is, if a person carries many different variants, does their overall risk increase a lot or only a little?
What Did the Researchers Do and Find?
First, the researchers surveyed the published reports to identify those gene variants for which there was strong evidence of an association with type 2 diabetes. They found mutations in three genes that had been shown reproducibly to be associated with type 2 diabetes in different studies: PPARG (whose product is involved in regulation of fat tissue), KCNJ11 (whose product is involved in insulin production), and TCF7L2 (whose product is thought to be involved in controlling sugar levels). Then, they compared two groups of white people in the UK: 2,409 people with type 2 diabetes (“cases”), and 3,668 people from the general population (“controls”). The researchers compared the two groups to see which individuals possessed which gene variants, and did statistical testing to work out to what extent having particular combinations of the gene variants affected an individual's chance of being a “case” versus a “control.” Their results showed that in the groups studied, having an ever-increasing number of gene variants increased the risk of developing diabetes. The risk that someone with none of the gene variants would develop type 2 diabetes was about 2%, while the chance for someone with all gene variants was about 10%.
What Do These Findings Mean?
These results show that the risk of developing type 2 diabetes is greater if an individual possesses all of the gene variants that were examined in this study. The analysis also suggests that using information on all three variants, rather than just one, is likely to be more accurate in predicting future risk. How this genetic information should be used alongside other well-known preventative measures such as altered lifestyle requires further study.
Additional Information.
Please access these Web sites via the online version of this summary at http://dx.doi.org/10.1371/journal.pmed.0030374.
NHS Direct patient information on diabetes
National Diabetes Information Clearinghouse information on type 2 diabetes
World Health Organization Diabetes Programme
Centers for Disease Control Diabetes Public Health Resource
doi:10.1371/journal.pmed.0030374
PMCID: PMC1584415  PMID: 17020404
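The multiplicative model mentioned in entry 15 can be checked with simple arithmetic: if each additional risk allele multiplies the odds by 1.28, carrying all six risk alleles (three variants, two copies each) predicts an odds ratio of 1.28 to the sixth power relative to carrying none. The sketch below uses only figures quoted in the abstract; the function name is hypothetical.

```python
def multiplicative_or(per_allele_or, n_risk_alleles):
    """Predicted OR versus zero risk alleles under a multiplicative model."""
    return per_allele_or ** n_risk_alleles


# Per-allele OR and the observed six-allele estimate from the abstract.
predicted = multiplicative_or(1.28, 6)
observed_or, observed_ci = 5.71, (1.15, 28.3)
print(round(predicted, 2), observed_ci[0] <= predicted <= observed_ci[1])
```

The predicted value of roughly 4.4 lies well inside the wide observed interval, which is consistent with the abstract's statement that the combined risks fit a multiplicative model.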
16.  Will the real disease gene please stand up? 
BMC Genetics  2005;6(Suppl 1):S66.
A common dilemma arising in linkage studies of complex genetic diseases is the selection of positive signals, their follow-up with association studies and discrimination between true and false positive results. Several strategies for overcoming these issues have been devised. Using the Genetic Analysis Workshop 14 simulated dataset, we aimed to apply different analytical approaches and evaluate their performance in discerning real associations. We considered a) haplotype analyses, b) different methods adjusting for multiple testing, c) replication in a second dataset, and d) exhaustive genotyping of all markers in a sufficiently powered, large sample group. We found that haplotype-based analyses did not substantially improve over single-point analysis, although this may reflect the low levels of linkage disequilibrium simulated in the datasets provided. Multiple testing correction methods were in general found to be over-conservative. Replication of nominally positive results in a second dataset appears to be less stringent, resulting in the follow-up of false positives. Performing a comprehensive assay of all markers in a large, well-powered dataset appears to be the most effective strategy for complex disease gene identification.
doi:10.1186/1471-2156-6-S1-S66
PMCID: PMC1866716  PMID: 16451679
17.  New methods for finding disease-susceptibility genes: impact and potential 
Genome Biology  2003;4(10):119.
Improved techniques for defining disease-gene location and evaluating the biological candidacy of regional transcripts will hasten disease-gene discovery.
PMCID: PMC328443  PMID: 14519189
18.  Bayesian refinement of association signals for 14 loci in 3 common diseases 
Nature genetics  2012;44(12):1294-1301.
To further investigate susceptibility loci identified by genome-wide association studies, we genotyped 5,500 SNPs across 14 associated regions in 8,000 samples from a control group and 3 diseases: type 2 diabetes (T2D), coronary artery disease (CAD) and Graves’ disease. We defined, using Bayes' theorem, credible sets of SNPs that were 95% likely, based on posterior probability, to contain the causal disease-associated SNPs. In 3 of the 14 regions, TCF7L2 (T2D), CTLA4 (Graves’ disease) and CDKN2A-CDKN2B (T2D), much of the posterior probability rested on a single SNP, and, in 4 other regions (CDKN2A-CDKN2B (CAD) and CDKAL1, FTO and HHEX (T2D)), the 95% sets were small, thereby excluding most SNPs as potentially causal. Very few SNPs in our credible sets had annotated functions, illustrating the limitations in understanding the mechanisms underlying susceptibility to common diseases. Our results also show the value of more detailed mapping to target sequences for functional studies.
doi:10.1038/ng.2435
PMCID: PMC3791416  PMID: 23104008
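As a rough illustration of the credible-set idea in the abstract above, the following Python sketch builds a 95% credible set from per-SNP Bayes factors. It assumes exactly one causal SNP in the region and an equal prior probability for every SNP. The function name, the SNP labels and the Bayes factor values are hypothetical and are not taken from the study.

```python
def credible_set(bayes_factors, coverage=0.95):
    """Return (snps, posteriors) for the smallest set of SNPs whose
    posterior probabilities sum to at least `coverage`.

    Assumes a single causal SNP per region and a flat prior, so each
    posterior is the SNP's Bayes factor divided by the regional total.
    """
    total = sum(bayes_factors.values())
    posteriors = {snp: bf / total for snp, bf in bayes_factors.items()}

    # Rank SNPs from most to least probable, then accumulate.
    ranked = sorted(posteriors, key=posteriors.get, reverse=True)
    chosen, cumulative = [], 0.0
    for snp in ranked:
        chosen.append(snp)
        cumulative += posteriors[snp]
        if cumulative >= coverage:
            break
    return chosen, posteriors


# Hypothetical Bayes factors for four SNPs in one region.
example_bfs = {"snp_a": 900.0, "snp_b": 60.0, "snp_c": 30.0, "snp_d": 10.0}
selected, post = credible_set(example_bfs)
print(selected)
```

In this toy region the top SNP carries most of the posterior probability, so the set stays small; a region with many similarly supported SNPs would yield a much larger set.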
19.  A Powerful Approach to Sub-Phenotype Analysis in Population-Based Genetic Association Studies 
Genetic Epidemiology  2009;34(4):335-343.
The ultimate goal of genome-wide association (GWA) studies is to identify genetic variants contributing effects to complex phenotypes in order to improve our understanding of the biological architecture underlying the trait. One approach to allow us to meet this challenge is to consider more refined sub-phenotypes of disease, defined by pattern of symptoms, for example, which may be physiologically distinct, and thus may have different underlying genetic causes. The disadvantage of sub-phenotype analysis is that large disease cohorts are sub-divided into smaller case categories, thus reducing power to detect association. To address this issue, we have developed a novel test of association within a multinomial regression modeling framework, allowing for heterogeneity of genetic effects between sub-phenotypes. The modeling framework is extremely flexible, and can be generalized to any number of distinct sub-phenotypes. Simulations demonstrate the power of the multinomial regression-based analysis over existing methods when genetic effects differ between sub-phenotypes, with minimal loss of power when these effects are homogenous for the unified phenotype. Application of the multinomial regression analysis to a genome-wide association study of type 2 diabetes, with cases categorized according to body mass index, highlights previously recognized differential mechanisms underlying obese and non-obese forms of the disease, and provides evidence of a potential novel association that warrants follow-up in independent replication cohorts.
doi:10.1002/gepi.20486
PMCID: PMC2964510  PMID: 20039379
multinomial regression; sub-phenotype analysis; genome-wide association study; type 2 diabetes; obesity
20.  Rapid Testing of Gene-Gene Interactions in Genome-Wide Association Studies of Binary and Quantitative Phenotypes 
Genetic Epidemiology  2011;35(8):800-808.
Genome-wide association (GWA) studies have been extremely successful in identifying novel loci contributing effects to a wide range of complex human traits. However, despite this success, the joint marginal effects of these loci account for only a small proportion of the heritability of these traits. Interactions between variants in different loci are not typically modelled in traditional GWA analysis, but may account for some of the missing heritability in humans, as they do in other model organisms. One of the key challenges in performing gene-gene interaction studies is the computational burden of the analysis. We propose a two-stage interaction analysis strategy to address this challenge in the context of both quantitative traits and dichotomous phenotypes. We have performed simulations to demonstrate only a negligible loss in power of this two-stage strategy, while minimizing the computational burden. Application of this interaction strategy to GWA studies of T2D and obesity highlights potential novel signals of association, which warrant follow-up in larger cohorts.
doi:10.1002/gepi.20629
PMCID: PMC3410530  PMID: 21948692
genome-wide association study; gene-gene interaction; computational efficiency
21.  Genome-wide association study identifies multiple loci influencing human serum metabolite levels 
Nature genetics  2012;44(3):269-276.
Nuclear magnetic resonance assays allow for measurement of a wide range of metabolic phenotypes. We report here the results of a GWAS on 8,330 Finnish individuals genotyped and imputed at 7.7 million SNPs for a range of 216 serum metabolic phenotypes assessed by NMR of serum samples. We identified significant associations (P < 2.31 × 10^-10) at 31 loci, including 11 for which there have not been previous reports of associations to a metabolic trait or disorder. Analyses of Finnish twin pairs suggested that the metabolic measures reported here show higher heritability than comparable conventional metabolic phenotypes. In accordance with our expectations, SNPs at the 31 loci associated with individual metabolites account for a greater proportion of the genetic component of trait variance (up to 40%) than is typically observed for conventional serum metabolic phenotypes. The identification of such associations may provide substantial insight into cardiometabolic disorders.
doi:10.1038/ng.1073
PMCID: PMC3605033  PMID: 22286219
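The significance threshold quoted in the abstract above is numerically consistent with a Bonferroni correction of the conventional genome-wide threshold (5 × 10^-8) for the 216 phenotypes tested. The abstract does not spell out this derivation, so the short Python check below should be read as an assumption about how the number arises, not as the authors' stated procedure.

```python
# Conventional genome-wide significance level (assumed starting point).
GENOME_WIDE_ALPHA = 5e-8
# Number of serum metabolic phenotypes reported in the abstract.
N_PHENOTYPES = 216

# Bonferroni correction: divide the per-trait threshold by the number of traits.
threshold = GENOME_WIDE_ALPHA / N_PHENOTYPES
print(f"{threshold:.2e}")  # prints 2.31e-10
```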
22.  Genomic inflation factors under polygenic inheritance 
Population structure, including population stratification and cryptic relatedness, can cause spurious associations in genome-wide association studies (GWAS). Usually, the scaled median or mean test statistic for association calculated from multiple single-nucleotide polymorphisms across the genome is used to assess such effects, and 'genomic control' can be applied subsequently to adjust test statistics at individual loci by a genomic inflation factor. Published GWAS have clearly shown that there are many loci underlying genetic variation for a wide range of complex diseases and traits, implying that a substantial proportion of the genome should show inflation of the test statistic. Here, we show by theory, simulation and analysis of data that in the absence of population structure and other technical artefacts, but in the presence of polygenic inheritance, substantial genomic inflation is expected. Its magnitude depends on sample size, heritability, linkage disequilibrium structure and the number of causal variants. Our predictions are consistent with empirical observations on height in independent samples of ∼4000 and ∼133 000 individuals.
doi:10.1038/ejhg.2011.39
PMCID: PMC3137506  PMID: 21407268
genome-wide association study; genomic inflation factor; polygenic inheritance
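The median-based genomic inflation factor mentioned in the abstract above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the test statistics are 1-degree-of-freedom chi-square values, the data are simulated here and not taken from any study, and the function names are hypothetical. Flooring the factor at 1 before adjusting is a common convention and is an assumption of this sketch.

```python
import random
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 degree of freedom (about 0.455),
# computed as the square of the 75th percentile of a standard normal.
CHI2_1DF_MEDIAN = NormalDist().inv_cdf(0.75) ** 2


def genomic_inflation_factor(chi2_stats):
    """Observed median test statistic divided by its expectation under the null."""
    return median(chi2_stats) / CHI2_1DF_MEDIAN


def genomic_control(chi2_stats):
    """Divide every statistic by the inflation factor (floored at 1)."""
    lam = max(genomic_inflation_factor(chi2_stats), 1.0)
    return [x / lam for x in chi2_stats]


# Simulated null statistics: squares of standard normal draws.
rng = random.Random(0)
null_stats = [rng.gauss(0.0, 1.0) ** 2 for _ in range(20000)]
lam_null = genomic_inflation_factor(null_stats)

# Uniformly inflating every statistic by 10% inflates the factor by 10%.
inflated_stats = [1.1 * x for x in null_stats]
lam_inflated = genomic_inflation_factor(inflated_stats)
```

As the abstract argues, a factor above 1 does not by itself indicate population structure, because polygenic inheritance alone is expected to produce inflation.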
23.  Human metabolic profiles are stably controlled by genetic and environmental variation 
A comprehensive variation map of the human metabolome identifies genetic and stable-environmental sources as major drivers of metabolite concentrations. The data suggest that sample sizes of a few thousand are sufficient to detect metabolite biomarkers predictive of disease.
We designed a longitudinal twin study to characterize the genetic, stable-environmental, and longitudinally fluctuating influences on metabolite concentrations in two human biofluids (urine and plasma), focusing specifically on the representative subset of metabolites detectable by 1H nuclear magnetic resonance (1H NMR) spectroscopy. We identified widespread genetic and stable-environmental influences on the urine and plasma metabolomes, with 30% and 42%, respectively, attributable on average to familial sources, and 47% and 60% attributable to longitudinally stable sources. Ten of the metabolites annotated in the study are estimated to have >60% familial contribution to their variation in concentration. Our findings have implications for the design and interpretation of 1H NMR-based molecular epidemiology studies. On the basis of the stable component of variation quantified in the current paper, we specified a model of disease association under which we inferred that sample sizes of a few thousand should be sufficient to detect disease-predictive metabolite biomarkers.
Metabolites are small molecules involved in biochemical processes in living systems. Their concentration in biofluids, such as urine and plasma, can offer insights into the functional status of biological pathways within an organism, and reflect input from multiple levels of biological organization (genetic, epigenetic, transcriptomic, and proteomic) as well as from environmental and lifestyle factors. Metabolite levels have the potential to indicate a broad variety of deviations from the 'normal' physiological state, such as those that accompany a disease, or an increased susceptibility to disease. A number of recent studies have demonstrated that metabolite concentrations can be used to diagnose disease states accurately. A more ambitious goal is to identify metabolite biomarkers that are predictive of future disease onset, providing the possibility of intervention in susceptible individuals.
If an extreme concentration of a metabolite is to serve as an indicator of disease status, it is usually important to know the distribution of metabolite levels among healthy individuals. It is also useful to characterize the sources of that observed variation in the healthy population. A proportion of that variation—the heritable component—is attributable to genetic differences between individuals, potentially at many genetic loci. An effective, molecular indicator of a heritable, complex disease is likely to have a substantive heritable component. Non-heritable biological variation in metabolite concentrations can arise from a variety of environmental influences, such as dietary intake, lifestyle choices, general physical condition, composition of gut microflora, and use of medication. Variation across a population in stable-environmental influences leads to long-term differences between individuals in their baseline metabolite levels. Dynamic environmental pressures lead to short-term fluctuations within an individual about their baseline level. A metabolite whose concentration changes substantially in response to short-term pressures is relatively unlikely to offer long-term prediction of disease. In summary, the potential suitability of a metabolite to predict disease is reflected by the relative contributions of heritable and stable/unstable-environmental factors to its variation in concentration across the healthy population.
Studies involving twins are an established technique for quantifying the heritable component of phenotypes in human populations. Monozygotic (MZ) twins share the same DNA genome-wide, while dizygotic (DZ) twins share approximately half their inherited DNA, as do ordinary siblings. By comparing the average extent of phenotypic concordance within MZ pairs to that within DZ pairs, it is possible to quantify the heritability of a trait, and also to quantify the familiality, which refers to the combination of heritable and common-environmental effects (i.e., environmental influences shared by twins in a pair). In addition to incorporating twins into the study design, it is useful to quantify the phenotype in some individuals at multiple time points. The longitudinal aspect of such a study allows environmental effects to be decomposed into those that affect the phenotype over the short term and those that exert stable influence.
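A classical way to turn the MZ/DZ comparison described above into numbers is Falconer's formula, which works from within-pair trait correlations. The Python sketch below only illustrates that textbook approximation; it is not the bespoke decomposition developed in this study, and the correlation values are invented for the example.

```python
def falconer(r_mz, r_dz):
    """Falconer-style estimates from twin correlations.

    r_mz, r_dz: within-pair trait correlations for MZ and DZ twins.
    Returns (heritability, common_environment, familiality).
    """
    # MZ pairs share all their DNA and DZ pairs about half, so doubling
    # the difference in correlations estimates the heritable share.
    heritability = 2.0 * (r_mz - r_dz)
    # Remaining within-pair similarity is attributed to the shared environment.
    common_env = 2.0 * r_dz - r_mz
    # Familiality combines the two and equals the MZ correlation.
    familiality = heritability + common_env
    return heritability, common_env, familiality


# Hypothetical correlations, chosen only for illustration.
h2, c2, fam = falconer(r_mz=0.6, r_dz=0.4)
print(round(h2, 3), round(c2, 3), round(fam, 3))
```

With these made-up inputs the estimates are roughly 0.4 for heritability, 0.2 for common environment and 0.6 for familiality.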
For the current study, urine and blood samples were collected from a cohort of MZ and DZ twins, with some twins donating samples on two occasions several months apart. Samples were analysed by 1H nuclear magnetic resonance (1H NMR) spectroscopy—an untargeted, discovery-driven technique for quantifying metabolite concentrations in biological samples. The application of 1H NMR to a biological sample creates a spectrum, made up of multiple peaks, with each peak's size quantitatively representing the concentration of its corresponding hydrogen-containing metabolite.
In each biological sample in our study, we extracted a full set of peaks, and thereby quantified the concentrations of all common plasma and urine metabolites detectable by 1H NMR. We developed bespoke statistical methods to decompose the observed concentration variation at each metabolite peak into that originating from familial, individual-environmental, and unstable-environmental sources.
We quantified the variability landscape across all common metabolite peaks in the urine and plasma 1H NMR metabolomes. We annotated a subset of peaks with a total of 65 metabolites; the variance decompositions for these are shown in Figure 1. Ten metabolites' concentrations were estimated to have familial contributions in excess of 60%. The average proportion of stable variation across all extracted metabolite peaks was estimated to be 47% in the urine samples and 60% in the plasma samples; the average estimated familiality was 30% for urine and 42% for plasma. These results comprise the first quantitative variation map of the 1H NMR metabolome. The identification and quantification of substantive widespread stability provides support for the use of these biofluids in molecular epidemiology studies. On the basis of our findings, we performed power calculations for a hypothetical study searching for predictive disease biomarkers among 1H NMR-detectable urine and plasma metabolites. Our calculations suggest that sample sizes of 2000–5000 should allow reliable identification of disease-predictive metabolite concentrations explaining 5–10% of disease risk, while greater sample sizes of 5000–20 000 would be required to identify metabolite concentrations explaining 1–2% of disease risk.
1H Nuclear Magnetic Resonance spectroscopy (1H NMR) is increasingly used to measure metabolite concentrations in sets of biological samples for top-down systems biology and molecular epidemiology. For such purposes, knowledge of the sources of human variation in metabolite concentrations is valuable, but currently sparse. We conducted and analysed a study to create such a resource. In our unique design, identical and non-identical twin pairs donated plasma and urine samples longitudinally. We acquired 1H NMR spectra on the samples, and statistically decomposed variation in metabolite concentration into familial (genetic and common-environmental), individual-environmental, and longitudinally unstable components. We estimate that stable variation, comprising familial and individual-environmental factors, accounts on average for 60% (plasma) and 47% (urine) of biological variation in 1H NMR-detectable metabolite concentrations. Clinically predictive metabolic variation is likely nested within this stable component, so our results have implications for the effective design of biomarker-discovery studies. We provide a power-calculation method which reveals that sample sizes of a few thousand should offer sufficient statistical precision to detect 1H NMR-based biomarkers quantifying predisposition to disease.
doi:10.1038/msb.2011.57
PMCID: PMC3202796  PMID: 21878913
biomarker; 1H nuclear magnetic resonance spectroscopy; metabolome-wide association study; top-down systems biology; variance decomposition
24.  A common variant of HMGA2 is associated with adult and childhood height in the general population 
Nature genetics  2007;39(10):1245-1250.
Human height is a classic, highly heritable quantitative trait. To begin to identify genetic variants influencing height, we examined genome-wide association data from 4,921 individuals. Common variants in the HMGA2 oncogene, exemplified by rs1042725, were associated with height (P = 4 × 10^-8). HMGA2 is also a strong biological candidate for height, as rare, severe mutations in this gene alter body size in mice and humans, so we tested rs1042725 in additional samples. We confirmed the association in 19,064 adults from four further studies (P = 3 × 10^-11, overall P = 4 × 10^-16, including the genome-wide association data). We also observed the association in children (P = 1 × 10^-6, N = 6,827) and a tall/short case-control study (P = 4 × 10^-6, N = 3,207). We estimate that rs1042725 explains ~0.3% of population variation in height (~0.4 cm increased adult height per C allele). There are few examples of common genetic variants reproducibly associated with human quantitative traits; these results represent, to our knowledge, the first consistently replicated association with adult and childhood height.
doi:10.1038/ng2121
PMCID: PMC3086278  PMID: 17767157
25.  Association of FTO variants with BMI and fat mass in the self-contained population of Sorbs in Germany 
The association between common variants in the FTO gene with weight, adiposity and body mass index (BMI) has now been widely replicated. Although the causal variant has yet to be identified, it most likely maps within a 47 kb region of intron 1 of FTO. We performed a genome-wide association study in the Sorbian population and evaluated the relationships between FTO variants and BMI and fat mass in this isolate of Slavonic origin resident in Germany. In a sample of 948 Sorbs, we could replicate the earlier reported associations of intron 1 SNPs with BMI (eg, P-value=0.003, β=0.02 for rs8050136). However, using genome-wide association data, we also detected a second independent signal mapping to a region in intron 2/3 about 40–60 kb away from the originally reported SNPs (eg, for rs17818902 association with BMI P-value=0.0006, β=−0.03 and with fat mass P-value=0.0018, β=−0.079). Both signals remain independently associated in the conditioned analyses. In conclusion, we extend the evidence that FTO variants are associated with BMI by putatively identifying a second susceptibility allele independent of that described earlier. Although further statistical analysis of these findings is hampered by the finite size of the Sorbian isolate, these findings should encourage other groups to seek alternative susceptibility variants within FTO (and other established susceptibility loci) using the opportunities afforded by analyses in populations with divergent mutational and/or demographic histories.
doi:10.1038/ejhg.2009.107
PMCID: PMC2987177  PMID: 19584900
FTO; BMI; Sorbs
