The promise of personalized genomics for common complex diseases depends, in part, on the ability to predict genetic risks on the basis of single nucleotide polymorphisms. We examined and compared the methods of three companies (23andMe, deCODEme, and Navigenics) that have offered direct-to-consumer personal genome testing.
We simulated genotype data for 100,000 individuals on the basis of published genotype frequencies and predicted disease risks using the methods of the companies. Predictive ability for six diseases was assessed by the AUC.
AUC values differed among the diseases and among the companies. The highest values of the AUC were observed for age-related macular degeneration, celiac disease, and Crohn disease. The largest difference among the companies was found for celiac disease: the AUC was 0.73 for 23andMe and 0.82 for deCODEme. Predicted risks differed substantially among the companies as a result of differences in the sets of single nucleotide polymorphisms selected, in the average population risks selected by the companies, and in the formulas used for the calculation of risks.
Future efforts to design predictive models for the genomics of common complex diseases may benefit from understanding the strengths and limitations of the predictive algorithms designed by these early companies.
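The simulation approach described in this abstract (genotypes drawn from published frequencies, a risk computed per individual, discrimination summarized by the AUC) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration: the allele frequencies, odds ratios, and baseline risk are hypothetical, the risk formula is a generic multiplicative (log-additive) odds model rather than the formula of any of the three companies, and the function names are invented.

```python
import math
import random


def simulate_population(n, snps, baseline_risk, rng):
    """Simulate n individuals; return (predicted_risks, disease_status).

    snps: list of (risk_allele_frequency, per_allele_odds_ratio) pairs.
    baseline_risk: assumed risk for a carrier of no risk alleles.
    All input values are hypothetical, not taken from any company.
    """
    base_log_odds = math.log(baseline_risk / (1 - baseline_risk))
    risks, status = [], []
    for _ in range(n):
        log_odds = base_log_odds
        for freq, odds_ratio in snps:
            # Hardy-Weinberg assumption: two independent allele draws per SNP.
            n_risk_alleles = (rng.random() < freq) + (rng.random() < freq)
            log_odds += n_risk_alleles * math.log(odds_ratio)
        risk = 1 / (1 + math.exp(-log_odds))
        risks.append(risk)
        # Disease status is drawn from the individual's own predicted risk.
        status.append(1 if rng.random() < risk else 0)
    return risks, status


def auc(scores, status):
    """AUC from the Mann-Whitney rank sum, using average ranks for ties."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    n_pos = sum(status)
    n_neg = len(status) - n_pos
    rank_sum = sum(r for r, s in zip(ranks, status) if s == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


# Hypothetical example: five SNPs, frequency 0.30, odds ratio 1.5 each.
rng = random.Random(42)
risks, status = simulate_population(5000, [(0.30, 1.5)] * 5, 0.05, rng)
print(f"simulated AUC: {auc(risks, status):.2f}")
```

Swapping in a different SNP set, baseline risk, or risk formula while holding the simulated genotypes fixed is one way to see how those three choices, named in the abstract as the sources of disagreement, change the predicted risks.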
The discrimination of a risk prediction model measures that model's ability to distinguish between subjects with and without events. The area under the receiver operating characteristic curve (AUC) is a popular measure of discrimination. However, the AUC has recently been criticized for its insensitivity in model comparisons in which the baseline model has performed well. Thus, 2 other measures have been proposed to capture improvement in discrimination for nested models: the integrated discrimination improvement and the continuous net reclassification improvement. In the present study, the authors use mathematical relations and numerical simulations to quantify the improvement in discrimination offered by candidate markers of different strengths as measured by their effect sizes. They demonstrate that the increase in the AUC depends on the strength of the baseline model, which is true to a lesser degree for the integrated discrimination improvement. On the other hand, the continuous net reclassification improvement depends only on the effect size of the candidate variable and its correlation with other predictors. These measures are illustrated using the Framingham model for incident atrial fibrillation. The authors conclude that the increase in the AUC, integrated discrimination improvement, and net reclassification improvement offer complementary information and thus recommend reporting all 3 alongside measures characterizing the performance of the final model.
area under curve; biomarkers; discrimination; risk assessment; risk factors
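For readers less familiar with the two measures named in the preceding abstract, the following is a minimal sketch of their standard definitions. The integrated discrimination improvement (IDI) is the gain in mean predicted risk among events minus the gain among non-events; the continuous net reclassification improvement (NRI) counts any upward or downward movement in predicted risk, without risk categories. The function names and the toy numbers are illustrative assumptions and are not taken from the Framingham analysis.

```python
def idi(p_old, p_new, events):
    """Integrated discrimination improvement for nested models.

    p_old, p_new: predicted risks from the baseline and extended models.
    events: 1 for subjects with the event, 0 otherwise.
    """
    def mean(values):
        return sum(values) / len(values)

    gain_events = mean([n - o for o, n, e in zip(p_old, p_new, events) if e])
    gain_nonevents = mean([n - o for o, n, e in zip(p_old, p_new, events) if not e])
    return gain_events - gain_nonevents


def continuous_nri(p_old, p_new, events):
    """Continuous (category-free) net reclassification improvement."""
    up_e = down_e = up_n = down_n = 0
    n_events = n_nonevents = 0
    for o, n, e in zip(p_old, p_new, events):
        if e:
            n_events += 1
            up_e += n > o
            down_e += n < o
        else:
            n_nonevents += 1
            up_n += n > o
            down_n += n < o
    # Events should move up, non-events should move down.
    return (up_e - down_e) / n_events + (down_n - up_n) / n_nonevents


# Toy data: two events followed by two non-events.
p_old = [0.2, 0.2, 0.2, 0.2]
p_new = [0.3, 0.4, 0.1, 0.2]
events = [1, 1, 0, 0]
print(round(idi(p_old, p_new, events), 3))             # 0.2
print(round(continuous_nri(p_old, p_new, events), 3))  # 1.5
```

In the toy data both events move up (contributing 1.0) and one of two non-events moves down (contributing 0.5), so the continuous NRI is 1.5 out of a maximum of 2.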
In recent years, developments in genomics technologies have led to the rise of commercial personal genome testing (PGT): broad genome-wide testing for multiple diseases simultaneously. While some commercial providers require physicians to order a personal genome test, others can be accessed directly. All providers advertise directly to consumers and offer genetic risk information about dozens of diseases in one single purchase. The quantity and the complexity of risk information pose challenges to adequate pre-test and post-test information provision and informed consent. There are currently no guidelines for what should constitute informed consent in PGT or how adequate informed consent can be achieved. In this paper, we propose a tiered-layered-staged model for informed consent. First, the proposed model is tiered as it offers choices between categories of diseases that are associated with distinct ethical, personal or societal issues. Second, the model distinguishes layers of information with a first layer offering minimal, indispensable information that is material to all consumers, and additional layers offering more detailed information made available upon request. Finally, the model stages informed consent as a process by feeding information to consumers in each subsequent stage of the process of undergoing a test, and by accommodating renewed consent for test result updates, resulting from the ongoing development of the science underlying PGT. A tiered-layered-staged model for informed consent with a focus on the consumer perspective can help overcome the ethical problems of information provision and informed consent in direct-to-consumer PGT.
personal genome testing; informed consent; ethical issues; complex diseases
The objective of this paper is to assess parental beliefs and intentions about genetic testing for their children in a multi-ethnic population with the aim of acquiring information to guide interventions for obesity prevention and management. A cross-sectional survey was conducted in parents of native Dutch children and children from a large minority population (Turks) selected from Youth Health Care registries. The age range of the children was 5–11 years. Parents with lower levels of education and parents of non-native children were more convinced that overweight has a genetic cause and their intentions to test the genetic predisposition of their child to overweight were firmer. A firmer intention to test the child was associated with the parents’ perceptions of their child’s susceptibility to being overweight, a positive attitude towards genetic testing, and anticipated regret at not having the child tested while at risk for overweight. Interaction effects were found in ethnic and socio-economic groups. Ethnicity and educational level play a role in parental beliefs about child overweight and genetic testing. Education programmes about obesity risk, genetic testing and the importance of behaviour change should be tailored to the cultural and behavioural factors relevant to ethnic and socio-economic target groups.
Genetics; Attitude; Health promotion; Obesity; Child
Hypertension is an important determinant of cardiovascular morbidity and mortality and has a substantial heritability, which is likely of polygenic origin. The aim of this study was to assess to what extent multiple common genetic variants contribute to blood pressure regulation in both adults and children, and to assess overlap in variants between different age groups, using genome-wide profiling. SNP sets were defined based on a meta-analysis of genome-wide association studies on systolic (SBP) and diastolic blood pressure (DBP) performed by the Cohort for Heart and Aging Research in Genome Epidemiology (CHARGE, n=29,136), using different P-value thresholds for selecting single nucleotide polymorphisms (SNPs). Subsequently, genetic risk scores for SBP and DBP were calculated in an independent adult population (n=2,072) and a child population (n=1,034). The explained variance of the genetic risk scores was evaluated using linear regression models, including sex, age and body mass index. Genetic risk scores that also included many non-genome-wide significant SNPs explained more of the variance than scores based only on very significant SNPs, in both adults and children. Genetic risk scores significantly explained up to 1.2% (P = 9.6 × 10⁻⁸) of the variance in adult SBP and 0.8% (P = 0.004) in children. For DBP, the variance explained was similar in adults and children (1.7% (P = 8.9 × 10⁻¹⁰) and 1.4% (P = 3.3 × 10⁻⁵), respectively). These findings suggest the presence of many genetic loci with small effects on blood pressure regulation both in adults and children, indicating also a (partly) common polygenic regulation of blood pressure throughout different periods of life.
genome-wide association; genome-wide profiling; genetic risk scores; blood pressure; hypertension
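The genome-wide profiling procedure in the preceding abstract (select SNPs by a P-value threshold in a discovery meta-analysis, sum weighted allele counts in an independent sample, then measure explained variance) can be sketched roughly as below. This is a simplified illustration under stated assumptions: the function names, weights, and P-values are hypothetical, and the explained variance is computed as the squared Pearson correlation from a single-predictor regression, whereas the study used linear regression models that also included sex, age and body mass index.

```python
def risk_score(dosages, weights, p_values, threshold):
    """Weighted genetic risk score for one individual.

    dosages: risk-allele counts (0, 1 or 2) per SNP.
    weights: per-allele effect estimates from the discovery study.
    p_values: discovery P-values; only SNPs with P below threshold are used.
    """
    return sum(w * d for d, w, p in zip(dosages, weights, p_values) if p < threshold)


def r_squared(x, y):
    """Proportion of variance in y explained by a linear fit on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


# Hypothetical discovery results for three SNPs.
weights = [0.5, -0.2, 1.0]
p_values = [1e-9, 0.03, 0.5]
person = [2, 1, 0]

# A strict threshold keeps one SNP; a liberal threshold keeps two.
print(risk_score(person, weights, p_values, 5e-8))
print(risk_score(person, weights, p_values, 0.05))
```

Computing `r_squared` between the scores and the measured phenotype at each threshold gives the kind of comparison the abstract reports, where liberal thresholds explained more variance than strict ones.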
An Essay by A. Cecile Janssens and Peter Kraft discusses the limitations inherent in research involving collection of self-reported data by self-selected participants, and makes proposals for upfront communication of such limitations to study participants.
The rapid and continuing progress in gene discovery for complex diseases is fuelling interest in the potential application of genetic risk models for clinical and public health practice. The number of studies assessing the predictive ability is steadily increasing, but they vary widely in completeness of reporting and apparent quality. Transparent reporting of the strengths and weaknesses of these studies is important to facilitate the accumulation of evidence on genetic risk prediction. A multidisciplinary workshop sponsored by the Human Genome Epidemiology Network developed a checklist of 25 items recommended for strengthening the reporting of Genetic RIsk Prediction Studies (GRIPS), building on the principles established by prior reporting guidelines. These recommendations aim to enhance the transparency, quality and completeness of study reporting, and thereby to improve the synthesis and application of information from multiple studies that might differ in design, conduct or analysis.
The quality and quantity of food intake affect body weight, but little is known about the genetics of such human dietary intake patterns in relation to the genetics of BMI. We aimed to estimate the heritability of dietary intake patterns and genetic correlation with BMI in participants of the Erasmus Rucphen Family study. The study included 1,690 individuals (42% men; age range, 19–92), of whom 41.4% were overweight and 15.9% were obese. Self-report questionnaires were used to assess the number of days (0–7) on which participants consumed vegetables, fruit, fruit juice, fish, unhealthy snacks, fast food, and soft drinks. Principal component analysis was applied to examine the correlations between the questionnaire items and to generate dietary intake pattern scores. Heritability and the shared genetic and shared non-genetic (environmental) correlations were estimated using the family structure of the cohort. Principal component analysis suggested that the questionnaire items could be grouped in a healthy and an unhealthy dietary intake pattern, explaining 22% and 18% of the phenotypic variance, respectively. The dietary intake patterns had a heritability of 0.32 for the healthy and 0.27 for the unhealthy pattern. Genetic correlations between the dietary intake patterns and BMI were not significant, but we found a significant environmental correlation between the unhealthy dietary intake pattern and BMI. Specific dietary intake patterns are associated with the risk of obesity and are heritable traits. The genetic factors that determine specific dietary intake patterns do not significantly overlap with the genetic factors that determine BMI.
Electronic supplementary material
The online version of this article (doi:10.1007/s00592-012-0387-0) contains supplementary material, which is available to authorized users.
Heritability; BMI; Food intake
A recent collaborative genome-wide association study replicated a large number of susceptibility loci and identified novel loci. This increase in known multiple sclerosis (MS) risk genes raises questions about clinical applicability of genotyping. In an empirical set we assessed the predictive power of typing multiple genes. Next, in a modelling study we explored current and potential predictive performance of genetic MS risk models.
Materials and Methods
Genotype data on 6 MS risk genes in 591 MS patients and 600 controls were used to investigate the predictive value of combining risk alleles. Next, the replicated and novel MS risk loci from the recent and largest international genome-wide association study were used to construct genetic risk models simulating a population of 100,000 individuals. Finally, we assessed the required numbers, frequencies, and ORs of risk SNPs for higher discriminative accuracy in the future.
Individuals with 10 to 12 risk alleles had a significantly increased risk compared to individuals with the average population risk for developing MS (OR 2.76 (95% CI 2.02–3.77)). In the simulation study we showed that the area under the receiver operating characteristic curve (AUC) for a risk score based on the 6 SNPs was 0.64. The AUC increases to 0.66 using the well replicated 24 SNPs and to 0.69 when including all replicated and novel SNPs (n = 53) in the risk model. An additional 20 SNPs with allele frequency 0.30 and ORs 1.1 would be needed to increase the AUC to a slightly higher level of 0.70, and at least 50 novel variants with allele frequency 0.30 and ORs 1.4 would be needed to obtain an AUC of 0.85.
Although new MS risk SNPs emerge rapidly, the discriminatory ability in a clinical setting will be limited.
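To give a feel for why many additional variants are needed to raise the AUC, the sketch below uses a closed-form approximation in place of the 100,000-person simulation described in the abstract. It is a swapped-in technique resting on explicit assumptions: a rare disease, independent SNPs in Hardy-Weinberg equilibrium, a log-additive model, and an approximately normal risk score. Under those assumptions the score variance is V = Σ 2p(1−p)(ln OR)², the case and control score distributions differ in mean by V, and AUC ≈ Φ(√(V/2)). The function name is hypothetical, and the output is not expected to reproduce the values reported in the abstract.

```python
import math


def approx_auc(snps):
    """Approximate AUC of a log-additive genetic risk score.

    snps: list of (risk_allele_frequency, per_allele_odds_ratio) pairs.
    Assumes a rare disease, independent SNPs, Hardy-Weinberg equilibrium
    and a normally distributed score (an approximation, not a simulation).
    """
    variance = sum(2 * p * (1 - p) * math.log(odds_ratio) ** 2 for p, odds_ratio in snps)
    z = math.sqrt(variance / 2)
    # Standard normal cumulative distribution function evaluated at z.
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


# Hypothetical SNP sets using the frequency and odds ratios named in the text.
weak = [(0.30, 1.1)] * 20
strong = [(0.30, 1.4)] * 50
print(f"20 SNPs, OR 1.1: {approx_auc(weak):.2f}")
print(f"50 SNPs, OR 1.4: {approx_auc(strong):.2f}")
```

Because the AUC grows with the square root of the summed variance, SNPs with odds ratios near 1.1 add very little individually, which is consistent with the abstract's conclusion about limited discriminatory ability.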
Fueled by the successes of genome-wide association studies, numerous studies have investigated the predictive ability of genetic risk models in type 2 diabetes. In this paper, we review these studies from a methodological perspective, focusing on the variables included in the risk models as well as the study designs and populations investigated. We argue and show that differences in study design and characteristics of the study population have an impact on the observed predictive ability of risk models. This observation emphasizes that genetic risk prediction studies should be conducted in those populations in which the prediction models will ultimately be applied, if proven useful. Of all genetic risk prediction studies to date, only a few were conducted in populations that might be relevant for targeting preventive interventions.
Genetic predisposition; Risk prediction; Type 2 diabetes; Public health; Risk factors; Prevention
The rapid identification of genetic markers for multifactorial diseases from genome-wide association studies is fuelling interest in investigating the predictive ability and health care utility of genetic risk models. Various measures are available for the assessment of risk prediction models, each addressing a different aspect of performance and utility. We developed PredictABEL, a package in R that covers descriptive tables, measures and figures that are used in the analysis of risk prediction studies such as measures of model fit, predictive ability and clinical utility, and risk distributions, calibration plot and the receiver operating characteristic plot. Tables and figures are saved as separate files in a user-specified format, which include publication-quality EPS and TIFF formats. All figures are available in a ready-made layout, but they can be customized to the preferences of the user. The package has been developed for the analysis of genetic risk prediction studies, but can also be used for studies that only include non-genetic risk factors. PredictABEL is freely available at the websites of GenABEL (http://www.genabel.org) and CRAN (http://cran.r-project.org/).
Risk prediction; Genetic; Assessment; Measures; Software
The rapid and continuing progress in gene discovery for complex diseases is fueling interest in the potential application of genetic risk models for clinical and public health practice. The number of studies assessing the predictive ability is steadily increasing, but the quality and completeness of reporting varies. A multidisciplinary workshop sponsored by the Human Genome Epidemiology Network developed a checklist of 25 items recommended for strengthening the reporting of Genetic RIsk Prediction Studies (GRIPS), building on the principles established by prior reporting guidelines. These recommendations aim to enhance the transparency of study reporting, and thereby to improve the synthesis and application of information from multiple studies that might differ in design, conduct, or analysis. A detailed Explanation and Elaboration document is published.
Genetic; Risk prediction; Methodology; Guidelines; Reporting
The rapid and continuing progress in gene discovery for complex diseases is fuelling interest in the potential application of genetic risk models for clinical and public health practice. The number of studies assessing the predictive ability is steadily increasing, but they vary widely in completeness of reporting and apparent quality. Transparent reporting of the strengths and weaknesses of these studies is important to facilitate the accumulation of evidence on genetic risk prediction. A multidisciplinary workshop sponsored by the Human Genome Epidemiology Network developed a checklist of 25 items recommended for strengthening the reporting of Genetic RIsk Prediction Studies (GRIPS), building on the principles established by prior reporting guidelines. These recommendations aim to enhance the transparency, quality and completeness of study reporting, and thereby to improve the synthesis and application of information from multiple studies that might differ in design, conduct or analysis.
Genetic; Risk prediction; Methodology; Guidelines; Reporting
Cecile Janssens and colleagues present the GRIPS Statement, a checklist to help strengthen the reporting of genetic risk prediction studies.
The increasing availability of personal genomic tests has led to discussions about the validity and utility of such tests and the balance of benefits and harms. A multidisciplinary workshop was convened by the National Institutes of Health and the Centers for Disease Control and Prevention to review the scientific foundation for using personal genomics in risk assessment and disease prevention and to develop recommendations for targeted research. The clinical validity and utility of personal genomics is a moving target with rapidly developing discoveries but little translation research to close the gap between discoveries and health impact. Workshop participants made recommendations in five domains: (1) developing and applying scientific standards for assessing personal genomic tests; (2) developing and applying a multidisciplinary research agenda, including observational studies and clinical trials to fill knowledge gaps in clinical validity and utility; (3) enhancing credible knowledge synthesis and information dissemination to clinicians and consumers; (4) linking scientific findings to evidence-based recommendations for use of personal genomics; and (5) assessing how the concept of personal utility can affect health benefits, costs, and risks by developing appropriate metrics for evaluation. To fulfill the promise of personal genomics, a rigorous multidisciplinary research agenda is needed.
behavioral sciences; epidemiologic methods; evidence-based medicine; genetics; genetic testing; genomics; medicine; public health
The purpose of the present study is to examine HFE gene mutations in relation to newly diagnosed (incident) coronary heart disease (CHD). In a population-based follow-up study of 7,983 individuals aged 55 years and older, we compared the risk of incident CHD between HFE carriers and non-carriers, overall and stratified by sex and smoking status. HFE mutations were significantly associated with an increased risk of incident CHD in women but not in men (hazard ratio [HR] for women = 1.7, 95% confidence interval [CI] 1.2–2.4 versus HR for men = 0.9, 95% CI 0.7–1.2). This increased CHD risk associated with HFE mutations in women was statistically significant in never smokers (HR = 1.8, 95% CI 1.1–2.8) and current smokers (HR = 3.1, 95% CI 1.4–7.1), but not in former smokers (HR = 1.3, 95% CI 0.7–2.4). HFE mutations are associated with increased risk of incident CHD in women.
HFE mutation; Hemochromatosis; Coronary heart disease; Smoking; Gender
Personality traits are summarized by five broad dimensions with pervasive influences on major life outcomes, strong links to psychiatric disorders, and clear heritable components. To identify genetic variants associated with each of the five dimensions of personality we performed a genome wide association (GWA) scan of 3,972 individuals from a genetically isolated population within Sardinia, Italy. Based on analyses of 362,129 single nucleotide polymorphisms (SNPs) we found several strong signals within or near genes previously implicated in psychiatric disorders. They include the association of Neuroticism with SNAP25 (rs362584, P = 5 × 10⁻⁵), Extraversion with BDNF and two cadherin genes (CDH13 and CDH23; Ps < 5 × 10⁻⁵), Openness with CNTNAP2 (rs10251794, P = 3 × 10⁻⁵), Agreeableness with CLOCK (rs6832769, P = 9 × 10⁻⁶), and Conscientiousness with DYRK1A (rs2835731, P = 3 × 10⁻⁵). Effect sizes were small (less than 1% of variance), and most failed to replicate in the follow-up independent samples (N up to 3,903), though the association between Agreeableness and CLOCK was supported in two of three replication samples (overall P = 2 × 10⁻⁵). We infer that a large number of loci may influence personality traits and disorders, requiring larger sample sizes for the GWA approach to identify significant genetic variants.
personality; genome wide association; founder population; psychiatry; five-factor model
To assess the potential effectiveness of communicating familial risk of diabetes on illness perceptions and self-reported behavioral outcomes.
RESEARCH DESIGN AND METHODS
Individuals with a family history of diabetes were randomized to receive risk information based on familial and general risk factors (n = 59) or general risk factors alone (n = 59). Outcomes were assessed using questionnaires at baseline, 1 week, and 3 months.
Compared with individuals receiving general risk information, those receiving familial risk information perceived heredity to be a more important cause of diabetes (P < 0.01) at 1-week follow-up, perceived greater control over preventing diabetes (P < 0.05), and reported having eaten more healthily (P = 0.01) after 3 months. Behavioral intentions did not differ between the groups.
Communicating familial risk increased personal control and, thus, did not result in fatalism. Although the intervention did not influence intentions to change behavior, there was some evidence to suggest it increases healthy behavior.
OBJECTIVE—Prediction of type 2 diabetes based on genetic testing might improve identification of high-risk subjects. Genome-wide association (GWA) studies identified multiple new genetic variants that associate with type 2 diabetes. The predictive value of genetic testing for prediction of type 2 diabetes in the general population is unclear.
RESEARCH DESIGN AND METHODS—We investigated 18 polymorphisms from recent GWA studies on type 2 diabetes in the Rotterdam Study, a prospective, population-based study among homogeneous Caucasian individuals of 55 years and older (genotyped subjects, n = 6,544; prevalent cases, n = 686; incident cases during follow-up, n = 601; mean follow-up 10.6 years). The predictive value of these polymorphisms was examined alone and in addition to clinical characteristics using logistic and Cox regression analyses. The discriminative accuracy of the prediction models was assessed by the area under the receiver operating characteristic curves (AUCs).
RESULTS—Of the 18 polymorphisms, the ADAMTS9, CDKAL1, CDKN2A/B-rs1412829, FTO, IGF2BP2, JAZF1, SLC30A8, TCF7L2, and WFS1 variants were associated with type 2 diabetes risk in our population. The AUC was 0.60 (95% CI 0.57–0.63) for prediction based on the genetic polymorphisms; 0.66 (0.63–0.68) for age, sex, and BMI; and 0.68 (0.66–0.71) for the genetic polymorphisms and clinical characteristics combined.
CONCLUSIONS—We showed that 9 of 18 well-established genetic risk variants were associated with type 2 diabetes in a population-based study. Combining genetic variants has low predictive value for future type 2 diabetes at a population-based level. The genetic polymorphisms only marginally improved the prediction of type 2 diabetes beyond clinical characteristics.
In the Victorian era, Sir Francis Galton showed that ‘when dealing with the transmission of stature from parents to children, the average height of the two parents, … is all we need care to know about them' (1886). One hundred and twenty-two years after Galton's work was published, 54 loci showing strong statistical evidence for association to human height were described, providing us with potential genomic means of human height prediction. In a population-based study of 5748 people, we find that a 54-loci genomic profile explained 4–6% of the sex- and age-adjusted height variance, and had limited ability to discriminate tall/short people, as characterized by the area under the receiver-operating characteristic curve (AUC). In a family-based study of 550 people, with both parents having height measurements, we find that the Galtonian mid-parental prediction method explained 40% of the sex- and age-adjusted height variance, and showed high discriminative accuracy. We have also explored how much variance a genomic profile should explain to reach certain AUC values. For highly heritable traits such as height, we conclude that in applications in which parental phenotypic information is available (eg, medicine), the Victorian Galton's method will long stay unsurpassed, in terms of both discriminative accuracy and costs. For less heritable traits, and in situations in which parental information is not available (eg, forensics), genomic methods may provide an alternative, given that the variants determining an essential proportion of the trait's variation can be identified.
height; heritability; prediction; genomic profiling; discriminative accuracy; area under the receiver-operating characteristic curve (AUC)
Recent genome-wide association (GWA) studies of lipids have been conducted in samples ascertained for other phenotypes, particularly diabetes. Here we report the first GWA analysis of loci affecting total cholesterol (TC), low-density lipoprotein (LDL) cholesterol, high-density lipoprotein (HDL) cholesterol and triglycerides sampled randomly from 16 population-based cohorts and genotyped using mainly the Illumina HumanHap300-Duo platform. Our study included a total of 17,797–22,562 persons, aged 18–104 years and from geographic regions spanning from the Nordic countries to Southern Europe. We established 22 loci associated with serum lipid levels at a genome-wide significance level (P < 5 × 10⁻⁸), including 16 loci that were identified by previous GWA studies. The six newly identified loci in our cohort samples are ABCG5 (TC, P = 1.5 × 10⁻¹¹; LDL, P = 2.6 × 10⁻¹⁰), TMEM57 (TC, P = 5.4 × 10⁻¹⁰), CTCF-PRMT8 region (HDL, P = 8.3 × 10⁻¹⁶), DNAH11 (LDL, P = 6.1 × 10⁻⁹), FADS3-FADS2 (TC, P = 1.5 × 10⁻¹⁰; LDL, P = 4.4 × 10⁻¹³) and MADD-FOLH1 region (HDL, P = 6 × 10⁻¹¹). For three loci, effect sizes differed significantly by sex. Genetic risk scores based on lipid loci explain up to 4.8% of variation in lipids and were also associated with increased intima media thickness (P = 0.001) and coronary heart disease incidence (P = 0.04). The genetic risk score improves the screening of high-risk groups of dyslipidemia over classical risk factors.
Genome-wide association studies (GWAS) have led to a rapid increase in available data on common genetic variants and phenotypes and numerous discoveries of new loci associated with susceptibility to common complex diseases. Integrating the evidence from GWAS and candidate gene studies depends on concerted efforts in data production, online publication, database development, and continuously updated data synthesis. Here the authors summarize current experience and challenges on these fronts, which were discussed at a 2008 multidisciplinary workshop sponsored by the Human Genome Epidemiology Network. Comprehensive field synopses that integrate many reported gene-disease associations have been systematically developed for several fields, including Alzheimer's disease, schizophrenia, bladder cancer, coronary heart disease, preterm birth, and DNA repair genes in various cancers. The authors summarize insights from these field synopses and discuss remaining unresolved issues, especially in the light of evidence from GWAS, for which they summarize empirical P-value and effect-size data on 223 discovered associations for binary outcomes (142 with P < 10⁻⁷). They also present a vision of collaboration that builds reliable cumulative evidence for genetic associations with common complex diseases and a transparent, distributed, authoritative knowledge base on genetic variation and human health. As a next step in the evolution of Human Genome Epidemiology reviews, the authors invite investigators to submit field synopses for possible publication in the American Journal of Epidemiology.
association; database; encyclopedias; epidemiologic methods; genome, human; genome-wide association study; genomics; meta-analysis
To investigate the extent to which shared genetic factors can explain the clustering of depression among individuals with lower socioeconomic status, and to examine if neuroticism or intelligence are involved in these pathways.
In total 2,383 participants (1,028 men and 1,355 women) of the Erasmus Rucphen Family Study were assessed with the Center for Epidemiologic Studies Depression Scale (CES-D) and the Hospital Anxiety and Depression Scale (HADS-D). Socioeconomic status was assessed as the highest level of education obtained. The role of shared genetic factors was quantified by estimating genetic correlations (ρG) between symptoms of depression and education level, with and without adjustment for premorbid intelligence and neuroticism scores.
Higher level of education was associated with lower depression scores (partial correlation coefficient −0.09 for CES-D and −0.17 for HADS-D). Significant genetic correlations were found between education and both CES-D (ρG = −0.65) and HADS-D (ρG = −0.50). The genetic correlations remained statistically significant after adjusting for premorbid intelligence and neuroticism scores.
Our study suggests that shared genetic factors play a role in the co-occurrence of lower socioeconomic status and symptoms of depression, which in turn suggests that genetic factors play a role in health inequalities. Further research is needed to investigate the validity, causality and generalizability of our results.