Background: There is increasing interest in investigating genetic risk models in empirical studies, but such studies are premature when the expected predictive ability of the risk model is low. We assessed how accurately the predictive ability of genetic risk models can be estimated in simulated data that are created based on the odds ratios (ORs) and frequencies of single-nucleotide polymorphisms (SNPs) obtained from genome-wide association studies (GWASs).
Methods: We aimed to replicate published prediction studies that reported the area under the receiver operating characteristic curve (AUC) as a measure of predictive ability. We searched GWAS articles for all SNPs included in these models and extracted ORs and risk allele frequencies to construct genotypes and disease status for a hypothetical population. Using these hypothetical data, we reconstructed the published genetic risk models and compared their AUC values to those reported in the original articles.
Results: The accuracy of the AUC values varied with the method used for the construction of the risk models. When logistic regression analysis was used to construct the genetic risk model, AUC values estimated by the simulation method were similar to the published values, with a median absolute difference of 0.02 [range: 0.00, 0.04]. This difference was 0.03 [range: 0.01, 0.06] and 0.05 [range: 0.01, 0.08] for unweighted and weighted risk scores, respectively.
Conclusions: The predictive ability of genetic risk models can be estimated using simulated data based on results from GWASs. Simulation methods can be useful to estimate the predictive ability in the absence of empirical data and to decide whether empirical investigation of genetic risk models is warranted.
predictive ability; risk prediction; modeling; genetic; AUC; GWAS
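The simulation idea described above can be sketched in a few lines. This is a simplified illustration, not the authors' procedure: it assumes independent SNPs in Hardy-Weinberg equilibrium, multiplicative per-allele ORs on the odds scale, and a hypothetical `baseline_log_odds` parameter that sets the overall disease risk. The function names and the example SNP values are made up for illustration.

```python
import math
import random


def simulate_population(snps, n, baseline_log_odds, rng):
    """Simulate (risk score, disease status) pairs for n individuals.

    snps: list of (risk_allele_frequency, per_allele_odds_ratio) tuples.
    Assumes Hardy-Weinberg equilibrium and independent SNPs.
    """
    people = []
    for _ in range(n):
        # Two independent allele draws per SNP give 0, 1 or 2 risk alleles.
        genotype = [(rng.random() < p) + (rng.random() < p) for p, _ in snps]
        # Weighted risk score: risk allele count times ln(OR), summed over SNPs.
        score = sum(g * math.log(odds_ratio)
                    for g, (_, odds_ratio) in zip(genotype, snps))
        risk = 1.0 / (1.0 + math.exp(-(baseline_log_odds + score)))
        people.append((score, rng.random() < risk))
    return people


def auc(scored):
    """Rank-based AUC (Mann-Whitney statistic); ties receive average ranks."""
    ordered = sorted(scored, key=lambda item: item[0])
    n = len(ordered)
    rank_sum_cases = 0.0
    n_cases = 0
    i = 0
    while i < n:
        j = i
        while j < n and ordered[j][0] == ordered[i][0]:
            j += 1
        # Items i..j-1 are tied and occupy ranks i+1..j.
        average_rank = (i + 1 + j) / 2.0
        cases_here = sum(1 for k in range(i, j) if ordered[k][1])
        rank_sum_cases += average_rank * cases_here
        n_cases += cases_here
        i = j
    n_controls = n - n_cases
    if n_cases == 0 or n_controls == 0:
        raise ValueError("AUC needs at least one case and one control")
    u_statistic = rank_sum_cases - n_cases * (n_cases + 1) / 2.0
    return u_statistic / (n_cases * n_controls)


if __name__ == "__main__":
    # Hypothetical SNPs: (risk allele frequency, OR); not taken from any GWAS.
    example_snps = [(0.30, 1.5), (0.20, 1.3), (0.40, 1.2)]
    population = simulate_population(example_snps, 20000, -2.0, random.Random(1))
    print("simulated AUC:", round(auc(population), 3))
```

Repeating the simulation with different seeds gives a rough sense of the sampling variability in the estimated AUC.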
The promise of personalized genomics for common complex diseases depends, in part, on the ability to predict genetic risks on the basis of single nucleotide polymorphisms. We examined and compared the methods of three companies (23andMe, deCODEme, and Navigenics) that have offered direct-to-consumer personal genome testing.
We simulated genotype data for 100,000 individuals on the basis of published genotype frequencies and predicted disease risks using the methods of the companies. Predictive ability for six diseases was assessed by the AUC.
AUC values differed among the diseases and among the companies. The highest values of the AUC were observed for age related macular degeneration, celiac disease, and Crohn disease. The largest difference among the companies was found for celiac disease: the AUC was 0.73 for 23andMe and 0.82 for deCODEme. Predicted risks differed substantially among the companies as a result of differences in the sets of single nucleotide polymorphisms selected and the average population risks selected by the companies, and in the formulas used for the calculation of risks.
Future efforts to design predictive models for the genomics of common complex diseases may benefit from understanding the strengths and limitations of the predictive algorithms designed by these early companies.
The rapid and continuing progress in gene discovery for complex diseases is fuelling interest in the potential application of genetic risk models for clinical and public health practice. The number of studies assessing the predictive ability is steadily increasing, but they vary widely in completeness of reporting and apparent quality. Transparent reporting of the strengths and weaknesses of these studies is important to facilitate the accumulation of evidence on genetic risk prediction. A multidisciplinary workshop sponsored by the Human Genome Epidemiology Network developed a checklist of 25 items recommended for strengthening the reporting of Genetic RIsk Prediction Studies (GRIPS), building on the principles established by prior reporting guidelines. These recommendations aim to enhance the transparency, quality and completeness of study reporting, and thereby to improve the synthesis and application of information from multiple studies that might differ in design, conduct or analysis.
Advances in genomics have near-term impact on diagnosis and management of
monogenic disorders. For common complex diseases, the use of genomic information
from multiple loci (polygenic model) is generally not useful for diagnosis and
individual prediction. In principle, the polygenic model could be used along
with other risk factors in stratified population screening to target
interventions. For example, compared with age-based criteria for breast,
colorectal, and prostate cancer screening, adding polygenic risk and family
history holds promise for more efficient screening with earlier start and/or
increased frequency of screening for segments of the population at higher
absolute risk than an established screening threshold; and later start and/or
decreased frequency of screening for segments of the population at lower risks.
This approach, while promising, faces formidable challenges for building its
evidence base and for its implementation in practice. Currently, it is unclear
whether or not polygenic risk can contribute enough discrimination to make
stratified screening worthwhile. Empirical data are lacking on population-based
age-specific absolute risks combining genetic and non-genetic factors, on impact
of polygenic risk genes on disease natural history, as well as information on
comparative balance of benefits and harms of stratified interventions.
Implementation challenges include difficulties in integration of this
information in the current health-care system in the United States, the setting
of appropriate risk thresholds, and ethical, legal, and social issues. In an era
of direct-to-consumer availability of personal genomic information, the public
health and health-care systems need to prepare for an evidence-based integration
of this information into population screening.
evidence-based medicine; genetics; genomics; polygenic model; public health; risk assessment; screening
Finding eligible studies for meta-analysis and systematic reviews relies on keyword-based searching as the gold standard, despite its inefficiency. Searching based on direct citations is not sufficiently comprehensive. We propose a novel strategy that ranks articles on their degree of co-citation with one or more “known” articles before reviewing their eligibility.
In two independent studies, we aimed to reproduce the results of literature searches for sets of published meta-analyses (n = 10 and n = 42). For each meta-analysis, we extracted co-citations for the randomly selected ‘known’ articles from the Web of Science database, counted their frequencies and screened all articles with a score above a selection threshold. In the second study, we extended the method by retrieving direct citations for all selected articles.
In the first study, we retrieved 82 % of the studies included in the meta-analyses while screening only 11 % as many articles as were screened for the original publications. Articles that we missed were published in non-English languages, published before 1975, published very recently, or available only as conference abstracts. In the second study, we retrieved 79 % of included studies while screening half the original number of articles.
Citation searching appears to be an efficient and reasonably accurate method for finding articles similar to one or more articles of interest for meta-analysis and reviews.
Electronic supplementary material
The online version of this article (doi:10.1186/s12874-015-0077-z) contains supplementary material, which is available to authorized users.
Citation; Co-citation; Literature search; Meta-analysis; Systematic review; Keywords
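The co-citation ranking described above can be illustrated with a minimal sketch. It assumes the reference lists of citing papers are already available as a dictionary (the abstract obtains them from Web of Science); the scoring rule used here, one point for every citing paper that cites a candidate together with at least one known article, is an assumption and may differ from the authors' exact scoring. All identifiers are hypothetical.

```python
from collections import Counter


def cocitation_scores(reference_lists, known):
    """Count how often each article is co-cited with any known article.

    reference_lists: dict mapping a citing paper to the articles it cites.
    known: iterable of articles already known to be eligible.
    """
    known = set(known)
    scores = Counter()
    for cited in reference_lists.values():
        cited = set(cited)
        if cited & known:
            # Every other reference in this paper is co-cited with a known one.
            for article in cited - known:
                scores[article] += 1
    return scores


def articles_to_screen(scores, threshold):
    """Return articles scoring above the threshold, highest score first."""
    ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return [article for article, score in ranked if score > threshold]


if __name__ == "__main__":
    # Toy data with made-up identifiers.
    toy_reference_lists = {
        "P1": ["K1", "A", "B"],
        "P2": ["K1", "A"],
        "P3": ["K2", "A", "C"],
        "P4": ["B", "C"],
    }
    toy_scores = cocitation_scores(toy_reference_lists, ["K1", "K2"])
    print(articles_to_screen(toy_scores, threshold=1))
```

Raising the threshold reduces the number of articles to screen, at the cost of missing rarely co-cited studies.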
B-type natriuretic peptide (BNP) and C-reactive protein (CRP) predict atrial fibrillation (AF) risk. However, their risk stratification abilities in the broad community remain uncertain. We sought to improve risk stratification for AF using biomarker information.
Methods and results
We ascertained AF incidence in 18 556 Whites and African Americans from the Atherosclerosis Risk in Communities Study (ARIC, n = 10 675), Cardiovascular Health Study (CHS, n = 5043), and Framingham Heart Study (FHS, n = 2838), followed for 5 years (prediction horizon). We added BNP (ARIC/CHS: N-terminal pro-B-type natriuretic peptide; FHS: BNP), CRP, or both to a previously reported AF risk score, and assessed model calibration and predictive ability [C-statistic, integrated discrimination improvement (IDI), and net reclassification improvement (NRI)]. We replicated models in two independent European cohorts: Age, Gene/Environment Susceptibility Reykjavik Study (AGES), n = 4467; Rotterdam Study (RS), n = 3203. B-type natriuretic peptide and CRP were significantly associated with AF incidence (n = 1186): hazard ratio per 1-SD ln-transformed biomarker 1.66 [95% confidence interval (CI), 1.56–1.76], P < 0.0001 and 1.18 (95% CI, 1.11–1.25), P < 0.0001, respectively. Model calibration was sufficient (BNP, χ² = 17.0; CRP, χ² = 10.5; BNP and CRP, χ² = 13.1). B-type natriuretic peptide improved the C-statistic from 0.765 to 0.790, yielded an IDI of 0.027 (95% CI, 0.022–0.032), a relative IDI of 41.5%, and a continuous NRI of 0.389 (95% CI, 0.322–0.455). The predictive ability of CRP was limited (C-statistic increment 0.003). B-type natriuretic peptide consistently improved prediction in AGES and RS.
B-type natriuretic peptide, not CRP, substantially improved AF risk prediction beyond clinical factors in an independently replicated, heterogeneous population. B-type natriuretic peptide may serve as a benchmark to evaluate novel putative AF risk biomarkers.
Atrial fibrillation; Risk prediction; Epidemiology; Biomarker; B-type natriuretic peptide; C-reactive protein
Prediction models for age-related macular degeneration (AMD) based on case-control studies have a tendency to overestimate risks. The aim of this study is to develop a prediction model for late AMD based on data from population-based studies.
Three population-based studies: the Rotterdam Study (RS), the Beaver Dam Eye Study (BDES), and the Blue Mountains Eye Study (BMES) from the Three Continent AMD Consortium (3CC).
People (n = 10106) with gradable fundus photographs, genotype data, and follow-up data without late AMD at baseline.
Features of AMD were graded on fundus photographs using the 3CC AMD severity scale. Associations with known genetic and environmental AMD risk factors were tested using Cox proportional hazard analysis. In the RS, the prediction of AMD was estimated for multivariate models by area under receiver operating characteristic curves (AUCs). The best model was validated in the BDES and BMES, and associations of variables were re-estimated in the pooled data set. Beta coefficients were used to construct a risk score, and risk of incident late AMD was calculated using Cox proportional hazard analysis. Cumulative incident risks were estimated using Kaplan–Meier product-limit analysis.
Main Outcome Measures
Incident late AMD determined per visit during a median follow-up period of 11.1 years with a total of 4 to 5 visits.
Overall, 363 participants developed incident late AMD, 3378 participants developed early AMD, and 6365 participants remained free of any AMD. The highest AUC was achieved with a model including age, sex, 26 single nucleotide polymorphisms in AMD risk genes, smoking, body mass index, and baseline AMD phenotype. The AUC of this model was 0.88 in the RS, 0.85 in the BDES and BMES at validation, and 0.87 in the pooled analysis. Individuals with low-risk scores had a hazard ratio (HR) of 0.02 (95% confidence interval [CI], 0.01–0.04) to develop late AMD, and individuals with high-risk scores had an HR of 22.0 (95% CI, 15.2–31.8). Cumulative risk of incident late AMD ranged from virtually 0 to more than 65% for those with the highest risk scores.
Our prediction model is robust and distinguishes well between those who will develop late AMD and those who will not. Estimated risks were lower in these population-based studies than in previous case-control studies.
Personality can be thought of as a set of characteristics that influence people’s thoughts, feelings, and behaviour across a variety of settings. Variation in personality is predictive of many outcomes in life, including mental health. Here we report on a meta-analysis of genome-wide association (GWA) data for personality in ten discovery samples (17 375 adults) and five in-silico replication samples (3 294 adults). All participants were of European ancestry. Personality scores for Neuroticism, Extraversion, Openness to Experience, Agreeableness, and Conscientiousness were based on the NEO Five-Factor Inventory. Genotype data were available for ~2.4M Single Nucleotide Polymorphisms (SNPs; directly typed and imputed using HAPMAP data). In the discovery samples, classical association analyses were performed under an additive model followed by meta-analysis using the weighted inverse variance method. Results showed genome-wide significance for Openness to Experience near the RASA1 gene on 5q14.3 (rs1477268 and rs2032794, P = 2.8 × 10⁻⁸ and 3.1 × 10⁻⁸) and for Conscientiousness in the brain-expressed KATNAL2 gene on 18q21.1 (rs2576037, P = 4.9 × 10⁻⁸). We further conducted a gene-based test that confirmed the association of KATNAL2 with Conscientiousness. In-silico replication did not, however, show significant associations of the top SNPs with Openness and Conscientiousness, although the direction of effect of the KATNAL2 SNP on Conscientiousness was consistent in all replication samples. Larger scale GWA studies and alternative approaches are required for confirmation of KATNAL2 as a novel gene affecting Conscientiousness.
Personality; Five-Factor Model; Genome-wide association; Meta-analysis; Genetic variants
The discrimination of a risk prediction model measures that model's ability to distinguish between subjects with and without events. The area under the receiver operating characteristic curve (AUC) is a popular measure of discrimination. However, the AUC has recently been criticized for its insensitivity in model comparisons in which the baseline model has performed well. Thus, 2 other measures have been proposed to capture improvement in discrimination for nested models: the integrated discrimination improvement and the continuous net reclassification improvement. In the present study, the authors use mathematical relations and numerical simulations to quantify the improvement in discrimination offered by candidate markers of different strengths as measured by their effect sizes. They demonstrate that the increase in the AUC depends on the strength of the baseline model, which is true to a lesser degree for the integrated discrimination improvement. On the other hand, the continuous net reclassification improvement depends only on the effect size of the candidate variable and its correlation with other predictors. These measures are illustrated using the Framingham model for incident atrial fibrillation. The authors conclude that the increase in the AUC, integrated discrimination improvement, and net reclassification improvement offer complementary information and thus recommend reporting all 3 alongside measures characterizing the performance of the final model.
area under curve; biomarkers; discrimination; risk assessment; risk factors
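For readers unfamiliar with the two reclassification measures named above, the following sketch computes them from predicted risks under a baseline and an extended model. It uses the standard definitions: the IDI is the change in mean predicted risk among events minus the change among non-events, and the continuous NRI sums the net proportion of events moved upward and the net proportion of non-events moved downward. The function names and the toy numbers are illustrative assumptions, not values from the study.

```python
def integrated_discrimination_improvement(old_risk, new_risk, events):
    """IDI: change in mean risk among events minus change among non-events."""
    event_diff = [n - o for o, n, e in zip(old_risk, new_risk, events) if e]
    nonevent_diff = [n - o for o, n, e in zip(old_risk, new_risk, events) if not e]
    if not event_diff or not nonevent_diff:
        raise ValueError("need at least one event and one non-event")
    mean_event = sum(event_diff) / len(event_diff)
    mean_nonevent = sum(nonevent_diff) / len(nonevent_diff)
    return mean_event - mean_nonevent


def continuous_nri(old_risk, new_risk, events):
    """Continuous (category-free) NRI; any change in predicted risk counts."""
    up_event = down_event = up_nonevent = down_nonevent = 0
    n_event = n_nonevent = 0
    for old, new, event in zip(old_risk, new_risk, events):
        if event:
            n_event += 1
            up_event += new > old
            down_event += new < old
        else:
            n_nonevent += 1
            up_nonevent += new > old
            down_nonevent += new < old
    if n_event == 0 or n_nonevent == 0:
        raise ValueError("need at least one event and one non-event")
    event_part = (up_event - down_event) / n_event
    nonevent_part = (down_nonevent - up_nonevent) / n_nonevent
    return event_part + nonevent_part


if __name__ == "__main__":
    # Toy predicted risks for four subjects (made-up numbers).
    old = [0.2, 0.3, 0.2, 0.3]
    new = [0.4, 0.5, 0.1, 0.3]
    had_event = [1, 1, 0, 0]
    print(integrated_discrimination_improvement(old, new, had_event))
    print(continuous_nri(old, new, had_event))
```

The continuous NRI ranges from -2 to 2, whereas the IDI is on the scale of predicted risk differences, which is one reason the two are reported side by side rather than interchangeably.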
In recent years, developments in genomics technologies have led to the rise of commercial personal genome testing (PGT): broad genome-wide testing for multiple diseases simultaneously. While some commercial providers require physicians to order a personal genome test, others can be accessed directly. All providers advertise directly to consumers and offer genetic risk information about dozens of diseases in one single purchase. The quantity and the complexity of risk information pose challenges to adequate pre-test and post-test information provision and informed consent. There are currently no guidelines for what should constitute informed consent in PGT or how adequate informed consent can be achieved. In this paper, we propose a tiered-layered-staged model for informed consent. First, the proposed model is tiered as it offers choices between categories of diseases that are associated with distinct ethical, personal or societal issues. Second, the model distinguishes layers of information with a first layer offering minimal, indispensable information that is material to all consumers, and additional layers offering more detailed information made available upon request. Finally, the model stages informed consent as a process by feeding information to consumers in each subsequent stage of the process of undergoing a test, and by accommodating renewed consent for test result updates, resulting from the ongoing development of the science underlying PGT. A tiered-layered-staged model for informed consent with a focus on the consumer perspective can help overcome the ethical problems of information provision and informed consent in direct-to-consumer PGT.
personal genome testing; informed consent; ethical issues; complex diseases
The objective of this paper is to assess parental beliefs and intentions about genetic testing for their children in a multi-ethnic population with the aim of acquiring information to guide interventions for obesity prevention and management. A cross-sectional survey was conducted in parents of native Dutch children and children from a large minority population (Turks) selected from Youth Health Care registries. The age range of the children was 5–11 years. Parents with lower levels of education and parents of non-native children were more convinced that overweight has a genetic cause and their intentions to test the genetic predisposition of their child to overweight were firmer. A firmer intention to test the child was associated with the parents’ perceptions of their child’s susceptibility to being overweight, a positive attitude towards genetic testing, and anticipated regret at not having the child tested while at risk for overweight. Interaction effects were found in ethnic and socio-economic groups. Ethnicity and educational level play a role in parental beliefs about child overweight and genetic testing. Education programmes about obesity risk, genetic testing and the importance of behaviour change should be tailored to the cultural and behavioural factors relevant to ethnic and socio-economic target groups.
Genetics; Attitude; Health promotion; Obesity; Child
Human longevity and personality traits are both heritable and are consistently linked at the phenotypic level. We test the hypothesis that candidate genes influencing longevity in lower organisms are associated with variance in the five major dimensions of human personality (measured by the NEO-FFI and IPIP inventories) plus related mood states of anxiety and depression. Seventy single nucleotide polymorphisms (SNPs) in six brain expressed, longevity candidate genes (AFG3L2, FRAP1, MAT1A, MAT2A, SYNJ1 and SYNJ2) were typed in over one thousand 70-year old participants from the Lothian Birth Cohort of 1936 (LBC1936). No SNPs were associated with the personality and psychological distress traits at a Bonferroni corrected level of significance (p < 0.0002), but there was an over-representation of nominally significant (p < 0.05) SNPs in the synaptojanin-2 (SYNJ2) gene associated with agreeableness and symptoms of depression. Eight SNPs which showed nominally significant association across personality measurement instruments were tested in an extremely large replication sample of 17 106 participants. SNP rs350292, in SYNJ2, was significant: the minor allele was associated with an average decrease in NEO agreeableness scale scores of 0.25 points, and 0.67 points in the restricted analysis of elderly cohorts (most aged > 60 years). Because we selected a specific set of longevity genes based on functional genomics findings, further research on other longevity gene candidates is warranted to discover whether they are relevant candidates for personality and psychological distress traits.
NEO personality; IPIP personality; anxiety; depressive symptoms; ageing; genetics
Hypertension is an important determinant of cardiovascular morbidity and mortality and has a substantial heritability, which is likely of polygenic origin. The aim of this study was to assess to what extent multiple common genetic variants contribute to blood pressure regulation in both adults and children, and to assess overlap in variants between different age groups, using genome wide profiling. SNP sets were defined based on a meta-analysis of genome-wide association studies on systolic (SBP) and diastolic blood pressure (DBP) performed by the Cohort for Heart and Aging Research in Genome Epidemiology (CHARGE, n=29,136), using different P-value thresholds for selecting single nucleotide polymorphisms (SNPs). Subsequently, genetic risk scores for SBP and DBP were calculated in an independent adult population (n=2,072) and a child population (n=1,034). The explained variance of the genetic risk scores was evaluated using linear regression models, including sex, age and body mass index. Genetic risk scores that also included many non-genome-wide significant SNPs explained more of the variance than scores based only on very significant SNPs in adults and children. Genetic risk scores significantly explained up to 1.2% (P=9.6×10⁻⁸) of the variance in adult SBP and 0.8% (P=0.004) in children. For DBP, the variance explained was similar in adults and children (1.7% [P=8.9×10⁻¹⁰] and 1.4% [P=3.3×10⁻⁵], respectively). These findings suggest the presence of many genetic loci with small effects on blood pressure regulation both in adults and children, indicating also a (partly) common polygenic regulation of blood pressure throughout different periods of life.
genome-wide association; genome-wide profiling; genetic risk scores; blood pressure; hypertension
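The thresholding step in genome-wide profiling can be sketched as follows. This is a simplified illustration: the score is the sum of effect size times risk allele count over SNPs whose discovery P-value falls below a chosen threshold, and explained variance is approximated here by the squared Pearson correlation between score and phenotype. The study itself used linear regression with sex, age and body mass index as covariates, which this sketch omits. All SNP names, effect sizes and helper names are hypothetical.

```python
def genetic_risk_score(allele_counts, discovery_results, p_threshold):
    """Weighted score over SNPs with discovery P-value below the threshold.

    allele_counts: dict mapping SNP name to 0, 1 or 2 effect alleles.
    discovery_results: dict mapping SNP name to (beta, p_value).
    """
    return sum(
        beta * allele_counts.get(snp, 0)
        for snp, (beta, p_value) in discovery_results.items()
        if p_value < p_threshold
    )


def r_squared(x, y):
    """Squared Pearson correlation, the R² of a one-predictor linear model."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must both vary")
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return (sxy * sxy) / (sxx * syy)


if __name__ == "__main__":
    # Made-up discovery results: SNP name -> (beta in mmHg per allele, P-value).
    discovery = {
        "snp_a": (0.5, 1e-9),
        "snp_b": (0.2, 0.01),
        "snp_c": (-0.1, 0.4),
    }
    person = {"snp_a": 2, "snp_b": 1, "snp_c": 1}
    for threshold in (5e-8, 0.05, 1.0):
        print(threshold, genetic_risk_score(person, discovery, threshold))
```

Computing the score at several thresholds and comparing the resulting R² values mirrors the comparison between scores built from only very significant SNPs and scores that include weaker signals.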
An Essay by A. Cecile Janssens and Peter Kraft discusses the limitations inherent in research involving collection of self-reported data by self-selected participants, and makes proposals for upfront communication of such limitations to study participants.
The quality and quantity of food intake affect body weight, but little is known about the genetics of such human dietary intake patterns in relation to the genetics of BMI. We aimed to estimate the heritability of dietary intake patterns and genetic correlation with BMI in participants of the Erasmus Rucphen Family study. The study included 1,690 individuals (42 % men; age range, 19–92), of whom 41.4 % were overweight and 15.9 % were obese. Self-report questionnaires were used to assess the number of days (0–7) on which participants consumed vegetables, fruit, fruit juice, fish, unhealthy snacks, fastfood, and soft drinks. Principal component analysis was applied to examine the correlations between the questionnaire items and to generate dietary intake pattern scores. Heritability and the shared genetic and shared non-genetic (environmental) correlations were estimated using the family structure of the cohort. Principal component analysis suggested that the questionnaire items could be grouped in a healthy and unhealthy dietary intake pattern, explaining 22 and 18 % of the phenotypic variance, respectively. The dietary intake patterns had a heritability of 0.32 for the healthy and 0.27 for the unhealthy pattern. Genetic correlations between the dietary intake patterns and BMI were not significant, but we found a significant environmental correlation between the unhealthy dietary intake pattern and BMI. Specific dietary intake patterns are associated with the risk of obesity and are heritable traits. The genetic factors that determine specific dietary intake patterns do not significantly overlap with the genetic factors that determine BMI.
Electronic supplementary material
The online version of this article (doi:10.1007/s00592-012-0387-0) contains supplementary material, which is available to authorized users.
Heritability; BMI; Food intake
Phospho- and sphingolipids are crucial cellular and intracellular compounds. These lipids are required for active transport, a number of enzymatic processes, membrane formation, and cell signalling. Disruption of their metabolism leads to several diseases, with diverse neurological, psychiatric, and metabolic consequences. A large number of phospholipid and sphingolipid species can be detected and measured in human plasma. We conducted a meta-analysis of five European family-based genome-wide association studies (N = 4034) on plasma levels of 24 sphingomyelins (SPM), 9 ceramides (CER), 57 phosphatidylcholines (PC), 20 lysophosphatidylcholines (LPC), 27 phosphatidylethanolamines (PE), and 16 PE-based plasmalogens (PLPE), as well as their proportions in each major class. This effort yielded 25 genome-wide significant loci for phospholipids (smallest P-value = 9.88×10⁻²⁰⁴) and 10 loci for sphingolipids (smallest P-value = 3.10×10⁻⁵⁷). After a correction for multiple comparisons (P-value<2.2×10⁻⁹), we observed four novel loci significantly associated with phospholipids (PAQR9, AGPAT1, PKD2L1, PDXDC1) and two with sphingolipids (PLD2 and APOE) explaining up to 3.1% of the variance. Further analysis of the top findings with respect to within class molar proportions uncovered three additional loci for phospholipids (PNLIPRP2, PCDH20, and ABDH3) suggesting their involvement in either fatty acid elongation/saturation processes or fatty acid specific turnover mechanisms. Among those, 14 loci (KCNH7, AGPAT1, PNLIPRP2, SYT9, FADS1-2-3, DLG2, APOA1, ELOVL2, CDK17, LIPC, PDXDC1, PLD2, LASS4, and APOE) mapped into the glycerophospholipid and 12 loci (ILKAP, ITGA9, AGPAT1, FADS1-2-3, APOA1, PCDH20, LIPC, PDXDC1, SGPP1, APOE, LASS4, and PLD2) to the sphingolipid pathways. In large meta-analyses, associations between FADS1-2-3 and carotid intima media thickness, AGPAT1 and type 2 diabetes, and APOA1 and coronary artery disease were observed.
In conclusion, our study identified nine novel phospho- and sphingolipid loci, substantially increasing our knowledge of the genetic basis for these traits.
Phospho- and sphingolipids are integral to membrane formation and are involved in crucial cellular functions such as signalling, membrane fluidity, membrane protein trafficking, neurotransmission, and receptor trafficking. In addition to severe monogenic diseases resulting from defective phospho- and sphingolipid function and metabolism, the evidence suggests that variations in these lipid levels at the population level are involved in the determination of cardiovascular and neurologic traits and subsequent disease. We took advantage of modern laboratory methods, including microarray-based genotyping and electrospray ionization tandem mass spectrometry, to hunt for genetic variation influencing the levels of more than 350 phospho- and sphingolipid phenotypes. We identified nine novel loci, in addition to confirming a number of previously described loci. Several other genetic regions provided substantial evidence of their involvement in these traits. All of these loci are strong candidates for further research in the field of lipid biology and are likely to yield considerable insights into the complex metabolic pathways underlying circulating phospho- and sphingolipid levels. Understanding these mechanisms might help to illuminate factors leading to the development of common cardiovascular and neurological diseases and might provide molecular targets for the development of new therapies.
A recent collaborative genome-wide association study replicated a large number of susceptibility loci and identified novel loci. This increase in known multiple sclerosis (MS) risk genes raises questions about clinical applicability of genotyping. In an empirical set we assessed the predictive power of typing multiple genes. Next, in a modelling study we explored current and potential predictive performance of genetic MS risk models.
Materials and Methods
Genotype data on 6 MS risk genes in 591 MS patients and 600 controls were used to investigate the predictive value of combining risk alleles. Next, the replicated and novel MS risk loci from the recent and largest international genome-wide association study were used to construct genetic risk models simulating a population of 100,000 individuals. Finally, we assessed the required numbers, frequencies, and ORs of risk SNPs for higher discriminative accuracy in the future.
Individuals with 10 to 12 risk alleles had a significantly increased risk compared to individuals with the average population risk for developing MS (OR 2.76 (95% CI 2.02–3.77)). In the simulation study we showed that the area under the receiver operating characteristic curve (AUC) for a risk score based on the 6 SNPs was 0.64. The AUC increases to 0.66 using the well replicated 24 SNPs and to 0.69 when including all replicated and novel SNPs (n = 53) in the risk model. An additional 20 SNPs with allele frequency 0.30 and ORs 1.1 would be needed to increase the AUC to a slightly higher level of 0.70, and at least 50 novel variants with allele frequency 0.30 and ORs 1.4 would be needed to obtain an AUC of 0.85.
Although new MS risk SNPs emerge rapidly, the discriminatory ability in a clinical setting will be limited.
Fueled by the successes of genome-wide association studies, numerous studies have investigated the predictive ability of genetic risk models in type 2 diabetes. In this paper, we review these studies from a methodological perspective, focusing on the variables included in the risk models as well as the study designs and populations investigated. We argue and show that differences in study design and characteristics of the study population have an impact on the observed predictive ability of risk models. This observation emphasizes that genetic risk prediction studies should be conducted in those populations in which the prediction models will ultimately be applied, if proven useful. Of all genetic risk prediction studies to date, only a few were conducted in populations that might be relevant for targeting preventive interventions.
Genetic predisposition; Risk prediction; Type 2 diabetes; Public health; Risk factors; Prevention
Genome wide association studies (GWAS) have recently identified CLU, PICALM and CR1 as novel genes for late-onset Alzheimer’s disease (AD).
In a three-stage analysis of new and previously published GWAS on over 35000 persons (8371 AD cases), we sought to identify and strengthen additional loci associated with AD and confirm these in an independent sample. We also examined the contribution of recently identified genes to AD risk prediction.
Design, Setting, and Participants
We identified strong genetic associations (p<10⁻³) in a Stage 1 sample of 3006 AD cases and 14642 controls by combining new data from the population-based Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) consortium (1367 AD cases (973 incident)) with previously reported results from the Translational Genomics Research Institute (TGEN) and Mayo AD GWAS. We identified 2708 single nucleotide polymorphisms (SNPs) with p-values<10⁻³, and in Stage 2 pooled results for these SNPs with the European AD Initiative (2032 cases, 5328 controls) to identify ten loci with p-values<10⁻⁵. In Stage 3, we combined data for these ten loci with data from the Genetic and Environmental Risk in AD consortium (3333 cases, 6995 controls) to identify four SNPs with a p-value<1.7×10⁻⁸. These four SNPs were replicated in an independent Spanish sample (1140 AD cases and 1209 controls).
Main outcome measure
We showed genome-wide significance for two new loci: rs744373 near BIN1 (OR:1.13; 95%CI:1.06–1.21 per copy of the minor allele; p=1.6×10⁻¹¹) and rs597668 near EXOC3L2/BLOC1S3/MARK4 (OR:1.18; 95%CI:1.07–1.29; p=6.5×10⁻⁹). Associations of CLU, PICALM, BIN1 and EXOC3L2 with AD were confirmed in the Spanish sample (p<0.05). However, CLU and PICALM did not improve incident AD prediction beyond age, sex, and APOE (improvement in area under receiver-operating-characteristic curve <0.003).
Two novel genetic loci for AD are reported that for the first time reach genome-wide statistical significance; these findings were replicated in an independent population. Two recently reported associations were also confirmed, but these loci did not improve AD risk prediction, although they implicate biological pathways that may be useful targets for potential interventions.
genome-wide association study; genetic epidemiology; genetics; dementia; Alzheimer’s disease; cohort study; meta-analysis; risk
The rapid identification of genetic markers for multifactorial diseases from genome-wide association studies is fuelling interest in investigating the predictive ability and health care utility of genetic risk models. Various measures are available for the assessment of risk prediction models, each addressing a different aspect of performance and utility. We developed PredictABEL, a package in R that covers the descriptive tables, measures and figures used in the analysis of risk prediction studies, such as measures of model fit, predictive ability and clinical utility, as well as risk distributions, the calibration plot and the receiver operating characteristic plot. Tables and figures are saved as separate files in a user-specified format, with options that include publication-quality EPS and TIFF. All figures are available in a ready-made layout, but they can be customized to the preferences of the user. The package has been developed for the analysis of genetic risk prediction studies, but can also be used for studies that only include non-genetic risk factors. PredictABEL is freely available at the websites of GenABEL (http://www.genabel.org) and CRAN (http://cran.r-project.org/).
Risk prediction; Genetic; Assessment; Measures; Software
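As a rough illustration of one measure of predictive ability summarized by the receiver operating characteristic plot, the area under the curve (AUC) can be computed as the proportion of case-control pairs in which the case has the higher predicted risk, with ties counted as one half. The sketch below is written in Python rather than R and does not use or reproduce the PredictABEL interface; the function name `auc` and the example data `risks` and `outcome` are hypothetical and chosen only for illustration.

```python
def auc(scores, labels):
    """Rank-based AUC: the probability that a randomly chosen case has a
    higher predicted risk than a randomly chosen control (ties count 0.5)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    concordant = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                concordant += 1.0
            elif case_score == control_score:
                concordant += 0.5
    return concordant / (len(cases) * len(controls))


# Hypothetical predicted risks and observed outcomes (1 = case, 0 = control).
risks = [0.10, 0.40, 0.35, 0.80, 0.20, 0.65]
outcome = [0, 0, 1, 1, 0, 1]

# 8 of the 9 case-control pairs are ordered correctly.
print(round(auc(risks, outcome), 3))  # 0.889
```

The pairwise loop is quadratic in the sample size, which is acceptable for a small illustration; for large cohorts a rank-sum formulation would be the more practical choice.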
The rapid and continuing progress in gene discovery for complex diseases is fueling interest in the potential application of genetic risk models for clinical and public health practice. The number of studies assessing the predictive ability is steadily increasing, but the quality and completeness of reporting vary. A multidisciplinary workshop sponsored by the Human Genome Epidemiology Network developed a checklist of 25 items recommended for strengthening the reporting of Genetic RIsk Prediction Studies (GRIPS), building on the principles established by prior reporting guidelines. These recommendations aim to enhance the transparency of study reporting, and thereby to improve the synthesis and application of information from multiple studies that might differ in design, conduct, or analysis. A detailed Explanation and Elaboration document is published.
Genetic; Risk prediction; Methodology; Guidelines; Reporting
The rapid and continuing progress in gene discovery for complex diseases is fuelling interest in the potential application of genetic risk models for clinical and public health practice. The number of studies assessing the predictive ability is steadily increasing, but they vary widely in completeness of reporting and apparent quality. Transparent reporting of the strengths and weaknesses of these studies is important to facilitate the accumulation of evidence on genetic risk prediction. A multidisciplinary workshop sponsored by the Human Genome Epidemiology Network developed a checklist of 25 items recommended for strengthening the reporting of Genetic RIsk Prediction Studies (GRIPS), building on the principles established by prior reporting guidelines. These recommendations aim to enhance the transparency, quality and completeness of study reporting, and thereby to improve the synthesis and application of information from multiple studies that might differ in design, conduct or analysis.
Genetic; Risk prediction; Methodology; Guidelines; Reporting
Cecile Janssens and colleagues present the GRIPS Statement, a checklist to help strengthen the reporting of genetic risk prediction studies.