The general public does not appear to be aware that, despite their very similar height and appearance, monozygotic twins do not always develop or die from the same maladies (35). This basic observation, combined with extensive epidemiologic studies of twins and statistical modeling, allows us to estimate upper and lower bounds on the predictive value of whole-genome sequencing.
On the negative side, our results show that the majority of tested individuals would receive negative tests for most diseases. Moreover, the predictive value of these negative tests would generally be small, as the total risk for acquiring the disease in an individual testing negative would be similar to that of the general population. On the positive side, our results show that, at least in the best-case scenario, the majority of patients might be alerted to a clinically meaningful risk for at least one disease through whole-genome sequencing.
These conclusions are consistent with what is now known about risk allele loci from genome-wide association studies (GWAS) (37). In general, GWAS have shown that many loci can predispose to disease and that each risk allele confers a relatively small effect (38). For example, a recent analysis of large cohorts of individuals with colorectal cancer showed that only ~1.3% of phenotypic variance could be accounted for by the 10 loci discovered through GWAS (40). However, it could be argued that the relatively low level of utility that might be inferred from such studies is misleading. In particular, it is possible that a more complete knowledge of disease-associated variants and their epistatic relationships would be able to reliably predict who will and who will not develop disease in the general population. Our results allow us to estimate the maximum possible reliability of such tests.
Several of our conclusions are based on the genometype frequency and risk distributions that would maximize the clinical utility of genetic testing, i.e., they are best-case scenarios. The actual frequencies and risks of genometypes in the population are not likely to be distributed in this way. Indeed, other distributions are also consistent with the monozygotic twin data from which our maxima are determined, and all other distributions yield less clinical utility than those of the maxima. Moreover, in the real world, it is unlikely that the biomedical correlates of every genetic variant and the epistatic relationships among these variants will ever be completely known, or that the analytic validity of genetic testing will be perfect, as we assume in our ideal scenario. Thus, our conclusions purposely overestimate the value of whole-genome sequencing that will be achieved: they represent an absolute upper bound that cannot be raised by advances in technology or genetic knowledge.
As a practical example of this principle, we estimate that a negative whole-genome sequencing-based test could indicate a ~two-fold decrease in risk for prostate cancer in men and a similar two-fold decrease for urinary incontinence in women. But this two-fold decrease would only apply in a world in which the risk alleles are distributed in a fashion that maximizes the sensitivity of whole-genome testing. In the real world, the risk alleles are not likely to be distributed in this ideal fashion, and omniscience about every variant is not likely to be realized. Thus, the risk of these diseases in patients who test negative will likely be even more similar to that of the general population. For diseases with a lower heritable component, such as most forms of cancer, whole-genome-based genetic tests will be even less informative.
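The dependence of a negative result's value on how risk is distributed can be illustrated with a small calculation. The sketch below is not the model used in this study. It assumes a simplified two-group partition (test-positive and test-negative) and applies the law of total probability. The function name and all numbers are hypothetical, chosen only to show the arithmetic.

```python
def risk_if_negative(population_risk: float,
                     frac_positive: float,
                     risk_if_positive: float) -> float:
    """Risk among test-negative individuals, from the law of total probability:

    population_risk = frac_positive * risk_if_positive
                      + (1 - frac_positive) * risk_if_negative
    """
    if not 0.0 <= frac_positive < 1.0:
        raise ValueError("frac_positive must be in [0, 1)")
    risk = (population_risk - frac_positive * risk_if_positive) / (1.0 - frac_positive)
    if not 0.0 <= risk <= 1.0:
        raise ValueError("inputs are inconsistent with a valid probability")
    return risk


# Hypothetical inputs, not taken from the study.
population_risk = 0.10

# Favorable distribution: many individuals test positive and carry most of the risk.
favorable = risk_if_negative(population_risk, frac_positive=0.20, risk_if_positive=0.30)

# Less favorable distribution: few individuals test positive.
unfavorable = risk_if_negative(population_risk, frac_positive=0.05, risk_if_positive=0.30)

# Relative risk of a negative tester compared with the general population.
print(round(favorable / population_risk, 2))
print(round(unfavorable / population_risk, 2))
```

Under these assumed inputs, the first case gives negative testers about half the population risk, whereas the second leaves them at roughly nine-tenths of it. The same test is therefore far less reassuring when the positive group is small.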
Thus, our results suggest that genetic testing, at its best, will not be the dominant determinant of patient care and will not be a substitute for preventive medicine strategies incorporating routine checkups and risk management based on the history, physical status, and lifestyle of the patient.
It is important to point out that our study focused on testing relatively common diseases in the general population and did not address the utility of whole-genome sequencing to identify the genetic basis of rare monogenic diseases. In such unusual cases, it has already been shown that whole-genome sequencing can prove highly informative (8).
As with any model-based study, our conclusions have a number of caveats. Our analyses are based on data from twin studies and the assumptions made therein (11). Specifically, we do not model gene-environment interactions and rely on the prevalence of disease in the twin cohorts; this prevalence, as well as the operative non-genetic contributions, may differ from that in the general population. Though twins are likely to be representative of the general population, the estimates provided by our model could be improved through analyses of larger twin cohorts as these become available, as well as through a more complete phenotypic evaluation of twins of varying ethnicities. Another caveat is that our conclusions about potential utility are based on thresholds that represent a complex balance of personal choices, demographic influences, disease characteristics and the clinical intervention(s) available. We have used a minimum 10% total risk and a minimum relative risk of 2 as the threshold in our analyses. Other thresholds may be more appropriate and meaningful for given situations, though the data in tables S4 to S6 show that our major conclusions are not altered much by the choice of threshold.
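To make the threshold concrete, the following minimal sketch checks whether a given risk meets both criteria stated above (total risk of at least 10% and relative risk of at least 2). The function name, its parameters, and the example risks are hypothetical illustrations, and the sketch assumes relative risk is measured against the general-population risk. It is not the procedure used to generate the results in this study.

```python
def is_clinically_meaningful(genometype_risk: float,
                             population_risk: float,
                             min_total_risk: float = 0.10,
                             min_relative_risk: float = 2.0) -> bool:
    """Return True when a risk meets both thresholds.

    genometype_risk: total (absolute) risk for an individual's genometype.
    population_risk: risk of the same disease in the general population.
    """
    if population_risk <= 0.0:
        raise ValueError("population_risk must be positive")
    relative_risk = genometype_risk / population_risk
    return genometype_risk >= min_total_risk and relative_risk >= min_relative_risk


# Hypothetical examples, not taken from the study.
meets_both = is_clinically_meaningful(0.12, 0.05)       # 12% risk, relative risk 2.4
low_relative = is_clinically_meaningful(0.12, 0.08)     # 12% risk, relative risk 1.5
low_total = is_clinically_meaningful(0.04, 0.01)        # 4% risk, relative risk 4
print(meets_both, low_relative, low_total)
```

Only the first example passes. The second fails on relative risk and the third on total risk, which shows why both criteria are needed: a large relative risk for a rare disease can still correspond to a small absolute risk. Alternative thresholds can be explored by changing the two default arguments.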
In sum, no result, including ours, can or should be used to conclude that whole-genome sequencing will be either useful or useless in an absolute sense. This utility will depend on the results of testing, the individual tested, and the perspectives of individuals and societies. What we hoped to accomplish with this study is to put the debate about the value of such sequencing in a mathematical framework so that the potential merits and limitations of whole-genome sequencing, for any disease, can be quantitatively assessed. Recognition of these merits and limits can be useful to consumers, researchers, and industry, as they can minimize unrealistic expectations and foster fruitful investigations.