The determination of the ancestry and genetic backgrounds of the subjects in genetic and general epidemiology studies is a crucial component in the analysis of relevant outcomes or associations. Although there are many methods for differentiating ancestral subgroups among individuals based on genetic markers, only a few of these methods provide actual estimates of the fraction of an individual’s genome that is likely to be associated with different ancestral populations. We propose a method for assigning ancestry that works in stages to refine estimates of ancestral population contributions to individual genomes. The method leverages genotype data in the public domain obtained from individuals with known ancestries. Although we showcase the method in the assessment of ancestral genome proportions leveraging largely continental populations, the strategy can be used for assessing within-continent or more subtle ancestral origins with the appropriate data.
doi:10.3389/fgene.2012.00322
PMCID: PMC3543981
PMID: 23335941
genetic ancestry; admixture; population genetics; admixture proportions
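To make the idea of "the fraction of an individual's genome associated with different ancestral populations" concrete, the following is a minimal sketch of a generic supervised estimate: an EM iteration for one individual's ancestry proportions when reference allele frequencies are treated as known. It is not the staged method proposed in the abstract above. The function name, the toy frequencies, and the genotype coding (0, 1, or 2 copies of a counted allele) are assumptions made for illustration.

```python
def estimate_ancestry(genotypes, ref_freqs, n_iter=200):
    """Estimate ancestry proportions for one individual by EM.

    genotypes: list of 0/1/2 counts of the counted allele, one per locus.
    ref_freqs: ref_freqs[p][locus] is the counted-allele frequency in
               reference population p (assumed strictly between 0 and 1).
    Returns a list of proportions, one per reference population.
    """
    k = len(ref_freqs)
    q = [1.0 / k] * k  # start from equal contributions
    for _ in range(n_iter):
        totals = [0.0] * k
        n_alleles = 0
        for locus, g in enumerate(genotypes):
            # Treat the genotype as g copies of the counted allele
            # and 2 - g copies of the other allele.
            for counted, copies in ((True, g), (False, 2 - g)):
                if copies == 0:
                    continue
                like = [
                    q[p] * (ref_freqs[p][locus] if counted
                            else 1.0 - ref_freqs[p][locus])
                    for p in range(k)
                ]
                s = sum(like)
                # E-step: posterior probability that an allele copy
                # came from each population.
                for p in range(k):
                    totals[p] += copies * like[p] / s
                n_alleles += copies
        # M-step: proportions are the average posterior memberships.
        q = [t / n_alleles for t in totals]
    return q


# Hypothetical reference frequencies for two populations at four loci.
toy_freqs = [[0.9, 0.9, 0.9, 0.9], [0.1, 0.1, 0.1, 0.1]]
print(estimate_ancestry([2, 2, 2, 2], toy_freqs))
print(estimate_ancestry([1, 1, 1, 1], toy_freqs))
```

In practice many thousands of markers would be used, and the reference frequencies would themselves be estimated from individuals of known ancestry, as the abstract describes.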
The ongoing controversy surrounding direct-to-consumer (DTC) personal genomic tests intensified last year when the U.S. Government Accountability Office (GAO) released results of an undercover investigation of four companies that offer such testing. Among their findings, they reported that some of their donors received DNA-based predictions that conflicted with their actual medical histories. We aimed to more rigorously evaluate the relationship between DTC genomic risk estimates and self-reported disease by leveraging data from the Scripps Genomic Health Initiative (SGHI). We prospectively collected self-reported personal and family health history data for 3,416 individuals who went on to purchase a commercially available DTC genomic test. For 5 out of 15 total conditions studied, we found that risk estimates from the test were significantly associated with self-reported family and/or personal health history. The 5 conditions included Graves’ disease, Type 2 Diabetes, Lupus, Alzheimer’s disease, and Restless Leg Syndrome. To further investigate these findings, we ranked each of the 15 conditions based on published heritability estimates and conducted post-hoc power analyses based on the number of individuals in our sample who reported significant histories of each condition. We found that high heritability, coupled with high prevalence in our sample and thus adequate statistical power, explained the pattern of associations observed. Our study represents one of the first evaluations of the relationship between risk estimates from a commercially available DTC personal genomic test and self-reported health histories in the consumers of that test.
doi:10.1002/gepi.20664
PMCID: PMC3338895
PMID: 22127769
direct-to-consumer; genetic testing; genetic risk estimates; clinical validity; consumer genomics
Multivariate distance matrix regression (MDMR) analysis is a statistical technique that allows researchers to relate P variables to an additional M factors collected on N individuals, where P ≫ N. The technique can be applied to a number of research settings involving high-dimensional data types such as DNA sequence data, gene expression microarray data, and imaging data. MDMR analysis involves computing the distance between all pairs of individuals with respect to P variables of interest and constructing an N × N matrix whose elements reflect these distances. Permutation tests can be used to test linear hypotheses that consider whether or not the M additional factors collected on the individuals can explain variation in the observed distances between and among the N individuals as reflected in the matrix. Despite its appeal and utility, properties of the statistics used in MDMR analysis have not been explored in detail. In this paper we consider the level accuracy and power of MDMR analysis assuming different distance measures and analysis settings. We also describe the utility of MDMR analysis in assessing hypotheses about the appropriate number of clusters arising from a cluster analysis.
doi:10.3389/fgene.2012.00190
PMCID: PMC3461701
PMID: 23060897
regression analysis; multivariate analysis; distance matrix; simulation
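The MDMR procedure described in the abstract above (pairwise distances, an N × N matrix, and a permutation test) can be sketched for the simplest case of a single categorical factor. In that case the pseudo-F statistic can be written directly in terms of sums of squared distances. This is a minimal illustration under stated assumptions, not the authors' implementation: the Euclidean distance, the function names, and the toy data are all hypothetical choices.

```python
import random
from itertools import combinations


def euclidean(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5


def distance_matrix(rows, dist=euclidean):
    """Build the symmetric N x N matrix of pairwise distances."""
    n = len(rows)
    d = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d[i][j] = d[j][i] = dist(rows[i], rows[j])
    return d


def pseudo_f(d, labels):
    """Pseudo-F for one categorical factor, from squared distances."""
    n = len(labels)
    groups = {}
    for idx, lab in enumerate(labels):
        groups.setdefault(lab, []).append(idx)
    a = len(groups)
    ss_total = sum(d[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    ss_within = sum(
        sum(d[i][j] ** 2 for i, j in combinations(members, 2)) / len(members)
        for members in groups.values()
    )
    ss_between = ss_total - ss_within
    return (ss_between / (a - 1)) / (ss_within / (n - a))


def permutation_p_value(d, labels, n_perm=999, seed=0):
    """Shuffle the factor labels to build a null distribution for pseudo-F."""
    rng = random.Random(seed)
    observed = pseudo_f(d, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pseudo_f(d, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical data: two groups of six individuals, two variables each.
base = [[0, 0], [1, 0], [0, 1], [1, 1], [0.5, 0.5], [0.2, 0.8]]
rows = base + [[x + 10, y + 10] for x, y in base]
labels = ["A"] * 6 + ["B"] * 6
f_stat, p_value = permutation_p_value(distance_matrix(rows), labels)
print(f_stat, p_value)
```

Only the distance matrix enters the test, so swapping `euclidean` for another distance measure changes nothing else in the sketch. This is the property that makes the approach usable when P is much larger than N.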
Damani, Samir | Bacconi, Andrea | Libiger, Ondrej | Chourasia, Aparajita H. | Serry, Rod | Gollapudi, Raghava | Goldberg, Ron | Rapeport, Kevin | Haaser, Sharon | Topol, Sarah | Knowlton, Sharen | Bethel, Kelly | Kuhn, Peter | Wood, Malcolm | Carragher, Bridget | Schork, Nicholas J. | Jiang, John | Rao, Chandra | Connelly, Mark | Fowler, Velia M. | Topol, Eric J.
Acute myocardial infarction (MI), which involves the rupture of existing atheromatous plaque, remains highly unpredictable despite recent advances in the diagnosis and treatment of coronary artery disease. Accordingly, a biomarker that can predict an impending MI is desperately needed. Here, we characterize circulating endothelial cells (CECs) using the first automated and clinically feasible CEC 3-channel fluorescence microscopy assay in 50 consecutive patients with ST-elevation myocardial infarction (STEMI) and 44 consecutive healthy controls. CEC counts were significantly elevated in MI cases versus controls with median numbers of 19 and 4 cells/ml respectively (p = 1.1 × 10⁻¹⁰). A receiver-operating characteristic (ROC) curve analysis demonstrated an area under the ROC curve of 0.95, suggesting near dichotomization of MI cases versus controls. We observed no correlation between CECs and typical markers of myocardial necrosis (ρ=0.02, CK-MB; ρ=−0.03, troponin). Morphologic analysis of the microscopy images of CECs revealed a 2.5-fold increase (P<0.0001) in cellular area and 2-fold increase (P<0.0001) in nuclear area of MI CECs versus healthy control, age-matched CECs, as well as CECs obtained from patients with preexisting peripheral vascular disease. The distribution of CEC images containing from 2 up to 10 nuclei demonstrates that MI patients are the only group to contain more than 3 nuclei/image, indicating that multi-cellular and multi-nuclear clusters are specific for acute MI. These data indicate that CECs may serve as promising biomarkers for the prediction of atherosclerotic plaque rupture events.
doi:10.1126/scitranslmed.3003451
PMCID: PMC3589570
PMID: 22440735
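The area under the ROC curve reported in the abstract above has a simple interpretation: it is the probability that a randomly chosen case has a higher marker value than a randomly chosen control, with ties counted as one half. A minimal sketch of that calculation follows. The function name and the cell counts are hypothetical and are not data from the study.

```python
def roc_auc(case_values, control_values):
    """AUC as the fraction of case/control pairs ranked correctly.

    A pair counts 1 if the case value exceeds the control value,
    0.5 if the two are tied, and 0 otherwise.
    """
    wins = 0.0
    for case in case_values:
        for control in control_values:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_values) * len(control_values))


# Hypothetical CEC counts (cells/ml), for illustration only.
cases = [19, 25, 12, 40, 8]
controls = [4, 3, 9, 2, 6]
print(roc_auc(cases, controls))
```

An AUC of 1.0 means every case outranks every control, and 0.5 means the marker does no better than chance.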
Background.
Hyperhomocysteinemia is associated with increased venous thrombosis and cardiovascular disease (CVD). Mutations in the human methylenetetrahydrofolate reductase (MTHFR) gene have been associated with increased homocysteine levels and risks of CVD in various populations including those with kidney disease. Here, we evaluated the influence of MTHFR variants on progressive loss of kidney function.
Methods.
We analyzed 821 subjects with hypertensive nephrosclerosis from the longitudinal National Institute of Diabetes and Digestive and Kidney Diseases African-American Study of Kidney Disease and Hypertension (AASK) Trial to determine whether decline in glomerular filtration rate (GFR) over ∼4.2 years was predicted by common genetic variation within MTHFR at non-synonymous positions C677T (Ala222Val) and A1298C (Glu429Ala) or by MTHFR haplotypes. The effect on GFR decline was then supported by a study of 1333 subjects from the San Diego Veterans Affairs Hypertension Cohort (VAHC), followed over ∼4.5 years. Linear effect models were utilized to determine both genotype [single-nucleotide polymorphism (SNP)] and genotype (SNP)-by-time interactions.
Results.
In AASK, the polymorphism at A1298C predicted the rate of GFR decline: A1298/A1298 major allele homozygosity resulted in a less pronounced decline of GFR, with a significant SNP-by-time interaction. An independent follow-up study in the San Diego VAHC subjects supports that A1298/A1298 homozygotes have the greatest estimated GFR throughout the study. Haplotype analysis with C677T yielded concurring results.
Conclusion.
We conclude that the MTHFR-coding polymorphism at A1298C is associated with renal decline in African-Americans with hypertensive nephrosclerosis and is supported by a veteran cohort with a primary care diagnosis of hypertension. Further investigation is needed to confirm such findings and to determine what molecular mechanism may contribute to this association.
doi:10.1093/ndt/gfr257
PMCID: PMC3350339
PMID: 21613384
AASK; glomerular filtration rate; hypertension; kidney disease; MTHFR
doi:10.1016/j.brainresbull.2010.04.012
PMCID: PMC2941546
PMID: 20433907
There have been a number of recent successes in the use of whole genome sequencing and sophisticated bioinformatics techniques to identify pathogenic DNA sequence variants responsible for individual idiopathic congenital conditions. However, the success of this identification process is heavily influenced by the ancestry or genetic background of a patient with an idiopathic condition. This is so because potential pathogenic variants in a patient’s genome must be contrasted with variants in a reference set of genomes made up of other individuals’ genomes of the same ancestry as the patient. We explored the effect of ignoring the ancestries of both an individual patient and the individuals used to construct reference genomes. We pursued this exploration in two major steps. We first considered variation in the per-genome number and rates of likely functional derived (i.e., non-ancestral, based on the chimp genome) single nucleotide variants and small indels in 52 individual whole human genomes sampled from 10 different global populations. We took advantage of a suite of computational and bioinformatics techniques to predict the functional effect of over 24 million genomic variants, both coding and non-coding, across these genomes. We found that the typical human genome harbors ∼5.5–6.1 million total derived variants, of which ∼12,000 are likely to have a functional effect (∼5000 coding and ∼7000 non-coding). We also found that the rates of functional genotypes per the total number of genotypes in individual whole genomes differ dramatically between human populations. We then created tables showing how the use of comparator or reference genome panels comprised of genomes from individuals that do not have the same ancestral background as a patient can negatively impact pathogenic variant identification. Our results have important implications for clinical sequencing initiatives.
doi:10.3389/fgene.2012.00211
PMCID: PMC3485509
PMID: 23125845
clinical sequencing; congenital disease; whole genome sequencing; population genetics
Over the past 18 months, there have been notable developments in the direct-to-consumer (DTC) genomic testing arena, in particular with regard to issues surrounding governmental regulation in the USA. While commentaries continue to proliferate on this topic, actual empirical research remains relatively scant. In terms of DTC genomic testing for disease susceptibility, most of the research has centered on uptake, perceptions and attitudes toward testing among health care professionals and consumers. Only a few available studies have examined actual behavioral response among consumers, and we are not aware of any studies that have examined response to DTC genetic testing for ancestry or for drug response. We propose that further research in this area is desperately needed, despite challenges in designing appropriate studies given the rapid pace at which the field is evolving. Ultimately, DTC genomic testing for common markers and conditions is only a precursor to the eventual cost-effectiveness and wide availability of whole genome sequencing of individuals, although it remains unclear whether DTC genomic information will still be attainable. Either way, however, current knowledge needs to be extended and enhanced with respect to the delivery, impact and use of increasingly accurate and comprehensive individualized genomic data.
doi:10.1093/hmg/ddr349
PMCID: PMC3179383
PMID: 21828075
Background
Human skull and brain morphology are strongly influenced by genetic factors, and skull size and shape vary worldwide. However, the relationship between specific brain morphology and genetically-determined ancestry is largely unknown.
Methods
We used two independent data sets to characterize variation in skull and brain morphology among individuals of European ancestry. The first data set is a historical sample of 1,170 male skulls with 37 shape measurements drawn from 27 European populations. The second data set includes 626 North American individuals of European ancestry participating in the Alzheimer's Disease Neuroimaging Initiative (ADNI) with magnetic resonance imaging, height and weight, neurological diagnosis, and genome-wide single nucleotide polymorphism (SNP) data.
Results
We found that both skull and brain morphological variation exhibit a population-genetic fingerprint among individuals of European ancestry. This fingerprint shows a Northwest to Southeast gradient, is independent of body size, and involves frontotemporal cortical regions.
Conclusion
Our findings are consistent with prior evidence for gene flow in Europe due to historical population movements and indicate that genetic background should be considered in studies seeking to identify genes involved in human cortical development and neuropsychiatric disease.
doi:10.1159/000330168
PMCID: PMC3171282
PMID: 21849792
Biological anthropology; Cortex; Craniometry; Genetic drift; Imaging genomics; Neuroimaging; Population genetics
Individuals can now obtain their personal genomic information via direct-to-consumer genetic testing, but what, if any, impact will this have on their lifestyle and health? A recent longitudinal cohort study of individuals who underwent consumer genome scanning found minimal impacts of testing on risk-reducing lifestyle behaviors, such as diet and exercise. These results raise an important question: is personal genomic information likely to beneficially impact public health through motivation of lifestyle behavioral change? In this article, we review the literature on lifestyle behavioral change in response to genetic testing for common disease susceptibility variants. We find that only a few studies have been carried out, and that those that have been done have yielded little evidence to suggest that the mere provision of genetic information alone results in widespread changes in lifestyle health behaviors. We suggest that further study of this issue is needed, in particular studies that examine response to multiplex testing for multiple genetic markers and conditions. This will be critical as we anticipate the wide availability of whole-genome sequencing and more comprehensive phenotyping of individuals. We also note that while simple communication of genomic information and disease susceptibility may be sufficient to catalyze lifestyle changes in some highly motivated groups of individuals, for others, additional strategies may be required to prompt changes, including more sophisticated means of risk communication (e.g., in the context of social norm feedback) either alone or in combination with other promising interventions (e.g., real-time wireless health monitoring devices).
doi:10.2217/pme.11.73
PMCID: PMC3244209
PMID: 22199991
behavioral intervention; consumer genomics; direct-to-consumer; genetic risk; genetic testing; nudging; personalized medicine; social norm feedback; wireless monitoring
A number of recent genome-wide association (GWA) studies have identified unequivocal statistical associations between inherited genetic variations, mostly single nucleotide polymorphisms (SNPs), and common complex diseases such as diabetes, cardiovascular disease, and obesity. Genotyping individuals for these variations has the potential to help redefine how pharmacologic agents undergo clinical development. By identifying carriers of known genomic variants that contribute to susceptibility, a high risk population can be defined as well as individuals with potential for a better response to a drug. We evaluated the potential utility that selecting individuals for a trial on the basis of genotype identified in contemporary GWA studies would have had on recently described clinical trials. We pursued this by constraining both the risks of a disease outcome associated with particular genotypes and overall drug responses to those actually observed in genetic association and clinical trial studies, respectively. We pursued these evaluations in the context of clinical trials investigating drugs for macular degeneration, obesity, heart disease, type II diabetes, prostate cancer and Alzheimer’s disease. We show that the increase in incidence of outcomes in trials restricted to individuals with specific genotypic profiles can result in substantial reductions in requisite sample sizes for such trials. We also derive realistic bounds for sample sizes for clinical trials investigating pharmacogenetic effects that leverage genetic variations identified in recent association studies.
doi:10.1080/10543400903572779
PMCID: PMC2892229
PMID: 20309761
Polymorphism; Translational medicine; Drug validation; DNA sequencing; Study Design
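The sample size argument in the abstract above can be illustrated with the standard normal-approximation formula for comparing two proportions. The sketch below is not the authors' calculation. The event rates, the relative risk reduction, and the function name are hypothetical, chosen only to show how a higher outcome incidence in a genotype-selected population lowers the required number of participants per arm.

```python
import math
from statistics import NormalDist


def per_arm_sample_size(p_control, relative_risk_reduction,
                        alpha=0.05, power=0.80):
    """Participants per arm to detect a difference in event proportions.

    Uses n = (z_{1-alpha/2} + z_{power})^2 * (p1*q1 + p2*q2) / (p1 - p2)^2,
    a two-sided normal approximation with equal allocation.
    """
    p_treated = p_control * (1.0 - relative_risk_reduction)
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_beta = NormalDist().inv_cdf(power)
    variance = (p_control * (1.0 - p_control)
                + p_treated * (1.0 - p_treated))
    n = (z_alpha + z_beta) ** 2 * variance / (p_control - p_treated) ** 2
    return math.ceil(n)


# Hypothetical scenario: the drug reduces risk by 30% in both settings,
# but carriers of a risk genotype have twice the baseline incidence.
unselected = per_arm_sample_size(p_control=0.10, relative_risk_reduction=0.30)
enriched = per_arm_sample_size(p_control=0.20, relative_risk_reduction=0.30)
print(unselected, enriched)
```

The reduction comes from the larger absolute difference in event rates between arms. The calculation ignores the cost of screening participants for the genotype, which a real trial design would also need to weigh.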
Schwimmer, Jeffrey B. | Celedon, Manuel A. | Lavine, Joel E. | Salem, Rany | Campbell, Nzali | Schork, Nicholas J. | Shiehmorteza, Masoud | Yokoo, Takeshi | Chavez, Alyssa | Middleton, Michael S. | Sirlin, Claude B.
Background & Aims
Nonalcoholic fatty liver disease (NAFLD) is the most common chronic liver disease in the United States. The etiology is believed to be multi-factorial with a substantial genetic component; however, the heritability of NAFLD is undetermined. Therefore, a familial aggregation study was performed to test the hypothesis that NAFLD is highly heritable.
Methods
Overweight children with biopsy-proven NAFLD and overweight children without NAFLD served as probands. Family members were studied including magnetic resonance imaging to quantify liver fat fraction. Fatty liver was defined as a liver fat fraction ≥ 5%. Etiologies for fatty liver other than NAFLD were excluded. Narrow-sense heritability estimates for fatty liver (dichotomous) and fat fraction (continuous) were calculated using variance components analysis adjusted for covariate effects.
Results
Fatty liver was present in 17% of siblings and 37% of parents of overweight children without NAFLD. Fatty liver was significantly more common in siblings (59%) and parents (78%) of children with NAFLD. Liver fat fraction was correlated with body mass index (BMI), although the correlation was significantly stronger for families of children with NAFLD than those without NAFLD. Adjusted for age, sex, race, and BMI, heritability of fatty liver was 1.000 and of liver fat fraction 0.386.
Conclusion
Family members of children with NAFLD should be considered at high risk for NAFLD. These data suggest that familial factors are a major determinant of whether an individual has NAFLD. Studies examining the complex relations between genes and environment in the development and progression of NAFLD are warranted.
doi:10.1053/j.gastro.2009.01.050
PMCID: PMC3397140
PMID: 19208353
magnetic resonance; genetics; family; obesity; fatty liver
Recent studies investigating the genetic determinants of cancer suggest that some of the genetic alterations contributing to tumorigenesis may be inherited, but the vast majority are somatically acquired during the transition of a normal cell to a cancer cell. A systematic understanding of the genetic and molecular determinants of cancers has already begun to have a transformative effect on the study and treatment of cancer, particularly through the identification of a range of genetic alterations in protein kinase genes, which are highly associated with the disease. Since kinases are prominent therapeutic targets for intervention within the cancer cell, studying the impact that genomic alterations within them have on cancer initiation, progression, and treatment is both logical and timely. In fact, recent sequencing and resequencing (i.e., polymorphism identification) efforts have catalyzed the quest for protein kinase ‘driver’ mutations (i.e., those genetic alterations which contribute to the transformation of a normal cell to a proliferating cancerous cell) in distinction to kinase ‘passenger’ mutations which reflect mutations that merely build up in the course of normal and unchecked (i.e., cancerous) somatic cell replication and proliferation. In this review, we discuss the recent progress in the discovery and functional characterization of protein kinase cancer driver mutations and the implications of this progress for understanding tumorigenesis as well as the design of ‘personalized’ cancer therapeutics that target an individual’s unique mutational profile.
doi:10.1016/j.canlet.2008.11.008
PMCID: PMC2905872
PMID: 19081671
There has been growing debate over the nature of the genetic contribution to individual susceptibility to common complex diseases such as diabetes, osteoporosis, and cancer. The ‘Common Disease, Common Variant (CDCV)’ hypothesis argues that genetic variations with appreciable frequency in the population at large, but relatively low ‘penetrance’ (or the probability that a carrier of the relevant variants will express the disease), are the major contributors to genetic susceptibility to common diseases. The ‘Common Disease, Rare Variant (CDRV)’ hypothesis, on the other hand, argues that multiple rare DNA sequence variations, each with relatively high penetrance, are the major contributors to genetic susceptibility to common diseases. Both hypotheses have their place in current research efforts.
doi:10.1016/j.gde.2009.04.010
PMCID: PMC2914559
PMID: 19481926
Background
Malaria caused by Plasmodium vivax is an experimentally neglected severe disease with a substantial burden on human health. Because of technical limitations, little is known about the biology of this important human pathogen. Whole genome analysis methods on patient-derived material are thus likely to have a substantial impact on our understanding of P. vivax pathogenesis and epidemiology. For example, it will allow study of the evolution and population biology of the parasite, allow parasite transmission patterns to be characterized, and may facilitate the identification of new drug resistance genes. Because parasitemias are typically low and the parasite cannot be readily cultured, on-site leukocyte depletion of blood samples is typically needed to remove human DNA that may be 1000X more abundant than parasite DNA. These features have precluded the analysis of archived blood samples and require the presence of laboratories in close proximity to the collection of field samples for optimal pre-cryopreservation sample preparation.
Results
Here we show that in-solution hybridization capture can be used to extract P. vivax DNA from human contaminating DNA in the laboratory without the need for on-site leukocyte filtration. Using a whole genome capture method, we were able to enrich P. vivax DNA from bulk genomic DNA from less than 0.5% to a median of 55% (range 20%-80%). This level of enrichment allows for efficient analysis of the samples by whole genome sequencing and does not introduce any gross biases into the data. With this method, we obtained greater than 5X coverage across 93% of the P. vivax genome for four P. vivax strains from Iquitos, Peru, which is similar to our results using leukocyte filtration (greater than 5X coverage across 96%).
Conclusion
The whole genome capture technique will enable more efficient whole genome analysis of P. vivax from a larger geographic region and from valuable archived sample collections.
doi:10.1186/1471-2164-13-262
PMCID: PMC3410760
PMID: 22721170
Malaria
Recent genome wide association studies (GWAS) have identified DNA sequence variations that exhibit unequivocal statistical associations with many common chronic diseases. However, the vast majority of these studies identified variations that explain only a very small fraction of disease burden in the population at large, suggesting that other factors, such as multiple rare or low-penetrance variations and interacting environmental factors, are major contributors to disease susceptibility. Identifying multiple low penetrance variations (or ‘polygenes’) contributing to disease susceptibility will be difficult. We present a pathway analysis approach to characterizing the likely polygenic basis of seven common diseases using the Wellcome Trust Case Control Consortium (WTCCC) GWAS results. We identify numerous pathways implicated in disease predisposition that would have not been revealed using standard single-locus GWAS statistical analysis criteria. Many of these pathways have long been assumed to contain polymorphic genes that lead to disease predisposition. Additionally, we analyze the genetic relationships between the seven diseases, and based upon similarities with respect to the associated genes and pathways affected in each, propose a new way of categorizing the diseases.
doi:10.1016/j.ygeno.2008.07.011
PMCID: PMC2602835
PMID: 18722519
Pathway; genome-wide; disease; common; diabetes; crohn’s; coronary; bipolar; arthritis; hypertension
Root, Tammy L. | Szatkiewicz, Jin P. | Jonassaint, Charles R. | Thornton, Laura M. | Pinheiro, Andrea Poyastro | Strober, Michael | Bloss, Cinnamon | Berrettini, Wade | Schork, Nicholas J. | Kaye, Walter H. | Bergen, Andrew W. | Magistretti, Pierre | Brandt, Harry | Crawford, Steve | Crow, Scott | Fichter, Manfred M. | Goldman, David | Halmi, Katherine A. | Johnson, Craig | Kaplan, Allan S. | Keel, Pamela K. | Klump, Kelly L. | La Via, Maria | Mitchell, James E. | Rotondo, Alessandro | Treasure, Janet | Woodside, D. Blake | Bulik, Cynthia M.
This analysis is a follow-up to an earlier investigation of 182 genes selected as likely candidate genetic variations conferring susceptibility to anorexia nervosa (AN). As those initial case-control results revealed no statistically significant differences in single nucleotide polymorphisms, herein we investigate alternative phenotypes associated with AN. In 1762 females using regression analyses we examined: (1) lowest illness-related attained body mass index; (2) age at menarche; (3) drive for thinness; (4) body dissatisfaction; (5) trait anxiety; (6) concern over mistakes; and (7) the anticipatory worry and pessimism vs. uninhibited optimism subscale of the harm avoidance scale. After controlling for multiple comparisons, no statistically significant results emerged. Although results must be viewed in the context of limitations of statistical power, the approach illustrates a means of potentially identifying genetic variants conferring susceptibility to AN because less complex phenotypes associated with AN are more proximal to the genotype and may be influenced by fewer genes.
doi:10.1002/erv.1138
PMCID: PMC3261131
PMID: 21780254
covariates; eating disorders; association studies; personality; genetic
Recently, Wu and colleagues [1] proposed two novel statistics for genome-wide interaction analysis using case/control or case-only data. In computer simulations, their proposed case/control statistic outperformed competing approaches, including the fast-epistasis option in PLINK and logistic regression analysis under the correct model; however, reasons for its superior performance were not fully explored. Here we investigate the theoretical properties and performance of Wu et al.'s proposed statistics and explain why, in some circumstances, they outperform competing approaches. Unfortunately, we find minor errors in the formulae for their statistics, resulting in tests that have higher than nominal type 1 error. We also find minor errors in PLINK's fast-epistasis and case-only statistics, although theory and simulations suggest that these errors have only negligible effect on type 1 error. We propose adjusted versions of all four statistics that, both theoretically and in computer simulations, maintain correct type 1 error rates under the null hypothesis. We also investigate statistics based on correlation coefficients that maintain similar control of type 1 error. Although designed to test specifically for interaction, we show that some of these previously-proposed statistics can, in fact, be sensitive to main effects at one or both loci, particularly in the presence of linkage disequilibrium. We propose two new “joint effects” statistics that, provided the disease is rare, are sensitive only to genuine interaction effects. In computer simulations we find, in most situations considered, that highest power is achieved by analysis under the correct genetic model. Such an analysis is unachievable in practice, as we do not know this model. However, generally high power over a wide range of scenarios is exhibited by our joint effects and adjusted Wu statistics. 
We recommend use of these alternative or adjusted statistics and urge caution when using Wu et al.'s originally-proposed statistics, on account of the inflated error rate that can result.
Author Summary
Gene–gene interactions are a topic of great interest to geneticists carrying out studies of how genetic factors influence the development of common, complex diseases. Genes that interact may not only make important biological contributions to underlying disease processes, but also be more difficult to detect when using standard statistical methods in which we examine the effects of genetic factors one at a time. Recently a method was proposed by Wu and colleagues [1] for detecting pairwise interactions when carrying out genome-wide association studies (in which a large number of genetic variants across the genome are examined). Wu and colleagues carried out theoretical work and computer simulations that suggested their method outperformed other previously proposed approaches for detecting interactions. Here we show that, in fact, the method proposed by Wu and colleagues can result in an over-preponderance of false positive findings. We propose an adjusted version of their method that reduces the false positive rate while maintaining high power. We also propose a new method for detecting pairs of genetic effects that shows similarly high power but has some conceptual advantages over both Wu's method and also other previously proposed approaches.
doi:10.1371/journal.pgen.1002625
PMCID: PMC3320596
PMID: 22496670
Human aging is a complex, multifactorial process influenced by a number of genetic and non-genetic factors. This article first reviews genetic strategies for human aging research and considers the advantages and disadvantages of each. We then discuss the issue of phenotypic definition for genetic studies of aging, including longevity/life span, as well as disease-free survival and other endophenotypes. Finally, we argue that extensions of this area of research, including incorporation of gene × environment interactions, multivariate phenotypes, integration of functional genomic annotations, and exploitation of orthology – many of which are already initiated and ongoing – are critical to advancing this field.
doi:10.1016/j.arr.2010.07.005
PMCID: PMC3043164
PMID: 20709627
The enormous advances in genetics and genomics of the past decade have the potential to revolutionize health care, including mental health care, and bring about a system predominantly characterized by the practice of genomic and personalized medicine. We briefly review the history of genetics and genomics and present heritability estimates for major chronic diseases of aging and neuropsychiatric disorders. We then assess the extent to which the results of genetic and genomic studies are currently being leveraged clinically for disease treatment and prevention and identify priority research areas in which further work is needed. Pharmacogenomics has emerged as one area of genomics that already has had notable impacts on disease treatment and the practice of medicine. However, little evidence is available for the clinical validity and utility of predictive testing based on genomic information, and this has, to some extent, hindered broader-scale preventive efforts for common, complex diseases. Furthermore, although other disease areas have had greater success in identifying genetic factors responsible for various conditions, progress in identifying the genetic basis of neuropsychiatric diseases has lagged behind. We review social, economic, and policy issues relevant to genomic medicine, and find that a new model of health care based on proactive and preventive health planning and individualized treatment will require major advances in health care policy and administration. Specifically, incentives for relevant stakeholders are critical, as are realignment of incentives and education initiatives for physicians, and updates to pertinent legislation. Moreover, the translational behavioral and public health research necessary for fully integrating genomics into health care is lacking, and further work in these areas is needed. In short, while the pace of advances in genetic and genomic science and technology has been rapid, more work is needed to fully realize the potential for impacting disease treatment and prevention generally, and mental health specifically.
doi:10.1016/j.psc.2010.11.005
PMCID: PMC3073546
PMID: 21333845
genomics; genetic testing; genetic risk assessment; public health genomics; pharmacogenomics
Available statistical preprocessing or quality control analysis tools for gene expression microarray datasets are known to greatly affect downstream data analysis, especially when degraded samples, unique tissue samples, or novel expression assays are used. It is therefore important to assess the validity and impact of the assumptions built into preprocessing schemes for a dataset. We developed and assessed a data preprocessing strategy for use with the Illumina DASL-based gene expression assay with partially degraded postmortem prefrontal cortex samples. The samples were obtained from individuals with autism as part of an investigation of the pathogenic factors contributing to autism. Using statistical analysis methods and metrics such as those associated with multivariate distance matrix regression and mean inter-array correlation, we developed a DASL-based assay gene expression preprocessing pipeline to accommodate and detect problems with microarray-based gene expression values obtained with degraded brain samples. Key steps in the pipeline included outlier exclusion, data transformation and normalization, and batch effect and covariate corrections. Our goal was to produce a clean dataset for subsequent downstream differential expression analysis. We ultimately settled on available transformation and normalization algorithms in the R/Bioconductor package lumi based on an assessment of their use in various combinations. A log2-transformed, quantile-normalized, and batch- and seizure-corrected procedure was likely the most appropriate for our data. We empirically tested different components of our proposed preprocessing strategy and believe our results suggest that a preprocessing strategy that effectively identifies outliers, normalizes the data, and corrects for batch effects can be applied to all studies, even those pursued with degraded samples.
doi:10.3389/fgene.2012.00011
PMCID: PMC3286152
PMID: 22375143
gene expression; microarray; data preprocessing; quality control
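The pipeline steps named in the abstract above (log2 transformation, quantile normalization, batch correction, and outlier screening by mean inter-array correlation) can be illustrated with a minimal NumPy sketch. This is not the lumi implementation the authors used: the function names, the `offset` parameter, and the use of simple per-gene batch mean-centering in place of a full batch and covariate correction are all assumptions made for illustration. The input is assumed to be a genes-by-arrays matrix of positive intensities.

```python
import numpy as np

def log2_transform(x, offset=1.0):
    """Log2-transform raw intensities; `offset` (an assumed choice) avoids log2(0)."""
    return np.log2(np.asarray(x, dtype=float) + offset)

def quantile_normalize(x):
    """Force every array (column) to share the same empirical distribution.

    Ties within a column are ranked in arbitrary order, which is a
    simplification compared with production implementations.
    """
    order = np.argsort(x, axis=0)                  # row indices that sort each column
    ranks = np.argsort(order, axis=0)              # rank of each value within its column
    mean_sorted = np.sort(x, axis=0).mean(axis=1)  # reference distribution
    return mean_sorted[ranks]

def mean_inter_array_correlation(x):
    """Mean Pearson correlation of each array with all other arrays."""
    corr = np.corrcoef(x, rowvar=False)            # arrays-by-arrays matrix
    n = corr.shape[0]
    return (corr.sum(axis=1) - 1.0) / (n - 1)      # drop the self-correlation of 1

def center_batches(x, batches):
    """Illustrative batch correction: per-gene mean-centering within each batch.

    Each batch is shifted so its per-gene mean equals the overall per-gene mean.
    """
    batches = np.asarray(batches)
    out = np.array(x, dtype=float)
    for b in np.unique(batches):
        cols = batches == b
        out[:, cols] -= out[:, cols].mean(axis=1, keepdims=True)
    return out + np.mean(x, axis=1, keepdims=True)
```

In use, one would typically chain `quantile_normalize(log2_transform(raw))`, inspect `mean_inter_array_correlation` for arrays with unusually low values as outlier candidates, and then apply a batch correction. The threshold for calling an array an outlier is left to the analyst here, since the abstract does not state one.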
Epigenetic information, which may affect an organism’s phenotype, can be stored and stably inherited in the form of cytosine DNA methylation. Changes in DNA methylation can produce meiotically stable epialleles that affect transcription and morphology, but the rates of spontaneous gain or loss of DNA methylation are unknown. We examined spontaneously occurring variation in DNA methylation in Arabidopsis thaliana plants propagated by single-seed descent for 30 generations. 114,287 CG single methylation polymorphisms (SMPs) and 2,485 CG differentially methylated regions (DMRs) were identified, both of which show patterns of divergence compared to the ancestral state. Thus, transgenerational epigenetic variation in DNA methylation may generate new allelic states that alter transcription, providing a mechanism for phenotypic diversity in the absence of genetic mutation.
doi:10.1126/science.1212959
PMCID: PMC3210014
PMID: 21921155
Sebastiani, Paola | Riva, Alberto | Montano, Monty | Pham, Phillip | Torkamani, Ali | Scherba, Eugene | Benson, Gary | Milton, Jacqueline N. | Baldwin, Clinton T. | Andersen, Stacy | Schork, Nicholas J. | Steinberg, Martin H. | Perls, Thomas T.
Supercentenarians (aged 110 years or older) generally delay or escape age-related diseases and disability well beyond the age of 100, and this exceptional survival is likely to be influenced by a genetic predisposition that includes both common and rare genetic variants. In this report, we describe the complete genomic sequences of a male and a female supercentenarian, both older than 114 years. We show that: (1) the sequence variant spectrum of these two individuals’ DNA sequences is largely comparable to existing non-supercentenarian genomes; (2) the two individuals do not appear to carry most of the well-established human longevity enabling variants already reported in the literature; (3) they have a comparable number of known disease-associated variants relative to most human genomes sequenced to date; (4) approximately 1% of the variants these individuals possess are novel and may point to new genes involved in exceptional longevity; and (5) both individuals are enriched for coding variants near longevity-associated variants that we discovered through a large genome-wide association study. These analyses suggest that there are both common and rare longevity-associated variants that may counter the effects of disease-predisposing variants and extend lifespan. The continued analysis of the genomes of these and other rare individuals who have survived to extremely old ages should provide insight into the processes that contribute to the maintenance of health during extreme aging.
doi:10.3389/fgene.2011.00090
PMCID: PMC3262222
PMID: 22303384
whole genome sequence; genetics; longevity; centenarian; supercentenarian; aging
N-of-1 or single subject clinical trials consider an individual patient as the sole unit of observation in a study investigating the efficacy or side-effect profiles of different interventions. The ultimate goal of an n-of-1 trial is to determine the optimal or best intervention for an individual patient using objective data-driven criteria. Such trials can leverage study design and statistical techniques associated with standard population-based clinical trials, including randomization, washout and crossover periods, as well as placebo controls. Despite their obvious appeal and wide use in educational settings, n-of-1 trials have been used sparingly in medical and general clinical settings. We briefly review the history, motivation and design of n-of-1 trials and emphasize the great utility of modern wireless medical monitoring devices in their execution. We ultimately argue that n-of-1 trials demand serious attention among the health research and clinical care communities given the contemporary focus on individualized medicine.
doi:10.2217/pme.11.7
PMCID: PMC3118090
PMID: 21695041
clinical equipoise; early-phase trials; individualized medicine; n-of-1; remote phenotyping; single patient trial; treatment repositioning; wireless health
Baker, Dewleen G. | Nash, William P. | Litz, Brett T. | Geyer, Mark A. | Risbrough, Victoria B. | Nievergelt, Caroline M. | O’Connor, Daniel T. | Larson, Gerald E. | Schork, Nicholas J. | Vasterling, Jennifer J. | Hammer, Paul S. | Webb-Murphy, Jennifer A.
The Marine Resiliency Study (MRS) is a prospective study of factors predictive of posttraumatic stress disorder (PTSD) among approximately 2,600 Marines in 4 battalions deployed to Iraq or Afghanistan. We describe the MRS design and predeployment participant characteristics. Starting in 2008, our research team conducted structured clinical interviews on Marine bases and collected data 4 times: at predeployment and at 1 week, 3 months, and 6 months postdeployment. Integrated with these data are medical and career histories from the Career History Archival Medical and Personnel System (CHAMPS) database. The CHAMPS database showed that 7.4% of the Marines enrolled in MRS had at least 1 mental health diagnosis. Of enrolled Marines, approximately half (51.3%) had prior deployments. We found a moderate positive relationship between deployment history and PTSD prevalence in these baseline data.
doi:10.5888/pcd9.110134
PMCID: PMC3431952
PMID: 22575082